huggingface/transformers
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Projects and tools related to deep learning algorithms and neural networks.
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Streamlit — A faster way to build and share data apps.
An Open Source Machine Learning Framework for Everyone
Open standard for machine learning interoperability
Build and share delightful machine learning apps, all in Python. 🌟 Star to support our work!
A game theoretic approach to explain the output of any machine learning model.
Ultralytics YOLO 🚀
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Cross-platform, customizable ML solutions for live and streaming media.
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic, etc.
A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
🔥 An autonomous AI agent that runs your deep learning experiments 24/7 while you sleep. Zero-cost monitoring, Leader-Worker architecture, constant-size memory.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
ncnn is a high-performance neural network inference framework optimized for the mobile platform
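PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)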
Label Studio is a multi-type data labeling and annotation tool with standardized output format
The fastai deep learning library
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
LabelImg is now part of the Label Studio community. The popular image annotation tool created by Tzutalin is no longer actively being developed, but you can check out Label Studio, the open source data labeling tool for images, text, hypertext, audio, video and time-series data.
Deezer source separation library including pretrained models.
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
Qlib is an AI-oriented Quant investment platform that aims to use AI tech to empower Quant Research, from exploring ideas to implementing them in production. Qlib supports diverse ML modeling paradigms, including supervised learning, market dynamics modeling, and RL, and is now equipped with https://github.com/microsoft/RD-Agent to automate the R&D process.
Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research
Making large AI models cheaper, faster and more accessible
Caffe: a fast open framework for deep learning.
Graph Neural Network Library for PyTorch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
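AiLearning: data analysis + machine learning in practice + linear algebra + PyTorch + NLTK + TF2
Dive into Deep Learning: written for Chinese readers, runnable, and open to discussion. The Chinese and English editions are used for teaching at more than 500 universities in over 70 countries.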
JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf
VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning
Multilingual neural TTS (6 languages: JA/EN/ZH/ES/FR/PT, code supports SV) — C++, C#, Rust, Go, Python, npm (WASM). VITS + Prosody, streaming, CUDA/CoreML/DirectML. pip install piper-plus | npm install piper-plus | cargo install piper-plus-cli
A machine learning primer built from first principles. For engineers who want to reason about ML systems the way they reason about software systems.
This repository is a curated collection of hands-on data science projects tailored for beginners. Whether you're just starting your journey in data science or looking to strengthen your skills, these projects provide a practical and interactive way to apply your knowledge.
A complete GPT language model (training and inference) in ~600 lines of pure C#, zero dependencies
Design a custom AI inference chip. That is the goal.
Next Generation Machine Learning, Statistics and Deep Learning in PURE Rust
A full-stack algorithm tutorial from NLP to LLMs. Read online: https://datawhalechina.github.io/base-llm/
Crater is a cloud-native AI training & inference platform.
ARIS ⚔️ (Auto-Research-In-Sleep) — Lightweight Markdown-only skills for autonomous ML research: cross-model review loops, idea discovery, and experiment automation. No framework, no lock-in — works with Claude Code, Codex, OpenClaw, or any LLM agent.
ComfyUI custom nodes for Seedance 2.0 video generation — text-to-video, image-to-video, and video extend via muapi.ai
Always up-to-date, most comprehensive HAR resource — continuously scanned and auto-updated from Papers with Code. 53 datasets integrated across all modalities.
A universal AI toolkit for high-performance Speech-to-Text (STT) and Text-to-Speech (TTS) processing, designed for low-latency and easy model integration.
Free self-driving car stack - fully open-source ADAS and autonomous driving system
Reliable, minimal and scalable library for evaluating and conducting world model research
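📚 This repository collects arXiv papers on VLN, VLA, World Model, SLAM, Gaussian Splatting, nonlinear optimization, and related topics. It is updated automatically every day! The issues section lists the 10 latest papers.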
Your Cheat Sheet for Machine Learning Interview – Questions and Answers.
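This project aims to provide algorithm engineers entering the VLA (Vision-Language-Action) field with a practice-oriented study and interview handbook written entirely in Chinese. Unlike general CV/NLP interview guides, it focuses on the challenges specific to robotics.
Computer science, machine learning, and deep learning graduation projects, plus original AI projects [source code + thesis]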
ML-powered manga translator, written in Rust.
Embed trained machine learning predictors into JuMP and ExaModels
Production-ready ML framework for Go with zero dependencies. Train and deploy neural networks as single binaries. PyTorch-like API, type-safe tensors, automatic differentiation.
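Programmer Yupi's AI resource collection plus a Vibe Coding tutorial for complete beginners. It shares a step-by-step OpenClaw tutorial, ways to use large models (DeepSeek / GPT / Gemini / Claude), the latest AI news, a prompt collection, an AI knowledge encyclopedia (Agent Skills / RAG / MCP / A2A), AI programming tutorials, AI tool usage (Cursor / Claude Code / TRAE / Lovable / Copilot), AI development framework tutorials (Spring AI / LangChain), and a guide to monetizing AI products, to help you quickly master AI technology and stay at the cutting edge. This project is open-source documentation and has been upgraded into the Yupi AI navigation website.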
Fine-tune LLMs on your Mac with Apple Silicon. SFT, DPO, GRPO, and Vision fine-tuning — natively on MLX. Unsloth-compatible API.
A list of summer schools on Artificial Intelligence, Machine Learning, and Healthcare
Complete deep learning project developed in Full Stack Deep Learning, 2022 edition. Generated automatically from https://github.com/full-stack-deep-learning/fsdl-text-recognizer-2022
dddmr_navigation is a 3D navigation solution for mobile robots that includes mapping, localization, perception, path planning, controller, and move base
Library for Jacobian descent with PyTorch. It enables the optimization of neural networks with multiple losses (e.g. multi-task learning).
A Simple, Lightweight, and Extensible Serving Framework for X-AnyLabeling
High-performance, optimized pre-trained template AI application pipelines for systems using Hailo devices
Reliable, minimal and scalable library for pretraining foundation and world models
An open, large-scale, interactive textbook.
Kanade is a single-layer disentangled speech tokenizer that extracts compact tokens suitable for both generative and discriminative modeling.
DashAI: an interactive platform for training, evaluating and deploying AI models
A Python toolbox for deep image reconstruction, with emphasis on single-pixel imaging.
A repo documenting EEG-Dash data and its usage
CREsted is a Python package for training sequence-based deep learning models on scATAC-seq data, for capturing enhancer code and for designing cell type-specific sequences.
Complete mathematics curriculum for AI/ML/LLM - from foundations to research frontiers
zea: A Toolbox for Cognitive Ultrasound Imaging
Qdrant - High-performance, massive-scale Vector Database and Vector Search Engine for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/
✔ (Completed) Very comprehensive deep learning notes [Tudui: PyTorch] [Li Mu: Dive into Deep Learning] [Andrew Ng: Deep Learning] [Dafei: LLM Agents]
Nano vLLM
Effortless data labeling with AI support from Segment Anything and other awesome models.
All course materials for the Zero to Mastery Machine Learning and Data Science course.
🏝️ OASIS: Open Agent Social Interaction Simulations with One Million Agents.
Implement a reasoning LLM in PyTorch from scratch, step by step
🌟 100+ original LLM / RL algorithm maps 📚, from the author of the book "Large Model Algorithms"! 💥
⚡️SwanLab - an open-source, modern-design AI training tracking and visualization tool. Supports Cloud / Self-hosted use. Integrated with PyTorch / Transformers / verl / LLaMA Factory / ms-swift / Ultralytics / MMEngine / Keras etc.
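[Three Years of Interviews, Five Years of Mock Exams] An interview handbook for AIGC algorithm engineers. Covers practical interview and written-test experience and core knowledge across the AI industry, including AIGC, LLMs, AI agents, traditional deep learning, autonomous driving, machine learning, computer vision, natural language processing, reinforcement learning, big data mining, embodied AI, the metaverse, and AGI.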
Become a cracked AI/ML Research Engineer
GeoAI: Artificial Intelligence for Geospatial Data
Implementation of papers in 100 lines of code.
Open-Source AI Camera Skills Platform, AI NVR & CCTV Surveillance. Local VLM video analysis with Qwen, DeepSeek, SmolVLM, LLaVA, YOLO26. LLM-powered agentic security camera agent — watches, understands, remembers & guards your home via Telegram, Discord or Slack. Pluggable AI skills. OpenAI, Google, Anthropic or local AI. Runs on Mac Mini & AI PC.
Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods
The official repo for [NeurIPS'22] "ViTPose: Simple Vision Transformer Baselines for Human Pose Estimation" and [TPAMI'23] "ViTPose++: Vision Transformer for Generic Body Pose Estimation"
A Deep Learning Python Toolkit for Healthcare Applications.
A comprehensive list of papers for the definition of World Models and using World Models for General Video Generation, Embodied AI, and Autonomous Driving, including papers, codes, and related websites.
Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference • 24kHz audio quality • Converts text to Vietnamese speech
Automatically find issues in image datasets and practice data-centric computer vision.
[ECCV'24 & ICLR'25] CityGaussian Series for High-quality Large-Scale Scene Reconstruction with Gaussians
ML algorithms implemented and derived from first-principles in Jupyter Notebooks and NumPy
Open-source AI-driven quantitative trading platform for crypto, stocks, and forex with backtesting, live trading, market data, and multi-agent research.
UNetFormer: A UNet-like transformer for efficient semantic segmentation of remote sensing urban scene imagery, ISPRS. Also includes other vision transformers and CNNs for satellite, aerial, and UAV image segmentation.
A Lightweight PyTorch Framework for Recommendation Models, Easy-to-use and Easy-to-extend.
Official PyTorch implementation of Superpoint Transformer introduced in [ICCV'23] "Efficient 3D Semantic Segmentation with Superpoint Transformer" and SuperCluster introduced in [3DV'24 Oral] "Scalable 3D Panoptic Segmentation As Superpoint Graph Clustering"
CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
A Change Detection Repo Standing on the Shoulders of Giants