x-zheng16/Awesome-Embodied-AI-Safety
Safety in Embodied AI: A Survey of Risks, Attacks, and Defenses | 400+ Papers | Perception, Cognition, Planning, Interaction, Agentic System
Projects and tools focused on ensuring the safe deployment and use of AI.
AI Agent Governance Toolkit — Policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering for autonomous AI agents. Covers 10/10 OWASP Agentic Top 10.
An orchestration runtime for multi-agent AI systems. Declare agents, tools, and policies as YAML; Orloj schedules, executes, routes, and governs them for production-grade operation (a toy declare-then-govern sketch appears after this list).
Hundreds of free AI model quotas, with one-click access from local projects.
The Execution Security Layer for the Agentic Era. Providing deterministic "Sudo" governance and audit logs for autonomous AI agents.
Persistent Claude Code agents with scheduling, sessions, memory, and Telegram.
AISecOps (AI Security Operations) framework for deterministic verification of AI systems. QWED verifies LLM outputs using math, logic, and symbolic execution — creating an auditable trust boundary for agentic AI systems. Not generation. Verification.
Internal Safety Collapse: turning an LLM or AI agent into a sensitive-data generator.
One API for 20+ LLM providers, your databases, and your files — self-hosted, open-source AI gateway with RAG, voice, and guardrails.
[ICLR'24 Spotlight] A language model (LM)-based emulation framework for identifying the risks of LM agents with tool use
Centralized agent control plane for governing runtime agent behavior at scale. Configurable, extensible, and production-ready.
mkdir beats a vector DB. B-tree NeuronFS: 0-byte folders govern AI — ₩0 infrastructure, ~200x token efficiency. An OS-native constraint engine for LLM agents.
A curated list of awesome responsible machine learning resources.
Safe RLHF: Constrained Value Alignment via Safe Reinforcement Learning from Human Feedback
UQLM (Uncertainty Quantification for Language Models) is a Python package for UQ-based LLM hallucination detection (a generic sampling-based sketch appears after this list).
👾 Next-Gen Transparent Agent Architecture 🔍 Full behavioral auditing | 🛡️ Two-phase safe invocation | 🧠 Dual water-mark memory | ⏰ Heartbeat tasks | 📊 80% reduction in P0-level incidents | Compatible with the OpenClaw + Claude Code skills ecosystem
Secrets of RLHF in Large Language Models Part I: PPO
Deliver safe & effective language models
The open agent control plane. Govern autonomous AI agents with pre-execution policy enforcement, approval gates, and audit trails. Works with LangChain, CrewAI, MCP, and any framework (a generic policy-gate sketch appears after this list).
Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning)
🦀 Prevents outdated Rust code suggestions from AI assistants. This MCP server fetches current crate docs, uses embeddings/LLMs, and provides accurate context via a tool call.
An unrestricted attack based on diffusion models that can achieve both good transferability and imperceptibility.
[ACL 2026 Main] AgentMark: Utility-Preserving Behavioral Watermarking for Agents
AI Agent Security Middleware — 8-layer defense, DLP data flow, prompt injection detection, zero dependencies. SDK + OpenClaw plugin.
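
Several of the control-plane entries above describe the same pre-execution policy-gate pattern: intercept a proposed tool call, evaluate it against declarative rules, route sensitive actions through an approval gate, and append every decision to an audit trail. Below is a minimal, framework-agnostic sketch of that pattern; all names (`PolicyGate`, `Decision`, the example rules) are illustrative and are not taken from any project listed here.

```python
import json
import time
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    NEEDS_APPROVAL = "needs_approval"


@dataclass
class Rule:
    """A single policy rule: a predicate over a proposed tool call."""
    name: str
    matches: Callable[[str, dict], bool]
    decision: Decision


class PolicyGate:
    """Pre-execution gate: every tool call passes through check() first,
    and every decision is appended to a JSON-lines audit trail."""

    def __init__(self, rules: list[Rule], audit_path: str = "audit.jsonl"):
        self.rules = rules
        self.audit_path = audit_path

    def check(self, tool: str, args: dict) -> Decision:
        decision = Decision.ALLOW  # default-allow; invert for zero-trust
        rule_name = "default"
        for rule in self.rules:
            if rule.matches(tool, args):
                decision, rule_name = rule.decision, rule.name
                break  # first matching rule wins
        self._audit(tool, args, decision, rule_name)
        return decision

    def _audit(self, tool, args, decision, rule_name) -> None:
        record = {"ts": time.time(), "tool": tool, "args": args,
                  "decision": decision.value, "rule": rule_name}
        with open(self.audit_path, "a") as f:
            f.write(json.dumps(record) + "\n")


# Illustrative rules: block shell access outright, route file deletion
# through a human approval gate, allow everything else.
rules = [
    Rule("no_shell", lambda t, a: t == "shell", Decision.DENY),
    Rule("approve_deletes", lambda t, a: t == "fs" and a.get("op") == "delete",
         Decision.NEEDS_APPROVAL),
]

gate = PolicyGate(rules)
print(gate.check("shell", {"cmd": "rm -rf /"}))            # Decision.DENY
print(gate.check("fs", {"op": "delete", "path": "/tmp"}))  # Decision.NEEDS_APPROVAL
print(gate.check("search", {"q": "weather"}))              # Decision.ALLOW
```

Note the default-allow fallthrough is only for readability; the zero-trust systems above would invert it to default-deny and typically ship the audit log to tamper-evident storage.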
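One common black-box approach to UQ-based hallucination detection, as in the UQLM entry above, is sampling-based self-consistency: ask the model the same question several times at nonzero temperature and treat disagreement among the answers as an uncertainty signal. A hedged sketch follows; it is not UQLM's API, and `sample_answer` is a stub standing in for any real LLM client call.

```python
from collections import Counter


def sample_answer(question: str) -> str:
    """Stand-in for an LLM call at temperature > 0.
    Replace with a real client call in practice."""
    import random
    return random.choice(["Paris", "Paris", "Paris", "Lyon"])


def consistency_score(question: str, n: int = 8) -> float:
    """Fraction of sampled answers agreeing with the modal answer.
    1.0 means full agreement; low scores flag likely hallucination."""
    answers = [sample_answer(question).strip().lower() for _ in range(n)]
    _, modal_count = Counter(answers).most_common(1)[0]
    return modal_count / n


score = consistency_score("What is the capital of France?")
if score < 0.6:  # threshold is a tunable assumption
    print(f"low confidence ({score:.2f}): flag for review")
else:
    print(f"high confidence ({score:.2f})")
```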
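The Orloj entry describes a declare-then-govern loop: agents, tools, and policies are declared as data, and a runtime schedules, routes, and polices work among them. The sketch below shows the shape of that loop under stated assumptions: the YAML schema is invented for illustration (it is not Orloj's actual format), and PyYAML is assumed for parsing.

```python
import yaml  # requires PyYAML; the schema below is illustrative, not Orloj's

CONFIG = """
agents:
  - name: researcher
    tools: [search]
  - name: writer
    tools: [draft]
policies:
  deny_tools: [shell]
"""


def run(config_text: str, task_queue: list[tuple[str, str]]) -> None:
    """Tiny declare-then-govern loop: route each (agent, tool) task,
    enforcing the declared policy before dispatch."""
    cfg = yaml.safe_load(config_text)
    allowed = {a["name"]: set(a["tools"]) for a in cfg["agents"]}
    denied = set(cfg["policies"]["deny_tools"])

    for agent, tool in task_queue:
        if tool in denied:
            print(f"{agent}: '{tool}' denied by policy")
        elif tool not in allowed.get(agent, set()):
            print(f"{agent}: '{tool}' not declared for this agent")
        else:
            print(f"{agent}: dispatching '{tool}'")  # a real runtime executes here


run(CONFIG, [("researcher", "search"), ("writer", "shell"), ("writer", "search")])
```

Keeping the declaration as data rather than code is what lets a runtime diff, validate, and audit policy changes independently of agent logic.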