A Minimal, Self-Evolving Autonomous Agent Framework
~3K lines of seed code · 9 atomic tools · ~100-line Agent Loop
📌 Official Channel — This GitHub repository is the only official source of GenericAgent. We have no affiliation with any third-party website using the GenericAgent name.
GenericAgent is a minimal, self-evolving autonomous agent framework. Its core is just ~3K lines of code. Through 9 atomic tools + a ~100-line Agent Loop, it grants any LLM system-level control over a local computer — covering browser, terminal, filesystem, keyboard/mouse input, screen vision, and mobile devices (ADB).
Design philosophy — don't preload skills, evolve them.
Every time GenericAgent solves a new task, it automatically crystallizes the execution path into a reusable Skill. The longer you use it, the more skills accumulate — forming a personal skill tree grown entirely from 3K lines of seed code.
🤖 Self-Bootstrap Proof: Everything in this repository, from installing Git and running `git init` to every commit message, was completed autonomously by GenericAgent. The author never opened a terminal once.
- Key Features
- Demo Showcase
- Quick Start
- Usage
- Architecture
- Self-Evolution Mechanism
- Comparison
- Evaluation
- Roadmap & News
- Community & Support
- License
| Feature | Description |
|---|---|
| 🧬 Self-Evolving | Automatically crystallizes each task into a Skill. Capabilities grow with every use, forming your personal skill tree. |
| 🪶 Minimal Architecture | ~3K lines of core code. Agent Loop is ~100 lines. No complex dependencies, zero deployment overhead. |
| ⚡ Strong Execution | Injects into a real browser (preserving login sessions). 9 atomic tools take direct control of the system. |
| 🔌 High Compatibility | Supports Claude / Gemini / Kimi / MiniMax and other major models. Cross-platform. |
| 💰 Token Efficient | <30K context window — a fraction of the 200K–1M other agents consume. Less noise, fewer hallucinations, higher success rate, lower cost. |
⚠️ Python version: use Python 3.11 or 3.12. Do not use Python 3.14; it is incompatible with `pywebview` and a few other GA dependencies.
📖 Detailed installation guide: installation.md · installation_zh.md (中文)
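As a rough pre-flight check for the version note above, a guard along these lines could run before installation. It is only a sketch: the function name `python_status` and the three status labels are assumptions, and versions the note does not mention are reported as untested rather than guessed at.

```python
import sys

# Versions named in the note above; anything else is unverified here.
RECOMMENDED = {(3, 11), (3, 12)}
KNOWN_INCOMPATIBLE = {(3, 14)}


def python_status(version=None):
    """Classify a (major, minor) pair as 'recommended', 'incompatible', or 'untested'."""
    major_minor = tuple((version or sys.version_info)[:2])
    if major_minor in RECOMMENDED:
        return "recommended"
    if major_minor in KNOWN_INCOMPATIBLE:
        return "incompatible"
    return "untested"


if __name__ == "__main__":
    # Report the status of the interpreter running this script.
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {python_status()}")
```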
Fetch the installation guide and follow it:
curl -fsSL https://raw.githubusercontent.com/lsdefine/GenericAgent/refs/heads/main/docs/installation.md
The one-line installer sets up GenericAgent with an isolated Python environment and Git, then downloads a ready-to-run package.
Windows PowerShell
powershell -ExecutionPolicy Bypass -c "$env:GLOBAL=1; irm http://fudankw.cn:9000/files/ga_install.ps1 | iex"
Linux / macOS
GLOBAL=1 bash -c "$(curl -fsSL http://fudankw.cn:9000/files/ga_install.sh)"
After installation, launch the desktop app from:
frontends/GenericAgent.exe
git clone https://github.com/lsdefine/GenericAgent.git
cd GenericAgent
uv venv
uv pip install -e ".[ui]" # Core + UI dependencies
cp mykey_template.py mykey.py # Fill in your LLM API key
python launch.pyw
💡 GenericAgent is meant to grow its environment through the Agent itself, not by pre-installing every possible package.
📖 Full guide: docs/GETTING_STARTED.md
For one-line installs on Windows, double-click:
frontends/GenericAgent.exe
A lightweight, keyboard-driven interface built on Textual. Supports multiple concurrent sessions and real-time streaming.
python frontends/tuiapp_v2.py
⚠️ Windows TUI Troubleshooting
TUI rendering on Windows can be flaky depending on terminal + font. Common causes:
- `textual` is not on the latest version: run `pip install -U textual` first.
- PowerShell / cmd ship with terminals that have rough Unicode + key-binding support. Prefer Git Bash on Windows, which is much better behaved.
- If it still looks broken, ask GA itself to fix it:
"My experience using `frontends/tuiapp_v2.py` in PowerShell / cmd / Git Bash on Windows is very poor, with lots of incompatibility. Please refer to Claude Code's best practices for the Windows terminal and fix all font and rendering incompatibilities."
python launch.pyw
GenericAgent also supports IM frontends such as Telegram, WeChat, QQ, Feishu / Lark, WeCom, and DingTalk.
| Platform | Command |
|---|---|
| Telegram | python frontends/tgapp.py |
| WeChat | python frontends/wechatapp.py |
| QQ | python frontends/qqapp.py |
| Feishu / Lark | python frontends/fsapp.py |
| WeCom | python frontends/wecomapp.py |
| DingTalk | python frontends/dingtalkapp.py |
For detailed setup, ask GenericAgent itself.
| Command | Description |
|---|---|
| `/new` | Start a fresh conversation and clear the current context |
| `/continue` | List recoverable conversation snapshots |
| `/continue N` | Restore the N-th recoverable conversation |
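To make the command grammar in the table concrete, here is a minimal parsing sketch. The `Command` dataclass and `parse_command` function are hypothetical names used for illustration; the actual frontends may handle these commands differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Command:
    name: str                     # "new", "continue", or "chat"
    index: Optional[int] = None   # snapshot number for "/continue N"
    text: str = ""                # ordinary message text when name == "chat"


def parse_command(line: str) -> Command:
    """Map one line of user input onto the three commands in the table above."""
    parts = line.strip().split()
    if parts == ["/new"]:
        return Command("new")
    if parts == ["/continue"]:
        return Command("continue")
    if len(parts) == 2 and parts[0] == "/continue" and parts[1].isdecimal():
        return Command("continue", index=int(parts[1]))
    # Anything else is treated as a normal message for the agent.
    return Command("chat", text=line.strip())
```

A bare `/continue` leaves `index` as `None`, which a frontend could take as the cue to list snapshots instead of restoring one.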
GenericAgent accomplishes complex tasks through Layered Memory × Minimal Toolset × Autonomous Execution Loop, continuously accumulating experience during execution.
Memory crystallizes throughout task execution, letting the agent build stable, efficient working patterns over time.
| Layer | Name | Description |
|---|---|---|
| L0 | Meta Rules | Core behavioral rules and system constraints |
| L1 | Insight Index | Minimal memory index for fast routing and recall |
| L2 | Global Facts | Stable knowledge accumulated over long-term operation |
| L3 | Task Skills / SOPs | Reusable workflows for completing specific task types |
| L4 | Session Archive | Archived task records distilled from finished sessions for long-horizon recall |
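One way to picture the five layers is as a small keyed store in which L1 acts as the routing index for everything else. The sketch below is an assumption-laden illustration: the `Layer` enum, the `Memory` class, and the dict-per-layer layout are all hypothetical, not the project's actual storage format.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Layer(IntEnum):
    META_RULES = 0       # L0: core behavioral rules and system constraints
    INSIGHT_INDEX = 1    # L1: minimal index for routing and recall
    GLOBAL_FACTS = 2     # L2: stable long-term knowledge
    TASK_SKILLS = 3      # L3: reusable workflows / SOPs
    SESSION_ARCHIVE = 4  # L4: distilled records of finished sessions


@dataclass
class Memory:
    # One dict per layer, mapping a key to its stored value (hypothetical layout).
    layers: dict = field(default_factory=lambda: {layer: {} for layer in Layer})

    def write(self, layer: Layer, key: str, text: str) -> None:
        if layer is Layer.INSIGHT_INDEX:
            raise ValueError("the L1 index is maintained automatically")
        self.layers[layer][key] = text
        # Keep a pointer in L1 so later lookups need not scan every layer.
        self.layers[Layer.INSIGHT_INDEX][key] = layer

    def recall(self, key: str):
        """Route through the L1 index first, then fetch from the layer it names."""
        layer = self.layers[Layer.INSIGHT_INDEX].get(key)
        return None if layer is None else self.layers[layer][key]
```

Keeping only pointers in the index is one plausible reading of "minimal memory index": the routing layer stays small no matter how much the lower layers accumulate.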
Perceive environment state → Task reasoning → Execute tools → Write experience to memory → Loop
The entire core loop is just ~100 lines of code (agent_loop.py).
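The perceive, reason, execute, remember cycle can be sketched in a few lines. This is not the contents of `agent_loop.py`; the `agent_loop` signature, the `decide` callback standing in for the LLM call, and the string-based tool interface are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# A tool takes a string argument and returns a string observation.
Tool = Callable[[str], str]


def agent_loop(task: str,
               decide: Callable[[List[str]], Dict[str, str]],
               tools: Dict[str, Tool],
               memory: List[str],
               max_turns: int = 10) -> str:
    """Run perceive -> reason -> execute -> remember until `decide` says done."""
    for _ in range(max_turns):
        # Perceive and reason: the model sees the task plus remembered experience.
        action = decide([f"task: {task}"] + memory)
        if action["tool"] == "done":
            return action["arg"]
        # Execute the chosen tool and capture what happened.
        observation = tools[action["tool"]](action["arg"])
        # Write the experience to memory before looping.
        memory.append(f"{action['tool']}({action['arg']}) -> {observation}")
    return "stopped: turn limit reached"


def scripted_decide(view: List[str]) -> Dict[str, str]:
    # Stand-in for an LLM call: act once, then finish with the last observation.
    if len(view) == 1:
        return {"tool": "upper", "arg": "hello"}
    return {"tool": "done", "arg": view[-1]}


memory: List[str] = []
result = agent_loop("shout hello", scripted_decide, {"upper": str.upper}, memory)
```

The turn limit is a design choice of this sketch, added so that a model which never signals completion cannot loop forever.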
GenericAgent provides only 9 atomic tools, forming the foundational capabilities for interacting with the outside world.
| Tool | Function |
|---|---|
| `code_run` | Execute arbitrary code (Python / PowerShell) |
| `file_read` | Read files |
| `file_write` | Write / create / overwrite files |
| `file_patch` | Patch / modify files |
| `web_scan` | Perceive web content |
| `web_execute_js` | Control browser behavior |
| `ask_user` | Human-in-the-loop confirmation |
| `update_working_checkpoint` | (memory) Short-term working notepad |
| `start_long_term_update` | (memory) Distill long-term memory |
GenericAgent can also create new tools dynamically.
Via code_run, GenericAgent can dynamically install Python packages, write new scripts, call external APIs, or control hardware at runtime — crystallizing temporary abilities into permanent tools.
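A code-execution tool of this kind can be approximated with a child interpreter. The sketch below covers Python snippets only (the real tool also lists PowerShell), and `run_snippet` with its return shape is a hypothetical stand-in, not the signature of `code_run`.

```python
import subprocess
import sys


def run_snippet(source: str, timeout: float = 30.0) -> dict:
    """Run a Python snippet in a child interpreter and capture its output."""
    try:
        proc = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The child is killed on timeout; report it as a failed run.
        return {"ok": False, "stdout": "", "stderr": f"timed out after {timeout}s"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}
```

Returning stdout, stderr, and a success flag together gives the agent enough to decide whether to retry, debug, or keep the script as a new tool.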
Self-evolution is what fundamentally distinguishes GenericAgent from every other agent framework.
[New Task]
│
▼
[Autonomous Exploration] ─► install deps · write scripts · debug · verify
│
▼
[Crystallize into Skill] ─► write to memory layer
│
▼
[Direct Recall on Next Similar Task]
| What you say | First time | Every time after |
|---|---|---|
| "Read my WeChat messages" | Install deps → reverse DB → write read script → save Skill | one-line invoke |
| "Monitor stocks and alert me" | Install mootdx → build selection flow → configure cron → save Skill | one-line start |
| "Send this file via Gmail" | Configure OAuth → write send script → save Skill | ready to use |
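The "first time" versus "every time after" split can be illustrated with a tiny skill store. Everything here is a sketch under stated assumptions: the `SkillStore` class, the JSON-file layout, and the keyword-based recall are hypothetical and much simpler than whatever routing the memory layers actually use.

```python
import json
import tempfile
from pathlib import Path


class SkillStore:
    """Persist solved tasks as JSON files so later requests can reuse them."""

    def __init__(self, root: Path):
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def crystallize(self, name: str, triggers: list, steps: list) -> Path:
        # Save the steps that worked, plus keywords that should recall them.
        path = self.root / f"{name}.json"
        path.write_text(json.dumps({"triggers": triggers, "steps": steps}), encoding="utf-8")
        return path

    def recall(self, request: str):
        # Naive keyword match; a stand-in for the agent's real routing.
        words = set(request.lower().split())
        for path in sorted(self.root.glob("*.json")):
            skill = json.loads(path.read_text(encoding="utf-8"))
            if words & set(skill["triggers"]):
                return skill["steps"]
        return None  # no skill yet: explore from scratch


store = SkillStore(Path(tempfile.mkdtemp()) / "skills")
store.crystallize("send_gmail", ["gmail"], ["configure OAuth", "run send script"])
first = store.recall("Send this file via Gmail")
```

A request with no matching skill returns `None`, which corresponds to the autonomous-exploration branch in the diagram above.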
After a few weeks, your agent instance will have a skill tree no one else in the world has — all grown from 3K lines of seed code.
| Feature | GenericAgent | OpenClaw | Claude Code |
|---|---|---|---|
| Codebase | ~3K lines | ~530,000 lines | Open-sourced (large) |
| Deployment | `pip install` + API Key | Multi-service orchestration | CLI + subscription |
| Browser Control | Real browser (session preserved) | Sandbox / headless browser | Via MCP plugin |
| OS Control | Mouse/kbd, vision, ADB | Multi-agent delegation | File + terminal |
| Self-Evolution | Autonomous skill growth | Plugin ecosystem | Stateless between sessions |
| Out of the Box | Few core files + starter skills | Hundreds of modules | Rich CLI toolset |
📂 Full evaluation datasets and results: JinyiHan99/GA-Technical-Report
We evaluate GenericAgent across five dimensions:
| # | Dimension | Question | Benchmarks |
|---|---|---|---|
| 1 | Task Completion & Token Efficiency | Can GA complete hard tasks more cheaply than leading agents? | SOP-Bench, Lifelong AgentBench, RealFin-Benchmark |
| 2 | Tool-Use Efficiency | Can a minimal atomic toolset solve what specialized toolsets solve, with less overhead? | Tool Efficiency Benchmark (11 simple + 5 long-horizon) |
| 3 | Memory System Effectiveness | Does condensed hierarchical memory beat full/redundant memory and embedding-based retrievers? | SOP-Bench (dangerous goods), LoCoMo, 20-skill stress test |
| 4 | Self-Evolution Capability | Can the agent distill experience into reusable SOPs and code, without intervention? | 9-round LangChain longitudinal study, 8-task cross-task web benchmark |
| 5 | Web Browsing Capability | Does density-driven design survive the open web? | WebCanvas, BrowseComp-ZH, Custom Tasks (22) |
Baselines across these dimensions include Claude Code, OpenAI Codex, and OpenClaw, evaluated under Claude Sonnet 4.6, Claude Opus 4.6, GPT-5.4, and MiniMax M2.7 backbones.
- 2026-05-15: 🖥️ Desktop GUI released. One-line installs ship a ready-to-run desktop app (`frontends/GenericAgent.exe`). Developers launch via `python launch.pyw`.
- 2026-05-14: 🆕 Conductor sub-agent orchestration. Spawn, supervise, and auto-clean parallel sub-agents; first-class delegation primitives complementing `/btw` side-questions.
- 2026-05-12: 🆕 TUI v2 released (`frontends/tuiapp_v2.py`). Refined Textual frontend with image-paste folding, file paste, block-delete, Ctrl+C copy, history navigation, and `/llm` / `/export` / `/continue` pickers.
- 2026-04-21: 📄 Technical Report on arXiv, "GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization".
- 2026-04-11: Introduced L4 session archive memory and scheduler cron integration.
- 2026-03-23: Personal WeChat supported as a bot frontend.
- 2026-03-10: Released million-scale Skill Library.
- 2026-03-08: Released "Dintal Claw", a GenericAgent-powered government-affairs bot.
- 2026-03-01: Featured by Jiqizhixin (机器之心).
- 2026-01-16: GenericAgent V1.0 public release.
If this project helped you, please consider leaving a Star! 🙏
You're also welcome to join the GenericAgent Community Group for discussion, feedback, and co-building 👏
Thanks to the LinuxDo community for the support!
Community GUIs (independent open-source projects):
Distributed under the MIT License. See LICENSE for full text.
Disclaimer: This project does not build or operate any commercial website. Apart from DintalClaw, no institution, organization, or individual is currently officially authorized to conduct commercial activities under the GenericAgent name.
📖 Illustrated beginner's guide: Feishu document
📘 Full getting-started tutorial (by Datawhale): Hello GenericAgent · GitHub
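Figure: Tool-use efficiency radar chart. GA leads across the token, request-count, and tool-call axes while maintaining quality on all four task dimensions.
Figure: Cross-task self-evolution. GA's second and third runs converge to a stable low-cost regime on the 8 web tasks.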








