~3K lines of seed code · 9 atomic tools · ~100-line Agent Loop

A truly self-evolving autonomous agent framework

Through 9 atomic tools + a ~100-line Agent Loop, GenericAgent grants any LLM system-level control of a local computer — browser, terminal, filesystem, keyboard/mouse, screen vision, and mobile devices (ADB).
Don't preload skills — evolve them.

📄 arXiv Technical Report 📘 Datawhale Tutorial 🧩 Skill Hub

ga · install

# One-line install (Linux / macOS)
$ GLOBAL=1 bash -c "$(curl -fsSL \
    http://fudankw.cn:9000/files/ga_install.sh)"

# Or: developer mode
$ git clone https://github.com/lsdefine/GenericAgent.git
$ cd GenericAgent
$ uv venv
$ uv pip install -e ".[ui]"
$ python launch.pyw

✓ Agent ready · waiting for your first task

Layered Memory

L0 Meta Rules
L1 Insight Index
L2 Global Facts
L3 Task Skills
L4 Session Archive

Recommended

One-line Install

Sets up an isolated Python env, Git, and the desktop app — no system pollution.

Linux / macOS

GLOBAL=1 bash -c "$(curl -fsSL \
  http://fudankw.cn:9000/files/ga_install.sh)"

Windows PowerShell

powershell -ExecutionPolicy Bypass -c "$env:GLOBAL=1; `
  irm http://fudankw.cn:9000/files/ga_install.ps1 | iex"

Developer

Python Install

Clone the source, install core + UI deps, and add your LLM API key.

shell

git clone https://github.com/lsdefine/GenericAgent.git
cd GenericAgent
uv venv
uv pip install -e ".[ui]"
cp mykey_template.py mykey.py   # add your API key
python launch.pyw

⚙ Python 3.11 / 3.12 recommended (do not use 3.14).
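If you want to confirm the interpreter before installing, a small check along these lines may help. It is only a sketch: the helper name `is_recommended_python` is hypothetical, and the accepted versions simply mirror the note above.

```python
import sys

# Versions taken from the recommendation above; the helper name is hypothetical.
RECOMMENDED = {(3, 11), (3, 12)}


def is_recommended_python(version=None):
    """Return True if the (major, minor) version is one of the recommended ones."""
    major_minor = tuple((version or sys.version_info)[:2])
    return major_minor in RECOMMENDED


if not is_recommended_python():
    print(f"Warning: Python {sys.version_info[0]}.{sys.version_info[1]} "
          "is outside the recommended 3.11 / 3.12 range.")
```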

⟩ Core Features

GenericAgent · Six Core Features

A minimal seed, strong execution, and capabilities that grow as you use it — an agent that hands complexity to evolution.

🧬

Self-Evolving

Crystallizes each task's execution path into a Skill. Capabilities grow with every use, forming your personal skill tree.

🪶

Minimal Architecture

~3K lines of core code; the Agent Loop is ~100 lines. No heavy dependencies, zero deployment overhead.

⚡

Strong Execution

Injects into a real browser (keeps your login session). 9 atomic tools take direct control — browser, terminal, keyboard/mouse, vision, ADB.

🔌

High Compatibility

Supports Claude / Gemini / Kimi / MiniMax and other major models. Cross-platform on Windows / macOS / Linux.

💰

Token-Efficient

Runs in a context window under 30K tokens, a fraction of the 200K–1M that other agents use. Less noise, fewer hallucinations, higher success rate.

🤖

Self-Bootstrap Proof

Everything in this repo — from installing Git and git init to every commit — was done autonomously by GenericAgent. The author never opened a terminal.

⟩ Architecture

Layered Memory × Minimal Toolset × Autonomous Loop

Three pillars work together to complete complex tasks while continuously accumulating experience.

Autonomous Execution Loop

~100 lines

Perceive → Reason → Execute → Memorize

Perceive environment → reason → call tools → write experience to memory → loop.
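The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not GenericAgent's actual loop: `Action`, `Memory`, `run_loop`, and the `policy` callable (which stands in for the LLM call) are all hypothetical names.

```python
from dataclasses import dataclass, field

# All names here are hypothetical and exist only to illustrate the
# perceive -> reason -> execute -> memorize cycle.


@dataclass
class Action:
    tool: str                                  # tool to call, or "done" to stop
    args: dict = field(default_factory=dict)


@dataclass
class Memory:
    notes: list = field(default_factory=list)

    def write(self, entry: str) -> None:
        self.notes.append(entry)


def run_loop(task, policy, tools, memory, max_turns=10):
    observation = f"task: {task}"                     # Perceive (initial state)
    for _ in range(max_turns):
        action = policy(observation, memory)          # Reason (LLM stand-in)
        if action.tool == "done":
            return action.args.get("result", "")
        result = tools[action.tool](**action.args)    # Execute a tool call
        memory.write(f"{action.tool}({action.args}) -> {result}")  # Memorize
        observation = str(result)                     # Perceive the outcome
    return "max turns reached"


# Usage with a scripted policy in place of a real model.
def scripted_policy(observation, memory):
    if not memory.notes:
        return Action("add", {"a": 2, "b": 3})
    return Action("done", {"result": observation})


memory = Memory()
answer = run_loop("add 2 and 3", scripted_policy, {"add": lambda a, b: a + b}, memory)
```

The `max_turns` cap is a design choice of this sketch; it keeps a misbehaving policy from looping forever.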

9 Atomic Tools

code_run	Run any code
file_read	Read files
file_write	Write files
file_patch	Patch files
web_scan	Perceive web
web_execute_js	Control browser
ask_user	Human-in-the-loop
update_working_checkpoint	Working notepad
start_long_term_update	Distill long-term memory

Via code_run, install packages, write scripts, and call external APIs — crystallizing temporary abilities into permanent tools.
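One way to picture a small atomic toolset is a name-to-function registry that the loop dispatches into. The sketch below covers four of the nine tools; the tool names come from the list above, while the signatures, return strings, and the `dispatch` helper are assumptions made for illustration.

```python
import subprocess
import sys
from pathlib import Path

# Tool names follow the list above. Signatures and behaviour are assumed
# for this sketch and are not GenericAgent's actual implementation.


def code_run(code: str, timeout: float = 30.0) -> str:
    """Run a Python snippet in a subprocess and return stdout plus stderr."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    return proc.stdout + proc.stderr


def file_read(path: str) -> str:
    return Path(path).read_text(encoding="utf-8")


def file_write(path: str, content: str) -> str:
    Path(path).write_text(content, encoding="utf-8")
    return f"wrote {len(content)} chars"


def file_patch(path: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of `old`; reject ambiguous patches."""
    text = file_read(path)
    matches = text.count(old)
    if matches != 1:
        return f"patch rejected: found {matches} matches"
    file_write(path, text.replace(old, new))
    return "patched"


TOOLS = {
    "code_run": code_run,
    "file_read": file_read,
    "file_write": file_write,
    "file_patch": file_patch,
}


def dispatch(name: str, **kwargs) -> str:
    """Look up a tool by name and call it with keyword arguments."""
    if name not in TOOLS:
        return f"unknown tool: {name}"
    return TOOLS[name](**kwargs)
```

Because `code_run` can execute arbitrary code, a general-purpose tool like this can stand in for many specialized ones, which is the trade-off the minimal toolset relies on.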

Layer	Name	Description
L0	Meta Rules	Core behavioral rules and system constraints
L1	Insight Index	Minimal index layer for fast routing and recall
L2	Global Facts	Stable knowledge accumulated over long-term operation
L3	Task Skills / SOPs	Reusable workflows for completing specific task types
L4	Session Archive	Archived records distilled from finished sessions for long-horizon recall
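The layering can be sketched as a small data structure in which the L1 index routes a request to only the L3 skills it needs, so the prompt stays compact. The layer names follow the table; the field types, the keyword-matching rule, and the method names are assumptions for illustration.

```python
from dataclasses import dataclass, field

# Layer names follow the table above. Fields and lookup logic are
# hypothetical and only illustrate index-based routing.


@dataclass
class LayeredMemory:
    meta_rules: list[str] = field(default_factory=list)          # L0
    insight_index: dict[str, str] = field(default_factory=dict)  # L1: keyword -> skill name
    global_facts: dict[str, str] = field(default_factory=dict)   # L2
    task_skills: dict[str, str] = field(default_factory=dict)    # L3: skill name -> SOP text
    session_archive: list[str] = field(default_factory=list)     # L4

    def add_skill(self, name: str, keywords: list[str], sop: str) -> None:
        """Store an SOP in L3 and register its keywords in the L1 index."""
        self.task_skills[name] = sop
        for keyword in keywords:
            self.insight_index[keyword.lower()] = name

    def route(self, request: str) -> list[str]:
        """Return the SOPs whose index keywords appear in the request."""
        text = request.lower()
        names = {name for kw, name in self.insight_index.items() if kw in text}
        return [self.task_skills[name] for name in sorted(names)]

    def build_context(self, request: str) -> str:
        """Assemble a prompt from always-on L0 rules plus routed L3 entries."""
        return "\n".join(list(self.meta_rules) + self.route(request))
```

Substring matching is the simplest routing rule that works for a sketch; a real index would likely need something more robust.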

⟩ Self-Evolution

Say it once, learn it for life

This is what fundamentally sets GenericAgent apart from other agent frameworks.

① New task → ② Explore autonomously (install deps · write scripts · debug) → ③ Crystallize into Skill (write to memory) → ④ Recall next time (just one sentence)

What you say	First time	Every time after
“Read my WeChat messages”	install deps → reverse the DB → write read script → save Skill	one-line call
“Monitor stocks and alert me”	install mootdx → build screening flow → configure cron → save Skill	one-line start
“Send this file via Gmail”	configure OAuth → write send script → save Skill	ready to use
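The first-time versus every-time-after pattern in the table can be sketched as a store that saves the steps found during exploration and returns them directly on later requests. `SkillStore`, its JSON file layout, the `handle` function, and the exact-match lookup are all hypothetical simplifications.

```python
import json
from pathlib import Path

# SkillStore, its JSON layout, and handle() are hypothetical names used
# to illustrate "explore once, recall afterwards".


class SkillStore:
    def __init__(self, path):
        self.path = Path(path)
        if self.path.exists():
            self.skills = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.skills = {}

    def crystallize(self, request, steps):
        """Persist the steps that solved this request so they survive restarts."""
        self.skills[request] = steps
        self.path.write_text(
            json.dumps(self.skills, ensure_ascii=False, indent=2),
            encoding="utf-8",
        )

    def recall(self, request):
        """Return saved steps, or None if this request has not been seen."""
        return self.skills.get(request)


def handle(request, store, explore):
    steps = store.recall(request)
    if steps is not None:
        return "recalled", steps          # fast path: skill already exists
    steps = explore(request)              # slow path: the agent works it out
    store.crystallize(request, steps)
    return "explored", steps
```

Here `explore` stands in for the whole autonomous exploration phase, and requests are matched by exact string, which a real system would presumably replace with fuzzier recall.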

After a few weeks, your agent will have a skill tree no one else in the world has — all grown from 3K lines of seed code.

⟩ Showcase

Already getting real work done

From food delivery to stock screening — it really drives your apps and system.

🧋 Food Delivery Order: “Order me a milk tea.” Navigates the delivery app, picks items, and checks out.

📈 Quantitative Screening: “GEM stocks with EXPMA golden cross, turnover > 5%.” Runs a quantitative screen.

🌐 Autonomous Web Exploration: Autonomously browses and periodically summarizes web content.

💰 Expense Tracking: “Find expenses over ¥2K in the last 3 months.” Drives Alipay via ADB.

⟩ Comparison

Lighter, cheaper, and it grows

Feature	GenericAgent	OpenClaw	Claude Code
Codebase	~3K lines	~530,000 lines	Open-sourced (large)
Deployment	pip install + API Key	Multi-service orchestration	CLI + subscription
Browser Control	Real browser (session preserved)	Sandbox / headless	Via MCP plugin
OS Control	Mouse/kbd, vision, ADB	Multi-agent delegation	File + terminal
Self-Evolution	Autonomous skill & tool growth	Plugin ecosystem	Stateless between sessions
Out of the Box	Few core files + starter skills	Hundreds of modules	Rich CLI toolset

⟩ Benchmarks

Five dimensions, data-backed

Baselines include Claude Code, OpenAI Codex, and OpenClaw, evaluated on Claude Sonnet 4.6 / Opus 4.6 / GPT-5.4 / MiniMax M2.7 backbones.

Tool-use efficiency radar: GA leads on token, request, and tool-call axes.

Cross-task self-evolution convergence: 2nd and 3rd runs converge to a stable low-cost regime.

1. Task Completion & Token Efficiency
Can GA complete hard tasks more cheaply? · SOP-Bench, Lifelong AgentBench, RealFin
2. Tool-Use Efficiency
Can a minimal atomic toolset replace specialized ones? · Tool Efficiency Benchmark
3. Memory System Effectiveness
Does condensed hierarchical memory beat redundant memory & embedding retrieval? · LoCoMo, 20-skill stress test
4. Self-Evolution Capability
Can it distill reusable SOPs without intervention? · 9-round LangChain longitudinal study
5. Web Browsing Capability
Does density-driven design survive the open web? · WebCanvas, BrowseComp-ZH