diff --git a/AI_GUIDE.md b/AI_GUIDE.md
index 92d8d0f..5831d29 100644
--- a/AI_GUIDE.md
+++ b/AI_GUIDE.md
@@ -46,17 +46,20 @@ Run these commands and report results to the user:
 ```bash
 python3 --version # Need 3.10+
 nvidia-smi # Need at least 1 GPU
-echo $ANTHROPIC_API_KEY # Need Anthropic key
-echo $OPENAI_API_KEY # OR OpenAI key (either works)
+echo $ANTHROPIC_API_KEY # Anthropic-compatible key, if using provider=anthropic
+echo $OPENAI_API_KEY # OpenAI-compatible key, if using provider=openai
 ```
 
 If Python < 3.10: suggest `conda create -n dra python=3.11 -y && conda activate dra`
 
 If no GPU: this framework requires a GPU for training. Suggest cloud GPU (Lambda Labs, RunPod, Vast.ai).
 
-If no API key: guide them to:
+If no API key: guide them to either an official endpoint or a compatible provider:
 - Anthropic: https://console.anthropic.com/ → API Keys → Create Key
 - OpenAI: https://platform.openai.com/api-keys → Create new secret key
+- Qwen / DashScope: create `DASHSCOPE_API_KEY`
+- GLM / BigModel: create `ZHIPUAI_API_KEY`
+- MiniMax: create `MINIMAX_API_KEY`
 
 Then set it:
 ```bash
@@ -79,7 +82,7 @@ cd auto-deep-researcher-24x7
 # Install dependencies
 pip install -r requirements.txt
 
-# Install Claude Code / Codex skills (8 slash commands)
+# Install Claude slash commands + Codex local skills
 python install.py
 
 # Verify
@@ -88,15 +91,17 @@ python -m core.loop --check
 
 **Expected output:**
 ```
- ✓ /auto-experiment
- ✓ /experiment-status
- ✓ /gpu-monitor
- ✓ /daily-papers
- ✓ /paper-analyze
- ✓ /conf-search
- ✓ /progress-report
- ✓ /obsidian-sync
- Done! 8 skills installed.
+ ✓ Claude /auto-experiment
+ ✓ Claude /experiment-status
+ ✓ Claude /gpu-monitor
+ ✓ Claude /daily-papers
+ ✓ Claude /paper-analyze
+ ✓ Claude /conf-search
+ ✓ Claude /progress-report
+ ✓ Claude /obsidian-sync
+ ✓ Codex $auto-experiment
+ ...
+ Done! 8 Claude commands and 8 Codex skills installed.
 ```
 
 ### Step 3: Choose Your LLM Provider
@@ -109,8 +114,8 @@ Ask the user two questions:
 
 | Provider value | Vendor | Billing | Auth |
 |----------------|--------|---------|------|
-| `anthropic` | Anthropic | Per-token API | `ANTHROPIC_API_KEY` env var |
-| `openai` | OpenAI | Per-token API | `OPENAI_API_KEY` env var |
+| `anthropic` | Anthropic-compatible | Per-token API | `ANTHROPIC_API_KEY` or custom env |
+| `openai` | OpenAI-compatible | Per-token API | `OPENAI_API_KEY` or custom env |
 | `claude_cli` | Anthropic | **Flat-rate subscription** | `claude` CLI installed + logged in |
 | `codex_cli` | OpenAI | **Flat-rate subscription** | `codex` CLI installed + logged in |
 
@@ -126,6 +131,36 @@ Default is `anthropic`. To switch, edit `config.yaml`:
 agent:
   provider: "openai" # or "anthropic" / "claude_cli" / "codex_cli"
   model: "codex-5.3" # or claude-sonnet-4-6 / claude-opus-4-6 / gpt-5.4
+  base_url: "" # optional compatible endpoint override
+  api_key_env: "" # optional custom key env var
+  auth_token_env: "" # optional custom bearer token env var
+```
+
+Compatible API examples
+(illustrative only in this repo — these endpoint/model combinations have not
+been live-smoke-tested here):
+
+```yaml
+# Qwen / DashScope
+agent:
+  provider: "openai"
+  model: "qwen-plus"
+  base_url: "https://dashscope.aliyuncs.com/compatible-mode/v1"
+  api_key_env: "DASHSCOPE_API_KEY"
+
+# GLM / BigModel
+agent:
+  provider: "openai"
+  model: "glm-4.5"
+  base_url: "https://open.bigmodel.cn/api/paas/v4"
+  api_key_env: "ZHIPUAI_API_KEY"
+
+# MiniMax via OpenAI-compatible endpoint
+agent:
+  provider: "openai"
+  model: "MiniMax-M1"
+  base_url: "https://api.minimaxi.com/v1"
+  api_key_env: "MINIMAX_API_KEY"
 ```
 
 Optional SSH execution mode:
diff --git a/CLAUDE.md b/CLAUDE.md
index 8bda4ba..6a2567e 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -98,17 +98,20 @@ Run these commands and report results to the user:
 ```bash
 python3 --version # Need 3.10+
 nvidia-smi # Need at least 1 GPU
-echo $ANTHROPIC_API_KEY # Need Anthropic key
-echo $OPENAI_API_KEY # OR OpenAI key (either works)
+echo $ANTHROPIC_API_KEY # Anthropic-compatible key, if using provider=anthropic
+echo $OPENAI_API_KEY # OpenAI-compatible key, if using provider=openai
 ```
 
 If Python < 3.10: suggest `conda create -n dra python=3.11 -y && conda activate dra`
 
 If no GPU: this framework requires a GPU for training. Suggest cloud GPU (Lambda Labs, RunPod, Vast.ai).
 
-If no API key: guide them to:
+If no API key: guide them to either an official endpoint or a compatible provider:
 - Anthropic: https://console.anthropic.com/ → API Keys → Create Key
 - OpenAI: https://platform.openai.com/api-keys → Create new secret key
+- Qwen / DashScope: create `DASHSCOPE_API_KEY`
+- GLM / BigModel: create `ZHIPUAI_API_KEY`
+- MiniMax: create `MINIMAX_API_KEY`
 
 Then set it:
 ```bash
@@ -131,7 +134,7 @@ cd auto-deep-researcher-24x7
 # Install dependencies
 pip install -r requirements.txt
 
-# Install Claude Code / Codex skills (8 slash commands)
+# Install Claude slash commands + Codex local skills
 python install.py
 
 # Verify
@@ -140,15 +143,17 @@ python -m core.loop --check
 
 **Expected output:**
 ```
- ✓ /auto-experiment
- ✓ /experiment-status
- ✓ /gpu-monitor
- ✓ /daily-papers
- ✓ /paper-analyze
- ✓ /conf-search
- ✓ /progress-report
- ✓ /obsidian-sync
- Done! 8 skills installed.
+ ✓ Claude /auto-experiment
+ ✓ Claude /experiment-status
+ ✓ Claude /gpu-monitor
+ ✓ Claude /daily-papers
+ ✓ Claude /paper-analyze
+ ✓ Claude /conf-search
+ ✓ Claude /progress-report
+ ✓ Claude /obsidian-sync
+ ✓ Codex $auto-experiment
+ ...
+ Done! 8 Claude commands and 8 Codex skills installed.
 ```
 
 ### Step 3: Choose Your LLM Provider
@@ -161,8 +166,8 @@ Ask the user two questions:
 
 | Provider value | Vendor | Billing | Auth |
 |----------------|--------|---------|------|
-| `anthropic` | Anthropic | Per-token API | `ANTHROPIC_API_KEY` env var |
-| `openai` | OpenAI | Per-token API | `OPENAI_API_KEY` env var |
+| `anthropic` | Anthropic-compatible | Per-token API | `ANTHROPIC_API_KEY` or custom env |
+| `openai` | OpenAI-compatible | Per-token API | `OPENAI_API_KEY` or custom env |
 | `claude_cli` | Anthropic | **Flat-rate subscription** | `claude` CLI installed + logged in |
 | `codex_cli` | OpenAI | **Flat-rate subscription** | `codex` CLI installed + logged in |
 
@@ -178,6 +183,36 @@ Default is `anthropic`. To switch, edit `config.yaml`:
 agent:
   provider: "openai" # or "anthropic" / "claude_cli" / "codex_cli"
   model: "codex-5.3" # or claude-sonnet-4-6 / claude-opus-4-6 / gpt-5.4
+  base_url: "" # optional compatible endpoint override
+  api_key_env: "" # optional custom key env var
+  auth_token_env: "" # optional custom bearer token env var
+```
+
+Compatible API examples
+(illustrative only in this repo — these endpoint/model combinations have not
+been live-smoke-tested here):
+
+```yaml
+# Qwen / DashScope
+agent:
+  provider: "openai"
+  model: "qwen-plus"
+  base_url: "https://dashscope.aliyuncs.com/compatible-mode/v1"
+  api_key_env: "DASHSCOPE_API_KEY"
+
+# GLM / BigModel
+agent:
+  provider: "openai"
+  model: "glm-4.5"
+  base_url: "https://open.bigmodel.cn/api/paas/v4"
+  api_key_env: "ZHIPUAI_API_KEY"
+
+# MiniMax via OpenAI-compatible endpoint
+agent:
+  provider: "openai"
+  model: "MiniMax-M1"
+  base_url: "https://api.minimaxi.com/v1"
+  api_key_env: "MINIMAX_API_KEY"
 ```
 
 **Subscription mode (`claude_cli` / `codex_cli`)** shells out to the headless
diff --git a/README.md b/README.md
index 131b573..6127e44 100644
--- a/README.md
+++ b/README.md
@@ -37,6 +37,14 @@
 
 ## Recent Updates
 
+**2026-04-22**
+- Added explicit compatible-API configuration for SDK providers: `agent.base_url`, `agent.api_key_env`, and `agent.auth_token_env`.
+- `provider: "openai"` now cleanly covers OpenAI-compatible endpoints such as Qwen / GLM / MiniMax, without adding provider-specific branches.
+- Added Codex local skill installation alongside Claude Code slash-command installation. `python install.py` now installs into both `~/.codex/skills` and `~/.claude/commands`.
+- Added `agents/openai.yaml` metadata for all built-in skills, and normalized the source `SKILL.md` files so the repo-level skills are Codex-compatible without relying on install-time frontmatter cleanup.
+- Hardened the installer against half-installed state: it now preflights Codex skill ownership before writing Claude-side artifacts, and refuses to overwrite unowned local Codex skills.
+- Updated `README.md`, `AI_GUIDE.md`, `CLAUDE.md`, `config.yaml`, and skill docs with compatible API and dual Claude/Codex skill guidance.
+
 **2026-04-21**
 - Added an optional `execution.mode: "ssh"` backend so the controller can stay local while code edits, shell commands, training, log reads, PID checks, and GPU queries run on one remote host.
 - Controller state remains local in SSH mode: `PROJECT_BRIEF.md`, `workspace/MEMORY_LOG.md`, `workspace/state.json`, `workspace/HUMAN_DIRECTIVE.md`, and local progress / Obsidian exports.
@@ -83,7 +91,7 @@ Prefer AI-guided setup? Open [`AI_GUIDE.md`](AI_GUIDE.md) in Claude / ChatGPT /
 |-------------|----------|-------|
 | Python 3.10+ | Yes | Runtime |
 | 1+ NVIDIA GPU | Yes | For training |
-| API key | Yes | Anthropic or OpenAI |
+| API key | Yes | Anthropic-compatible or OpenAI-compatible endpoint |
 | `PROJECT_BRIEF.md` | Yes | Main control file |
 | Project `config.yaml` | Optional | Only if you want to override defaults |
 | Obsidian vault | Optional | If absent, notes fall back to local text files |
@@ -386,7 +394,7 @@ cd auto-deep-researcher-24x7
 # Install Python dependencies
 pip install -r requirements.txt
 
-# Install 8 slash commands into Claude Code
+# Install 8 Claude slash commands and 8 Codex local skills
 python install.py
 
 # Verify everything works
@@ -398,17 +406,18 @@ You should see:
 Deep Researcher Agent — Installer
 ========================================
 
- ✓ /auto-experiment
- ✓ /experiment-status
- ✓ /gpu-monitor
- ✓ /daily-papers
- ✓ /paper-analyze
- ✓ /conf-search
- ✓ /progress-report
-
- ✓ /obsidian-sync
-
- Done! 8 skills installed.
+ ✓ Claude /auto-experiment
+ ✓ Claude /experiment-status
+ ✓ Claude /gpu-monitor
+ ✓ Claude /daily-papers
+ ✓ Claude /paper-analyze
+ ✓ Claude /conf-search
+ ✓ Claude /progress-report
+ ✓ Claude /obsidian-sync
+ ✓ Codex $auto-experiment
+ ...
+
+ Done! 8 Claude commands and 8 Codex skills installed.
 ```
 
 ### Step 2: Create Your First Project
@@ -811,15 +820,18 @@ Yes. The agent works with any training framework. It just launches shell command
 
 ---
 
-## One-Click Install (Claude Code Skills)
+## One-Click Install (Claude + Codex)
 
-All features are packaged as Claude Code slash commands. **One command to install:**
+All features are packaged as Claude Code slash commands and Codex local skills.
+**One command to install:**
 
 ```bash
 python install.py
 ```
 
-After installation, you get **8 slash commands** in Claude Code:
+After installation, you get:
+- **8 slash commands** in Claude Code
+- **8 local skills** in Codex (restart Codex after install)
 
 ### Core Skills
 
@@ -845,9 +857,12 @@ After installation, you get **8 slash commands** in Claude Code:
 # Step 1: Install skills (one time)
 python install.py
 
-# Step 2: In Claude Code, launch an experiment loop
+# Step 2a: In Claude Code, launch an experiment loop
 /auto-experiment --project /path/to/my_project --gpu 0
 
+# Step 2b: In Codex, use the matching local skill
+$auto-experiment
+
 # Step 3: Check how it's going
 /experiment-status --project /path/to/my_project
 
@@ -868,8 +883,9 @@ python install.py --uninstall
 
 ## Supported LLM Providers
 
-Works with **both Anthropic and OpenAI** out of the box, and can run on a
-**flat-rate subscription** instead of per-token billing via the local CLIs.
+Works with **Anthropic-compatible and OpenAI-compatible APIs** out of the box,
+and can also run on a **flat-rate subscription** instead of per-token billing
+via the local CLIs.
 
 | Tier | Anthropic (Claude) | OpenAI (Codex/GPT) | Best For |
 |------|-------------------|-------------------|----------|
@@ -880,8 +896,8 @@ Works with **both Anthropic and OpenAI** out of the box, and can run on a
 
 | Mode | `provider` value | Billing | Requires | Tool-use support |
 |------|------------------|---------|----------|------------------|
-| API — Anthropic | `anthropic` | Per-token, via `ANTHROPIC_API_KEY` | `pip install anthropic` | ✅ Full |
-| API — OpenAI | `openai` | Per-token, via `OPENAI_API_KEY` | `pip install openai` | ✅ Full |
+| API — Anthropic-compatible | `anthropic` | Per-token, via `ANTHROPIC_API_KEY` or custom env | `pip install anthropic` | ✅ Full |
+| API — OpenAI-compatible | `openai` | Per-token, via `OPENAI_API_KEY` or custom env | `pip install openai` | ✅ Full |
 | **Subscription — Claude** | `claude_cli` | Flat-rate, uses your Claude Code / Pro / Max plan | `claude` CLI installed and logged in | ✅ Full |
 | **Subscription — ChatGPT** | `codex_cli` | Flat-rate, uses your ChatGPT Plus / Pro plan | `codex` CLI installed and logged in | ⚠️ Leader only |
 
@@ -900,18 +916,49 @@ agent:
   # Pay-per-token (needs API key):
   provider: "anthropic" # or "openai"
   model: "claude-sonnet-4-6" # or "codex-5.3"
+  base_url: "" # optional compatible endpoint override
+  api_key_env: "" # optional custom key env var name
+  auth_token_env: "" # optional custom bearer token env var
 
   # Flat-rate subscription (needs CLI login instead of API key):
   # provider: "claude_cli" # or "codex_cli"
 ```
 
+Compatible API examples
+(illustrative only in this repo — these endpoint/model combinations have not
+been live-smoke-tested here):
+```yaml
+# Qwen / DashScope
+agent:
+  provider: "openai"
+  model: "qwen-plus"
+  base_url: "https://dashscope.aliyuncs.com/compatible-mode/v1"
+  api_key_env: "DASHSCOPE_API_KEY"
+
+# GLM / BigModel
+agent:
+  provider: "openai"
+  model: "glm-4.5"
+  base_url: "https://open.bigmodel.cn/api/paas/v4"
+  api_key_env: "ZHIPUAI_API_KEY"
+
+# MiniMax via OpenAI-compatible endpoint
+agent:
+  provider: "openai"
+  model: "MiniMax-M1"
+  base_url: "https://api.minimaxi.com/v1"
+  api_key_env: "MINIMAX_API_KEY"
+```
+
 Or set via environment (API-key modes only):
 ```bash
-# For API-key "anthropic" provider:
+# For Anthropic-compatible provider:
 export ANTHROPIC_API_KEY="sk-ant-xxxxx"
+export ANTHROPIC_BASE_URL="https://your-anthropic-compatible-endpoint"
 
-# For API-key "openai" provider:
+# For OpenAI-compatible provider:
 export OPENAI_API_KEY="sk-xxxxx"
+export OPENAI_BASE_URL="https://your-openai-compatible-endpoint/v1"
 
 # For subscription providers (claude_cli / codex_cli): no env var — just
 # install the CLI once and run `claude` or `codex login` to sign in.
@@ -960,6 +1007,9 @@ execution:
 agent:
   provider: "anthropic" # "anthropic" or "openai"
   model: "claude-sonnet-4-6" # See model table above
+  base_url: "" # Optional compatible API endpoint override
+  api_key_env: "" # Optional custom API key env var
+  auth_token_env: "" # Optional custom bearer token env var
   max_cycles: -1 # -1 = run forever
   max_steps_per_cycle: 3 # Max worker dispatches per cycle
   cooldown_interval: 300 # Smart cooldown polling (seconds)
@@ -1013,7 +1063,7 @@ auto-deep-researcher-24x7/
 │ ├── monitor.py # Zero-LLM experiment monitoring
 │ ├── agents.py # Leader-Worker agent dispatch
 │ └── tools.py # Minimal per-agent tool registry
-├── skills/ # Claude Code slash commands (python install.py)
+├── skills/ # Source skills for Claude slash commands + Codex local skills
 │ ├── auto-experiment/ # 24/7 autonomous experiment loop
 │ ├── experiment-status/ # Check experiment progress
 │ ├── gpu-monitor/ # GPU status & availability
@@ -1031,7 +1081,7 @@ auto-deep-researcher-24x7/
 │ └── keeper.py # Cloud instance keep-alive
 ├── examples/ # Ready-to-run demos
 ├── docs/ # Docs + translations (CN/JP)
-├── install.py # Claude Code skill installer
+├── install.py # Claude + Codex skill installer
 ├── config.yaml # Default configuration
 └── requirements.txt # Dependencies
 ```
diff --git a/config.yaml b/config.yaml
index 5b9f924..8706714 100644
--- a/config.yaml
+++ b/config.yaml
@@ -18,8 +18,8 @@ execution:
 
 agent:
   # Provider:
-  # "anthropic" — SDK, per-token API billing (ANTHROPIC_API_KEY)
-  # "openai" — SDK, per-token API billing (OPENAI_API_KEY)
+  # "anthropic" — Anthropic-compatible SDK endpoint (default auth env: ANTHROPIC_API_KEY)
+  # "openai" — OpenAI-compatible SDK endpoint (default auth env: OPENAI_API_KEY)
   # "claude_cli" — subprocess, reuses Claude Code / Pro / Max subscription
   # "codex_cli" — subprocess, reuses ChatGPT Plus / Pro subscription
   #
@@ -40,9 +40,13 @@ agent:
   # Model selection per provider:
   # anthropic: claude-sonnet-4-6 (fast) | claude-opus-4-6 (strongest)
   # openai: codex-5.3 (fast) | gpt-5.4 (strongest)
+  # or any OpenAI-compatible model id such as qwen / glm / minimax
   # claude_cli: whichever model your `claude` CLI is configured to use
   # codex_cli: whichever model your `codex` CLI is configured to use
   model: "claude-sonnet-4-6"
+  base_url: "" # Optional compatible API endpoint override
+  api_key_env: "" # Optional custom key env, e.g. DASHSCOPE_API_KEY
+  auth_token_env: "" # Optional bearer-token env for anthropic-compatible endpoints
 
   max_cycles: -1 # -1 = run forever
   max_steps_per_cycle: 3 # Max worker dispatches per cycle
diff --git a/core/agents.py b/core/agents.py
index 11f38aa..0129d4d 100644
--- a/core/agents.py
+++ b/core/agents.py
@@ -19,6 +19,7 @@
 
 import json
 import logging
+import os
 import re
 from pathlib import Path
 from typing import Optional
@@ -77,13 +78,23 @@ class AgentDispatcher:
     }
 
     # Supported providers:
-    # "anthropic" — Anthropic SDK, per-token API billing (needs ANTHROPIC_API_KEY)
-    # "openai" — OpenAI SDK, per-token API billing (needs OPENAI_API_KEY)
+    # "anthropic" — Anthropic-compatible SDK endpoint (default auth env: ANTHROPIC_API_KEY)
+    # "openai" — OpenAI-compatible SDK endpoint (default auth env: OPENAI_API_KEY)
     # "claude_cli" — `claude -p` subprocess, uses Claude Code / Pro / Max subscription
     # "codex_cli" — `codex exec` subprocess, uses ChatGPT Plus / Pro subscription
     SUPPORTED_PROVIDERS = ("anthropic", "openai", "claude_cli", "codex_cli")
 
-    def __init__(self, model: str = "claude-sonnet-4-6", provider: str = "anthropic", max_steps: int = 3):
+    def __init__(
+        self,
+        model: str = "claude-sonnet-4-6",
+        provider: str = "anthropic",
+        max_steps: int = 3,
+        base_url: Optional[str] = None,
+        api_key: Optional[str] = None,
+        api_key_env: str = "",
+        auth_token: Optional[str] = None,
+        auth_token_env: str = "",
+    ):
         if provider not in self.SUPPORTED_PROVIDERS:
             raise ValueError(
                 f"Unknown provider '{provider}'. Supported: {self.SUPPORTED_PROVIDERS}"
@@ -91,8 +102,18 @@ def __init__(self, model: str = "claude-sonnet-4-6", provider: str = "anthropic"
         self.model = model
         self.provider = provider
         self.max_steps = max_steps
+        self.base_url = (base_url or "").strip() or None
+        self.api_key = api_key or self._resolve_secret(api_key_env)
+        self.auth_token = auth_token or self._resolve_secret(auth_token_env)
         self._leader_history = []
 
+    @staticmethod
+    def _resolve_secret(env_name: str) -> Optional[str]:
+        env_name = (env_name or "").strip()
+        if not env_name:
+            return None
+        return os.environ.get(env_name)
+
     def dispatch_leader(self, task: str, context: dict) -> dict:
         """Send a task to the Leader agent.
 
@@ -308,8 +329,8 @@ def _render_tools_section(tool_defs: list[dict]) -> str:
     def _call_llm(self, system: str, messages: list) -> str:
         """Call the LLM. Four providers are supported.
 
-        - "anthropic": Claude SDK, per-token API billing
-        - "openai": OpenAI SDK, per-token API billing
+        - "anthropic": Anthropic-compatible SDK endpoint, per-token API billing
+        - "openai": OpenAI-compatible SDK endpoint, per-token API billing
         - "claude_cli": `claude -p` subprocess, uses Claude Code / Pro / Max subscription
         - "codex_cli": `codex exec` subprocess, uses ChatGPT Plus / Pro subscription
 
@@ -328,11 +349,18 @@ def _call_llm(self, system: str, messages: list) -> str:
         return self._call_anthropic(system, messages)
 
     def _call_anthropic(self, system: str, messages: list) -> str:
-        """Call Anthropic Claude API."""
+        """Call an Anthropic-compatible Messages API."""
         try:
             import anthropic
 
-            client = anthropic.Anthropic()
+            client_kwargs = {}
+            if self.base_url:
+                client_kwargs["base_url"] = self.base_url
+            if self.api_key:
+                client_kwargs["api_key"] = self.api_key
+            if self.auth_token:
+                client_kwargs["auth_token"] = self.auth_token
+            client = anthropic.Anthropic(**client_kwargs)
 
             api_messages = []
             for msg in messages:
@@ -356,11 +384,16 @@ def _call_anthropic(self, system: str, messages: list) -> str:
             return self._call_openai(system, messages)
 
     def _call_openai(self, system: str, messages: list) -> str:
-        """Call OpenAI API (Codex 5.3 / GPT 5.4)."""
+        """Call an OpenAI-compatible chat completions API."""
         try:
             import openai
 
-            client = openai.OpenAI()
+            client_kwargs = {}
+            if self.base_url:
+                client_kwargs["base_url"] = self.base_url
+            if self.api_key:
+                client_kwargs["api_key"] = self.api_key
+            client = openai.OpenAI(**client_kwargs)
 
             # Map model name if it's an Anthropic model name
             model = self.MODEL_MAP.get(self.model, self.model) if self.provider != "openai" else self.model
diff --git a/core/loop.py b/core/loop.py
index 09cf5b3..c0ddae1 100644
--- a/core/loop.py
+++ b/core/loop.py
@@ -55,10 +55,14 @@ def __init__(self, config: dict, project_dir: str):
             zero_llm=config.get("monitor", {}).get("zero_llm", True),
             backend=self.execution_backend,
         )
+        agent_config = config.get("agent", {}) or {}
         self.dispatcher = AgentDispatcher(
-            model=config.get("agent", {}).get("model", "claude-sonnet-4-6"),
-            provider=config.get("agent", {}).get("provider", "anthropic"),
-            max_steps=config.get("agent", {}).get("max_steps_per_cycle", 3),
+            model=agent_config.get("model", "claude-sonnet-4-6"),
+            provider=agent_config.get("provider", "anthropic"),
+            max_steps=agent_config.get("max_steps_per_cycle", 3),
+            base_url=agent_config.get("base_url", ""),
+            api_key_env=agent_config.get("api_key_env", ""),
+            auth_token_env=agent_config.get("auth_token_env", ""),
         )
         self.tools = ToolRegistry(self.execution_backend)
         self.obsidian = ObsidianExporter(
@@ -69,9 +73,9 @@ def __init__(self, config: dict, project_dir: str):
 
         # State
         self.cycle_count = self._load_cycle_counter()
-        self.max_cycles = config.get("agent", {}).get("max_cycles", -1)
-        self.cooldown = config.get("agent", {}).get("cooldown_interval", 300)
-        self.no_progress_fallback_threshold = config.get("agent", {}).get("no_progress_fallback_threshold", 3)
+        self.max_cycles = agent_config.get("max_cycles", -1)
+        self.cooldown = agent_config.get("cooldown_interval", 300)
+        self.no_progress_fallback_threshold = agent_config.get("no_progress_fallback_threshold", 3)
         self._running = True
         self._no_progress_streak = 0
         self._last_no_progress_signature = ""
diff --git a/install.py b/install.py
index 5155fb6..6066358 100644
--- a/install.py
+++ b/install.py
@@ -1,76 +1,157 @@
 """
-Install Deep Researcher Agent skills into Claude Code.
+Install Deep Researcher Agent integrations into Claude Code and Codex.
 
 One-command setup:
     python install.py
 
-After installation, these slash commands are available in Claude Code:
-    /auto-experiment — Launch 24/7 autonomous experiment loop
-    /experiment-status — Check running experiment status
-    /gpu-monitor — GPU status and availability
-    /daily-papers — Daily arXiv paper recommendations
-    /paper-analyze — Deep paper analysis with figure extraction
-    /conf-search — Search top conference papers
-    /progress-report — Generate structured progress report
-    /obsidian-sync — Refresh Obsidian dashboard and daily notes
+After installation:
+    - Claude Code slash commands are copied into ~/.claude/commands
+    - Codex local skills are copied into ~/.codex/skills
 """
 
+import re
 import shutil
 import sys
 from pathlib import Path
 
+import yaml
+
 CLAUDE_DIR = Path.home() / ".claude"
+CODEX_DIR = Path.home() / ".codex"
 REPO_DIR = Path(__file__).parent
 SKILLS_SOURCE = REPO_DIR / "skills"
 CORE_SOURCE = REPO_DIR / "core"
 GPU_SOURCE = REPO_DIR / "gpu"
+CODEX_ALLOWED_FRONTMATTER = {"name", "description", "license", "allowed-tools", "metadata"}
+CODEX_INSTALL_MARKER = ".deep-researcher-installed"
+
+
+def _iter_skill_dirs(skills_source: Path):
+    for skill_dir in sorted(skills_source.iterdir()):
+        if skill_dir.is_dir() and (skill_dir / "SKILL.md").exists():
+            yield skill_dir
+
 
+def _sync_python_modules(source_dir: Path, dest_dir: Path):
+    dest_dir.mkdir(parents=True, exist_ok=True)
+    if source_dir.exists():
+        for py_file in source_dir.glob("*.py"):
+            shutil.copy2(py_file, dest_dir / py_file.name)
 
-def install():
+
+def _install_runtime_bundle(home_dir: Path, repo_dir: Path):
+    bundle_dir = home_dir / "deep-researcher"
+    _sync_python_modules(repo_dir / "core", bundle_dir / "core")
+    _sync_python_modules(repo_dir / "gpu", bundle_dir / "gpu")
+
+    config_src = repo_dir / "config.yaml"
+    config_dest = bundle_dir / "config.yaml"
+    if config_src.exists() and not config_dest.exists():
+        config_dest.parent.mkdir(parents=True, exist_ok=True)
+        shutil.copy2(config_src, config_dest)
+
+
+def _check_codex_conflicts(skills_source: Path, codex_dir: Path):
+    codex_skills_dir = codex_dir / "skills"
+    for skill_dir in _iter_skill_dirs(skills_source):
+        dest_dir = codex_skills_dir / skill_dir.name
+        if dest_dir.exists():
+            marker = dest_dir / CODEX_INSTALL_MARKER
+            if not marker.exists():
+                raise RuntimeError(
+                    f"Refusing to overwrite existing Codex skill '{skill_dir.name}' "
+                    f"at {dest_dir}; marker file not found."
+                )
+
+
+def _parse_frontmatter(skill_text: str):
+    match = re.match(r"^---\n(.*?)\n---\n?(.*)$", skill_text, re.DOTALL)
+    if not match:
+        raise ValueError("Skill file must start with YAML frontmatter")
+    frontmatter = yaml.safe_load(match.group(1)) or {}
+    if not isinstance(frontmatter, dict):
+        raise ValueError("Skill frontmatter must be a YAML dictionary")
+    body = match.group(2)
+    return frontmatter, body
+
+
+def _build_codex_skill_text(skill_text: str) -> str:
+    frontmatter, body = _parse_frontmatter(skill_text)
+    filtered_frontmatter = {
+        key: value
+        for key, value in frontmatter.items()
+        if key in CODEX_ALLOWED_FRONTMATTER
+    }
+    skill_name = str(filtered_frontmatter.get("name", "")).strip()
+    codex_note = (
+        f"> Codex note: invoke explicitly as `${skill_name}` when needed. "
+        f"The original repo docs may also show `/{skill_name}` because the same "
+        "source skill powers Claude Code slash commands.\n\n"
+    )
+    rendered_frontmatter = yaml.safe_dump(
+        filtered_frontmatter,
+        sort_keys=False,
+        allow_unicode=True,
+    ).strip()
+    return f"---\n{rendered_frontmatter}\n---\n\n{codex_note}{body.lstrip()}"
+
+
+def _install_claude_commands(skills_source: Path, claude_dir: Path) -> int:
+    claude_commands = claude_dir / "commands"
+    claude_commands.mkdir(parents=True, exist_ok=True)
+
+    installed = 0
+    for skill_dir in _iter_skill_dirs(skills_source):
+        dest = claude_commands / f"{skill_dir.name}.md"
+        shutil.copy2(skill_dir / "SKILL.md", dest)
+        print(f" ✓ Claude /{skill_dir.name}")
+        installed += 1
+    return installed
+
+
+def _install_codex_skills(skills_source: Path, codex_dir: Path) -> int:
+    codex_skills_dir = codex_dir / "skills"
+    codex_skills_dir.mkdir(parents=True, exist_ok=True)
+
+    installed = 0
+    for skill_dir in _iter_skill_dirs(skills_source):
+        dest_dir = codex_skills_dir / skill_dir.name
+        if dest_dir.exists():
+            shutil.rmtree(dest_dir)
+        shutil.copytree(skill_dir, dest_dir)
+        skill_text = (skill_dir / "SKILL.md").read_text()
+        (dest_dir / "SKILL.md").write_text(_build_codex_skill_text(skill_text))
+        (dest_dir / CODEX_INSTALL_MARKER).write_text("installed by Deep Researcher Agent\n")
+        print(f" ✓ Codex ${skill_dir.name}")
+        installed += 1
+    return installed
+
+
+def install(
+    claude_dir: Path = CLAUDE_DIR,
+    codex_dir: Path = CODEX_DIR,
+    repo_dir: Path = REPO_DIR,
+):
     print()
     print(" Deep Researcher Agent — Installer")
     print(" " + "=" * 40)
     print()
 
-    # 1. Install skills as Claude Code slash commands
-    claude_commands = CLAUDE_DIR / "commands"
-    claude_commands.mkdir(parents=True, exist_ok=True)
-
-    installed = 0
-    for skill_dir in sorted(SKILLS_SOURCE.iterdir()):
-        if skill_dir.is_dir():
-            skill_file = skill_dir / "SKILL.md"
-            if skill_file.exists():
-                dest = claude_commands / f"{skill_dir.name}.md"
-                shutil.copy2(skill_file, dest)
-                print(f" ✓ /{skill_dir.name}")
-                installed += 1
-
-    # 2. Install core module (for programmatic use)
-    core_dest = CLAUDE_DIR / "deep-researcher" / "core"
-    core_dest.mkdir(parents=True, exist_ok=True)
-    if CORE_SOURCE.exists():
-        for py_file in CORE_SOURCE.glob("*.py"):
-            shutil.copy2(py_file, core_dest / py_file.name)
-
-    gpu_dest = CLAUDE_DIR / "deep-researcher" / "gpu"
-    gpu_dest.mkdir(parents=True, exist_ok=True)
-    if GPU_SOURCE.exists():
-        for py_file in GPU_SOURCE.glob("*.py"):
-            shutil.copy2(py_file, gpu_dest / py_file.name)
-
-    # 3. Copy default config
-    config_src = REPO_DIR / "config.yaml"
-    config_dest = CLAUDE_DIR / "deep-researcher" / "config.yaml"
-    if config_src.exists() and not config_dest.exists():
-        shutil.copy2(config_src, config_dest)
+    skills_source = repo_dir / "skills"
+    _check_codex_conflicts(skills_source, codex_dir)
+    claude_count = _install_claude_commands(skills_source, claude_dir)
+    codex_count = _install_codex_skills(skills_source, codex_dir)
+    _install_runtime_bundle(claude_dir, repo_dir)
+    _install_runtime_bundle(codex_dir, repo_dir)
 
-    # Summary
     print()
-    print(f" Done! {installed} skills installed.")
+    print(
+        " Done! "
+        f"{claude_count} Claude commands and {codex_count} Codex skills installed."
+    )
     print()
-    print(" Available commands in Claude Code:")
+    print(" Available in Claude Code:")
     print(" ─────────────────────────────────────")
     print(" /auto-experiment Launch 24/7 experiment loop")
     print(" /experiment-status Check experiment progress")
@@ -81,30 +162,62 @@ def install():
     print(" /progress-report Generate progress report")
     print(" /obsidian-sync Refresh Obsidian notes")
     print()
+    print(" Available in Codex:")
+    print(" ─────────────────────────────────────")
+    print(" $auto-experiment Launch 24/7 experiment loop")
+    print(" $experiment-status Check experiment progress")
+    print(" $gpu-monitor GPU status & availability")
+    print(" $daily-papers arXiv paper recommendations")
+    print(" $paper-analyze Deep paper analysis")
+    print(" $conf-search Conference paper search")
+    print(" $progress-report Generate progress report")
+    print(" $obsidian-sync Refresh Obsidian notes")
+    print()
     print(" Quick start:")
     print(" 1. Create a project with PROJECT_BRIEF.md")
-    print(" 2. Run: /auto-experiment --project --gpu 0")
+    print(" 2. Claude: /auto-experiment --project --gpu 0")
+    print(" 3. Codex: use $auto-experiment for the same workflow")
+    print()
+    print(" Restart Codex to pick up newly installed local skills.")
     print()
 
 
-def uninstall():
+def uninstall(
+    claude_dir: Path = CLAUDE_DIR,
+    codex_dir: Path = CODEX_DIR,
+    repo_dir: Path = REPO_DIR,
+):
     """Remove all installed skills."""
-    claude_commands = CLAUDE_DIR / "commands"
-    removed = 0
-    for skill_dir in sorted(SKILLS_SOURCE.iterdir()):
-        if skill_dir.is_dir():
-            dest = claude_commands / f"{skill_dir.name}.md"
-            if dest.exists():
-                dest.unlink()
-                print(f" ✗ /{skill_dir.name}")
-                removed += 1
-
-    deep_dir = CLAUDE_DIR / "deep-researcher"
-    if deep_dir.exists():
-        shutil.rmtree(deep_dir)
-        print(" ✗ core modules")
-
-    print(f"\n Removed {removed} skills.")
+    removed_claude = 0
+    claude_commands = claude_dir / "commands"
+    for skill_dir in _iter_skill_dirs(repo_dir / "skills"):
+        dest = claude_commands / f"{skill_dir.name}.md"
+        if dest.exists():
+            dest.unlink()
+            print(f" ✗ Claude /{skill_dir.name}")
+            removed_claude += 1
+
+    removed_codex = 0
+    codex_skills = codex_dir / "skills"
+    for skill_dir in _iter_skill_dirs(repo_dir / "skills"):
+        dest_dir = codex_skills / skill_dir.name
+        if dest_dir.exists():
+            marker = dest_dir / CODEX_INSTALL_MARKER
+            if marker.exists():
+                shutil.rmtree(dest_dir)
+                print(f" ✗ Codex ${skill_dir.name}")
+                removed_codex += 1
+
+    for home_dir, label in ((claude_dir, "Claude"), (codex_dir, "Codex")):
+        deep_dir = home_dir / "deep-researcher"
+        if deep_dir.exists():
+            shutil.rmtree(deep_dir)
+            print(f" ✗ {label} runtime bundle")
+
+    print(
+        f"\n Removed {removed_claude} Claude commands and "
+        f"{removed_codex} Codex skills."
+    )
 
 
 if __name__ == "__main__":
diff --git a/skills/auto-experiment/SKILL.md b/skills/auto-experiment/SKILL.md
index 4582ba8..e85cef0 100644
--- a/skills/auto-experiment/SKILL.md
+++ b/skills/auto-experiment/SKILL.md
@@ -1,10 +1,9 @@
 ---
 name: auto-experiment
 description: "Launch an autonomous THINK→EXECUTE→REFLECT experiment loop on a GPU project"
-argument-hint: "[--project ] [--gpu ] [--max-cycles ]"
 ---
 
-# /auto-experiment
+# auto-experiment
 
 Launch an autonomous experiment agent that runs your deep learning experiments 24/7.
 
@@ -24,9 +23,10 @@ This skill starts a **THINK → EXECUTE → REFLECT** loop that:
 ## Usage
 
 ```
-/auto-experiment
-/auto-experiment --project /path/to/my_project --gpu 0
-/auto-experiment --project . --max-cycles 5
+Claude Code: /auto-experiment
+Claude Code: /auto-experiment --project /path/to/my_project --gpu 0
+Claude Code: /auto-experiment --project . --max-cycles 5
+Codex: $auto-experiment
 ```
 
 ## Prerequisites
@@ -59,7 +59,11 @@ Override default agent settings:
 
 ```yaml
 agent:
+  provider: "anthropic" # or "openai" / "claude_cli" / "codex_cli"
   model: "claude-sonnet-4-6"
+  base_url: "" # optional compatible endpoint override
+  api_key_env: "" # optional custom key env var
+  auth_token_env: "" # optional custom bearer token env var
   max_cycles: -1 # -1 = unlimited
   max_steps_per_cycle: 3 # max sub-agent dispatches per cycle
   cooldown_interval: 300 # 5 min smart polling
@@ -76,6 +80,10 @@ experiment:
   mandatory_dry_run: true
 ```
 
+If the user wants a compatible API endpoint instead of the official Anthropic
+or OpenAI API, keep the same `provider` values and set `base_url` plus a custom
+`api_key_env`. Do not invent provider names like `qwen` or `glm`.
+
 Optional remote execution over SSH:
 
 ```yaml
diff --git a/skills/auto-experiment/agents/openai.yaml b/skills/auto-experiment/agents/openai.yaml
new file mode 100644
index 0000000..a22dc26
--- /dev/null
+++ b/skills/auto-experiment/agents/openai.yaml
@@ -0,0 +1,4 @@
+interface:
+  display_name: "Auto Experiment"
+  short_description: "Launch a 24/7 autonomous experiment loop"
+  default_prompt: "Use $auto-experiment to launch or resume a Deep Researcher experiment loop for this project."
diff --git a/skills/conf-search/SKILL.md b/skills/conf-search/SKILL.md
index 01897ed..9a11fb7 100644
--- a/skills/conf-search/SKILL.md
+++ b/skills/conf-search/SKILL.md
@@ -3,15 +3,16 @@ name: conf-search
 description: "Search papers from top AI/ML conferences"
 ---
 
-# /conf-search
+# conf-search
 
 Search for papers from top venues.
 
 ## Usage
 
 ```
-/conf-search --venue CVPR2025 --query "gesture generation"
-/conf-search --venue NeurIPS2025 --query "diffusion models"
+Claude Code: /conf-search --venue CVPR2025 --query "gesture generation"
+Claude Code: /conf-search --venue NeurIPS2025 --query "diffusion models"
+Codex: $conf-search
 ```
 
 ## Supported Venues
diff --git a/skills/conf-search/agents/openai.yaml b/skills/conf-search/agents/openai.yaml
new file mode 100644
index 0000000..8597f3d
--- /dev/null
+++ b/skills/conf-search/agents/openai.yaml
@@ -0,0 +1,4 @@
+interface:
+  display_name: "Conference Search"
+  short_description: "Search top-conference papers by venue and query"
+  default_prompt: "Use $conf-search to search papers from top conferences by venue and query."
diff --git a/skills/daily-papers/SKILL.md b/skills/daily-papers/SKILL.md
index 48b52ef..67e39e2 100644
--- a/skills/daily-papers/SKILL.md
+++ b/skills/daily-papers/SKILL.md
@@ -3,10 +3,12 @@ name: daily-papers
 description: "Daily arXiv paper recommendations with automatic deduplication"
 ---
 
-# /daily-papers
+# daily-papers
 
 Search arXiv for the latest papers relevant to the user's research interests.
+Invoke as `/daily-papers` in Claude Code or `$daily-papers` in Codex. + ## Behavior 1. Ask the user for topics if not provided (or use defaults from config) diff --git a/skills/daily-papers/agents/openai.yaml b/skills/daily-papers/agents/openai.yaml new file mode 100644 index 0000000..50b3e85 --- /dev/null +++ b/skills/daily-papers/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "Daily Papers" + short_description: "Fetch daily arXiv recommendations and deduplicate them" + default_prompt: "Use $daily-papers to generate daily arXiv recommendations relevant to the current research direction." diff --git a/skills/experiment-status/SKILL.md b/skills/experiment-status/SKILL.md index 04b98d6..7b07938 100644 --- a/skills/experiment-status/SKILL.md +++ b/skills/experiment-status/SKILL.md @@ -3,15 +3,16 @@ name: experiment-status description: "Check status of running autonomous experiment loops" --- -# /experiment-status +# experiment-status Check the current status of your autonomous experiment agent. ## Usage ``` -/experiment-status -/experiment-status --project /path/to/project +Claude Code: /experiment-status +Claude Code: /experiment-status --project /path/to/project +Codex: $experiment-status ``` ## Behavior diff --git a/skills/experiment-status/agents/openai.yaml b/skills/experiment-status/agents/openai.yaml new file mode 100644 index 0000000..f5c6ca3 --- /dev/null +++ b/skills/experiment-status/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "Experiment Status" + short_description: "Inspect experiment progress, logs, and GPU status" + default_prompt: "Use $experiment-status to inspect the current Deep Researcher loop, training logs, and GPU status." 
diff --git a/skills/gpu-monitor/SKILL.md b/skills/gpu-monitor/SKILL.md index c3bee49..c8a9d07 100644 --- a/skills/gpu-monitor/SKILL.md +++ b/skills/gpu-monitor/SKILL.md @@ -3,15 +3,16 @@ name: gpu-monitor description: "Check GPU status, running experiments, and available resources" --- -# /gpu-monitor +# gpu-monitor Quick GPU status check for experiment management. ## Usage ``` -/gpu-monitor -/gpu-monitor --server user@remote-host +Claude Code: /gpu-monitor +Claude Code: /gpu-monitor --server user@remote-host +Codex: $gpu-monitor ``` ## Behavior diff --git a/skills/gpu-monitor/agents/openai.yaml b/skills/gpu-monitor/agents/openai.yaml new file mode 100644 index 0000000..0aeaf56 --- /dev/null +++ b/skills/gpu-monitor/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "GPU Monitor" + short_description: "Check GPU usage, free devices, and running jobs" + default_prompt: "Use $gpu-monitor to check current GPU usage, free devices, and running processes." diff --git a/skills/obsidian-sync/SKILL.md b/skills/obsidian-sync/SKILL.md index 2472293..58a2adf 100644 --- a/skills/obsidian-sync/SKILL.md +++ b/skills/obsidian-sync/SKILL.md @@ -1,19 +1,19 @@ --- name: obsidian-sync description: "Refresh Obsidian dashboard and daily notes from current experiment state" -argument-hint: "[--project ] [--dashboard-only] [--daily-only]" --- -# /obsidian-sync +# obsidian-sync Refresh progress notes for a Deep Researcher project. 
## Usage ```bash -/obsidian-sync --project /path/to/project -/obsidian-sync --project /path/to/project --dashboard-only -/obsidian-sync --project /path/to/project --daily-only +Claude Code: /obsidian-sync --project /path/to/project +Claude Code: /obsidian-sync --project /path/to/project --dashboard-only +Claude Code: /obsidian-sync --project /path/to/project --daily-only +Codex: $obsidian-sync ``` ## Behavior diff --git a/skills/obsidian-sync/agents/openai.yaml b/skills/obsidian-sync/agents/openai.yaml new file mode 100644 index 0000000..dbd4d3a --- /dev/null +++ b/skills/obsidian-sync/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "Obsidian Sync" + short_description: "Refresh the Obsidian dashboard and daily notes" + default_prompt: "Use $obsidian-sync to refresh the Obsidian dashboard or local progress notes for this project." diff --git a/skills/paper-analyze/SKILL.md b/skills/paper-analyze/SKILL.md index 7855883..6bc43c7 100644 --- a/skills/paper-analyze/SKILL.md +++ b/skills/paper-analyze/SKILL.md @@ -3,14 +3,15 @@ name: paper-analyze description: "Deep analysis of a single paper with figure extraction from arXiv source" --- -# /paper-analyze +# paper-analyze Perform deep analysis of a single academic paper. ## Usage ``` -/paper-analyze +Claude Code: /paper-analyze +Codex: $paper-analyze ``` ## Behavior diff --git a/skills/paper-analyze/agents/openai.yaml b/skills/paper-analyze/agents/openai.yaml new file mode 100644 index 0000000..70183b0 --- /dev/null +++ b/skills/paper-analyze/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "Paper Analyze" + short_description: "Analyze one paper deeply with figure extraction" + default_prompt: "Use $paper-analyze to deeply analyze a paper and extract its figures when possible." 
diff --git a/skills/progress-report/SKILL.md b/skills/progress-report/SKILL.md index d887704..d9a4c48 100644 --- a/skills/progress-report/SKILL.md +++ b/skills/progress-report/SKILL.md @@ -3,10 +3,12 @@ name: progress-report description: "Generate structured research progress reports" --- -# /progress-report +# progress-report Generate a structured progress report for the current research project. +Invoke as `/progress-report` in Claude Code or `$progress-report` in Codex. + ## Behavior 1. Read the project's MEMORY_LOG.md for milestones and decisions diff --git a/skills/progress-report/agents/openai.yaml b/skills/progress-report/agents/openai.yaml new file mode 100644 index 0000000..f08d219 --- /dev/null +++ b/skills/progress-report/agents/openai.yaml @@ -0,0 +1,4 @@ +interface: + display_name: "Progress Report" + short_description: "Write a structured report from recent experiments" + default_prompt: "Use $progress-report to write a structured research progress report from recent experiment history." diff --git a/tests/test_install.py b/tests/test_install.py new file mode 100644 index 0000000..f34a4bf --- /dev/null +++ b/tests/test_install.py @@ -0,0 +1,110 @@ +import tempfile +import unittest +from pathlib import Path + +import install + + +class InstallHelpersTests(unittest.TestCase): + def test_build_codex_skill_text_strips_argument_hint(self): + source = """--- +name: auto-experiment +description: "Launch experiment loop" +argument-hint: "[--project ]" +--- + +# /auto-experiment + +Body text. 
+""" + rendered = install._build_codex_skill_text(source) + + self.assertNotIn("argument-hint", rendered) + self.assertIn("$auto-experiment", rendered) + self.assertIn("# /auto-experiment", rendered) + + def test_install_and_uninstall_cover_claude_and_codex(self): + with tempfile.TemporaryDirectory() as tmp: + root = Path(tmp) + repo = root / "repo" + claude_dir = root / ".claude" + codex_dir = root / ".codex" + + (repo / "skills" / "auto-experiment" / "agents").mkdir(parents=True) + (repo / "core").mkdir(parents=True) + (repo / "gpu").mkdir(parents=True) + + (repo / "skills" / "auto-experiment" / "SKILL.md").write_text( + """--- +name: auto-experiment +description: "Launch experiment loop" +argument-hint: "[--project ]" +--- + +# /auto-experiment + +Body text. +""" + ) + (repo / "skills" / "auto-experiment" / "agents" / "openai.yaml").write_text( + 'interface:\n display_name: "Auto Experiment"\n' + ) + (repo / "core" / "loop.py").write_text("print('core')\n") + (repo / "gpu" / "detect.py").write_text("print('gpu')\n") + (repo / "config.yaml").write_text("agent:\n provider: anthropic\n") + + install.install(claude_dir=claude_dir, codex_dir=codex_dir, repo_dir=repo) + + claude_skill = claude_dir / "commands" / "auto-experiment.md" + codex_skill = codex_dir / "skills" / "auto-experiment" / "SKILL.md" + codex_ui_meta = codex_dir / "skills" / "auto-experiment" / "agents" / "openai.yaml" + codex_runtime = codex_dir / "deep-researcher" / "core" / "loop.py" + + self.assertTrue(claude_skill.exists()) + self.assertIn("argument-hint", claude_skill.read_text()) + + self.assertTrue(codex_skill.exists()) + self.assertNotIn("argument-hint", codex_skill.read_text()) + self.assertIn("$auto-experiment", codex_skill.read_text()) + + self.assertTrue(codex_ui_meta.exists()) + self.assertTrue(codex_runtime.exists()) + + install.uninstall(claude_dir=claude_dir, codex_dir=codex_dir, repo_dir=repo) + + self.assertFalse(claude_skill.exists()) + self.assertFalse((codex_dir / "skills" / 
"auto-experiment").exists()) + self.assertFalse((codex_dir / "deep-researcher").exists()) + + def test_install_refuses_to_overwrite_unowned_codex_skill(self): + with tempfile.TemporaryDirectory() as tmp: + root = Path(tmp) + repo = root / "repo" + claude_dir = root / ".claude" + codex_dir = root / ".codex" + + (repo / "skills" / "auto-experiment").mkdir(parents=True) + (repo / "core").mkdir(parents=True) + (repo / "gpu").mkdir(parents=True) + (repo / "skills" / "auto-experiment" / "SKILL.md").write_text( + """--- +name: auto-experiment +description: "Launch experiment loop" +--- + +Body. +""" + ) + (repo / "config.yaml").write_text("agent:\n provider: anthropic\n") + foreign_skill = codex_dir / "skills" / "auto-experiment" + foreign_skill.mkdir(parents=True) + (foreign_skill / "SKILL.md").write_text("foreign\n") + + with self.assertRaises(RuntimeError): + install.install(claude_dir=claude_dir, codex_dir=codex_dir, repo_dir=repo) + + self.assertFalse((claude_dir / "commands" / "auto-experiment.md").exists()) + + +if __name__ == "__main__": + unittest.main() diff --git a/tests/test_provider_config.py b/tests/test_provider_config.py new file mode 100644 index 0000000..669df0f --- /dev/null +++ b/tests/test_provider_config.py @@ -0,0 +1,143 @@ +import os +import tempfile +import types +import unittest +from unittest.mock import MagicMock, patch + +from core.agents import AgentDispatcher +from core.loop import ResearchLoop + + +class _OpenAIResponse: + def __init__(self, content: str): + self.choices = [ + types.SimpleNamespace( + message=types.SimpleNamespace(content=content) + ) + ] + + +class _AnthropicResponse: + def __init__(self, content: str): + self.content = [types.SimpleNamespace(text=content)] + + +class CompatibleProviderConfigTests(unittest.TestCase): + def test_openai_compatible_provider_passes_base_url_and_custom_key_env(self): + create = MagicMock(return_value=_OpenAIResponse("qwen ok")) + client = types.SimpleNamespace( + chat=types.SimpleNamespace( + 
completions=types.SimpleNamespace(create=create) + ) + ) + ctor = MagicMock(return_value=client) + fake_openai = types.SimpleNamespace(OpenAI=ctor) + + with patch.dict("sys.modules", {"openai": fake_openai}): + with patch.dict(os.environ, {"DASHSCOPE_API_KEY": "secret-key"}, clear=False): + dispatcher = AgentDispatcher( + provider="openai", + model="qwen-plus", + base_url="https://dashscope.aliyuncs.com/compatible-mode/v1", + api_key_env="DASHSCOPE_API_KEY", + ) + result = dispatcher._call_openai( + "system prompt", + [{"role": "user", "content": "hello"}], + ) + + ctor.assert_called_once_with( + api_key="secret-key", + base_url="https://dashscope.aliyuncs.com/compatible-mode/v1", + ) + create.assert_called_once() + self.assertEqual(create.call_args.kwargs["model"], "qwen-plus") + self.assertEqual(result, "qwen ok") + + def test_anthropic_compatible_provider_passes_base_url_and_auth(self): + create = MagicMock(return_value=_AnthropicResponse("minimax ok")) + client = types.SimpleNamespace( + messages=types.SimpleNamespace(create=create) + ) + ctor = MagicMock(return_value=client) + fake_anthropic = types.SimpleNamespace(Anthropic=ctor) + + with patch.dict("sys.modules", {"anthropic": fake_anthropic}): + with patch.dict( + os.environ, + { + "MINIMAX_API_KEY": "secret-key", + "MINIMAX_AUTH_TOKEN": "secret-token", + }, + clear=False, + ): + dispatcher = AgentDispatcher( + provider="anthropic", + model="MiniMax-M2.1", + base_url="https://api.minimaxi.com/anthropic", + api_key_env="MINIMAX_API_KEY", + auth_token_env="MINIMAX_AUTH_TOKEN", + ) + result = dispatcher._call_anthropic( + "system prompt", + [{"role": "user", "content": "hello"}], + ) + + ctor.assert_called_once_with( + api_key="secret-key", + auth_token="secret-token", + base_url="https://api.minimaxi.com/anthropic", + ) + create.assert_called_once() + self.assertEqual(create.call_args.kwargs["model"], "MiniMax-M2.1") + self.assertEqual(result, "minimax ok") + + +class 
ResearchLoopProviderConfigTests(unittest.TestCase): + @patch("core.loop.AgentDispatcher") + @patch("core.loop.ToolRegistry") + @patch("core.loop.ObsidianExporter") + @patch("core.loop.ExperimentMonitor") + @patch("core.loop.MemoryManager") + @patch("core.loop.build_execution_backend") + def test_loop_passes_compatible_provider_config( + self, + build_backend_mock, + _memory_mock, + _monitor_mock, + _obsidian_mock, + _tool_registry_mock, + dispatcher_mock, + ): + backend = MagicMock() + build_backend_mock.return_value = backend + + with tempfile.TemporaryDirectory() as tmp: + ResearchLoop( + config={ + "project": {"workspace": "workspace"}, + "agent": { + "provider": "openai", + "model": "glm-4.5", + "base_url": "https://open.bigmodel.cn/api/paas/v4", + "api_key_env": "ZHIPUAI_API_KEY", + "auth_token_env": "", + "max_steps_per_cycle": 5, + }, + }, + project_dir=tmp, + ) + + dispatcher_mock.assert_called_once_with( + model="glm-4.5", + provider="openai", + max_steps=5, + base_url="https://open.bigmodel.cn/api/paas/v4", + api_key_env="ZHIPUAI_API_KEY", + auth_token_env="", + ) + backend.validate.assert_called_once_with() + + +if __name__ == "__main__": + unittest.main() diff --git a/tests/test_skills.py b/tests/test_skills.py new file mode 100644 index 0000000..ac4a8dc --- /dev/null +++ b/tests/test_skills.py @@ -0,0 +1,50 @@ +import re +import unittest +from pathlib import Path + +import yaml + + +REPO_ROOT = Path(__file__).resolve().parent.parent +ALLOWED_FRONTMATTER_KEYS = {"name", "description", "license", "allowed-tools", "metadata"} + + +class SkillValidationTests(unittest.TestCase): + def test_all_repo_skills_use_codex_compatible_frontmatter(self): + failures = [] + skills_dir = REPO_ROOT / "skills" + + for skill_dir in sorted(skills_dir.iterdir()): + if not skill_dir.is_dir(): + continue + skill_md = skill_dir / "SKILL.md" + if not skill_md.exists(): + failures.append(f"{skill_dir.name}: missing SKILL.md") + continue + + text = skill_md.read_text() + match = 
re.match(r"^---\n(.*?)\n---\n?", text, re.DOTALL) + if not match: + failures.append(f"{skill_dir.name}: invalid or missing YAML frontmatter") + continue + + frontmatter = yaml.safe_load(match.group(1)) or {} + if not isinstance(frontmatter, dict): + failures.append(f"{skill_dir.name}: frontmatter is not a YAML dictionary") + continue + + unexpected = sorted(set(frontmatter.keys()) - ALLOWED_FRONTMATTER_KEYS) + if unexpected: + failures.append( + f"{skill_dir.name}: unexpected keys {', '.join(unexpected)}" + ) + + self.assertEqual( + failures, + [], + msg="Repo skills must keep Codex-compatible frontmatter:\n" + "\n".join(failures), + ) + + +if __name__ == "__main__": + unittest.main()
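Reviewer note: the frontmatter check added in `tests/test_skills.py` can also be exercised outside the test suite. The following is a minimal sketch, not the repository's implementation. It reuses the regular expression and the `ALLOWED_FRONTMATTER_KEYS` set from the diff, but the helper name `unexpected_frontmatter_keys` is hypothetical, and it assumes a simplified top-level `key: value` scan in place of `yaml.safe_load` so that it runs with the standard library only.

```python
import re

# Allow-list copied from ALLOWED_FRONTMATTER_KEYS in tests/test_skills.py.
ALLOWED_FRONTMATTER_KEYS = {"name", "description", "license", "allowed-tools", "metadata"}

# Same pattern the test uses to locate the leading frontmatter block.
FRONTMATTER_RE = re.compile(r"^---\n(.*?)\n---\n?", re.DOTALL)


def unexpected_frontmatter_keys(skill_text: str) -> list[str]:
    """Return the sorted top-level frontmatter keys that are not allow-listed.

    Hypothetical helper: it scans top-level "key: value" lines instead of
    parsing YAML, so nested mappings are ignored rather than validated.
    """
    match = FRONTMATTER_RE.match(skill_text)
    if not match:
        raise ValueError("invalid or missing YAML frontmatter")

    keys = set()
    for line in match.group(1).splitlines():
        # Skip blank lines, comments, and indented (nested) lines.
        if not line.strip() or line.startswith((" ", "\t", "#")):
            continue
        key, sep, _ = line.partition(":")
        if sep:
            keys.add(key.strip())
    return sorted(keys - ALLOWED_FRONTMATTER_KEYS)


# Frontmatter shaped like the pre-change SKILL.md, which still carried argument-hint.
sample = (
    "---\n"
    "name: auto-experiment\n"
    'description: "Launch experiment loop"\n'
    'argument-hint: "[--project ]"\n'
    "---\n"
    "\n"
    "# auto-experiment\n"
)
print(unexpected_frontmatter_keys(sample))  # prints ['argument-hint']
```

A skill file that still carries `argument-hint` is reported here, which is the condition the new test rejects. The real test parses the block with PyYAML and additionally rejects frontmatter that is not a YAML dictionary, which this sketch does not attempt.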