# HyperAgents

## Docs

- [Working with Experiment Logs](https://mintlify.wiki/facebookresearch/HyperAgents/analysis/experiment-logs.md): How to extract the published experiment archives, understand the metadata and archive formats, read evaluation scores programmatically, and interpret the parent-selection logic.
- [Output Directory Structure](https://mintlify.wiki/facebookresearch/HyperAgents/analysis/outputs.md): Reference for the file and directory layout produced by a HyperAgents generate_loop run, including per-generation artifacts, evaluation outputs, and the run-level archive.
- [Analysis & Visualization](https://mintlify.wiki/facebookresearch/HyperAgents/analysis/visualization.md): Reference for HyperAgents' plotting utilities: progress curves, archive tree visualizations, multi-run comparison plots, and how to run them manually after a completed experiment.
- [AgentSystem](https://mintlify.wiki/facebookresearch/HyperAgents/api/agent-system.md): Abstract base class for all HyperAgents agents. Provides shared model configuration and thread-safe logging.
- [Docker Utilities](https://mintlify.wiki/facebookresearch/HyperAgents/api/docker-utils.md): Functions for building, managing, and interacting with Docker containers used to isolate agent evaluation runs.
- [Domain Utilities](https://mintlify.wiki/facebookresearch/HyperAgents/api/domain-utils.md): Helper functions for querying domain-specific evaluation configuration — splits, score keys, subset names, and staged evaluation parameters.
- [Generate Loop Utilities](https://mintlify.wiki/facebookresearch/HyperAgents/api/generate-loop.md): Core utilities for the evolutionary generate loop — archive management, parent selection, scoring, and container patch application.
- [LLM Interface](https://mintlify.wiki/facebookresearch/HyperAgents/api/llm.md): Core LLM API functions for making completion requests and running tool-augmented agent conversations.
- [MetaAgent](https://mintlify.wiki/facebookresearch/HyperAgents/api/meta-agent.md): A self-improving agent that modifies the HyperAgents codebase. Extends AgentSystem with full tool access.
- [TaskAgent](https://mintlify.wiki/facebookresearch/HyperAgents/api/task-agent.md): An agent that solves a single domain task and returns a structured JSON prediction. Loaded and run in parallel by the evaluation harness.
- [Agent Tools](https://mintlify.wiki/facebookresearch/HyperAgents/api/tools.md): The tool system that agents use to interact with the environment, including shell execution and file editing.
- [System Architecture](https://mintlify.wiki/facebookresearch/HyperAgents/concepts/architecture.md): How HyperAgents orchestrates a two-agent hierarchy, an evolutionary archive, and Docker sandboxes to iteratively self-improve.
- [Evolution Loop](https://mintlify.wiki/facebookresearch/HyperAgents/concepts/evolution-loop.md): Reference for generate_loop() — parameters, parent selection strategies, staged evaluation, archive management, and resume logic.
- [Meta-Agent](https://mintlify.wiki/facebookresearch/HyperAgents/concepts/meta-agent.md): Deep dive into MetaAgent — the self-improving agent that edits the HyperAgents codebase between generations.
- [Task-Agent](https://mintlify.wiki/facebookresearch/HyperAgents/concepts/task-agent.md): Deep dive into TaskAgent — the agent that solves domain tasks and whose code the meta-agent iteratively improves.
- [Baselines](https://mintlify.wiki/facebookresearch/HyperAgents/configuration/baselines.md): Reference for the comparison baselines available via --run_baseline in generate_loop.py.
- [Docker Configuration](https://mintlify.wiki/facebookresearch/HyperAgents/configuration/docker.md): Understand how HyperAgents builds and manages Docker containers for sandboxed agent generation and evaluation.
- [Environment Setup](https://mintlify.wiki/facebookresearch/HyperAgents/configuration/environment.md): Configure API keys, system dependencies, Python environment, and LLM models for HyperAgents.
- [Generate Loop Reference](https://mintlify.wiki/facebookresearch/HyperAgents/configuration/generate-loop.md): Complete CLI reference for generate_loop.py, the main entry point for running the HyperAgents self-improvement algorithm.
- [BALROG](https://mintlify.wiki/facebookresearch/HyperAgents/domains/balrog.md): Evaluate game-playing agents across four NetHack-family environments using the BALROG benchmark.
- [Genesis](https://mintlify.wiki/facebookresearch/HyperAgents/domains/genesis.md): Evaluate agents that write reward functions for RL-based robotic locomotion control of the Unitree Go2 quadruped.
- [IMO](https://mintlify.wiki/facebookresearch/HyperAgents/domains/imo.md): Two International Mathematical Olympiad benchmark domains — grading student answers and generating full proofs.
- [Domains Overview](https://mintlify.wiki/facebookresearch/HyperAgents/domains/overview.md): All benchmark domains supported by HyperAgents, their scoring metrics, dataset splits, and how to add a new domain.
- [Paper Review](https://mintlify.wiki/facebookresearch/HyperAgents/domains/paper-review.md): Evaluate an agent's ability to predict academic paper review outcomes (accept vs. reject) from the full paper text.
- [Polyglot](https://mintlify.wiki/facebookresearch/HyperAgents/domains/polyglot.md): SWE-bench-style multi-language coding benchmark that evaluates agents in per-instance Docker containers across six programming languages.
- [Search Arena](https://mintlify.wiki/facebookresearch/HyperAgents/domains/search-arena.md): Evaluate an agent's ability to judge which of two web-search responses better answers a user query.
- [Introduction](https://mintlify.wiki/facebookresearch/HyperAgents/introduction.md): HyperAgents: self-referential, self-improving agents that can optimize for any computable task.
- [Quickstart](https://mintlify.wiki/facebookresearch/HyperAgents/quickstart.md): Install HyperAgents, build the Docker sandbox, and run your first evolution loop.
- [Safety](https://mintlify.wiki/facebookresearch/HyperAgents/safety.md): Understand the risks of executing model-generated code and how to run HyperAgents safely.