perseus¶
LLM relay. Chains Ollama (local) → OllamaFreeAPI → Gemini → OpenAI-compat.
Pure httpx; no pydantic, no Rust dep — works on Python 3.15.
Source: llamaclaw/perseus.
Perseus — LLM relay for the llamaclaw ecosystem.
- perseus.agent_available() bool[source]¶
Return True when at least one live LLM provider is available.
- Returns:
True if a live provider is detected, False if only local fallback is available.
- Return type:
bool
Examples
>>> isinstance(agent_available(), bool)
True
- perseus.ask_percy(question: str, *, context: str | None = None, model: str | None = None, system_prompt: str = 'You are Perseus, the ESML agent for epidemiological semiparametric machine learning.\nHelp users understand datasets, methods, debugging steps, testing strategy, and interpretation.\nBe explicit about assumptions, limitations, missing data concerns, and reproducibility risks.\nDo not invent data access or approval status for restricted datasets.', allow_fallback: bool = True, stream: bool = False, use_agent: bool = True) dict[str, Any][source]¶
Query Perseus via the LLM provider chain.
When use_agent=True (default) and Ollama is available, Perseus uses the full agentic loop with 13 tools (search, execute, read/write, shell, data). Falls back to simple LLM chat or static text when tools are unavailable.
Returns a dict with mode, model, and either output_text (str) or output_stream (Iterator[str]).
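Because the result dict carries either output_text or output_stream depending on the stream flag, a small caller-side helper (illustrative only, not part of the package) can normalise both shapes:

```python
from typing import Any, Iterator


def collect_output(result: dict[str, Any]) -> str:
    """Assemble the full response text from an ask_percy()-style dict.

    Handles both the non-streaming shape (output_text: str) and the
    streaming shape (output_stream: Iterator[str]).
    """
    if "output_text" in result:
        return result["output_text"]
    chunks: Iterator[str] = result["output_stream"]
    return "".join(chunks)


# Hypothetical usage:
# text = collect_output(ask_percy("What is AIPW?", stream=True))
```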
- perseus.build_prompt(question: str, context: str | None = None) str[source]¶
Build a prompt from a user question and optional context.
Pure-httpx OllamaFreeAPI client — no pydantic/Rust dependencies.
Drop-in replacement for the ollamafreeapi pip package. Uses the same
bundled JSON model registry and the standard Ollama /api/generate
HTTP endpoint, but relies only on httpx (already a core dependency)
instead of the ollama SDK (which pulls in pydantic-core via Rust/PyO3).
This allows ESML to run on Python 3.15+ where PyO3 doesn’t yet have pre-built wheels.
- class perseus.fam.OllamaFreeAPI[source]¶
Bases: object
Lightweight client for free community Ollama servers.
Ollama-first LLM integration layer for the ESML package.
Provides a provider chain that attempts local Ollama inference first, then OllamaFreeAPI (free remote models, no API key), then Gemini (Google), then a generic OpenAI-compatible endpoint (e.g. Qwen via OpenRouter, GPT-OSS models via Together/Groq), then the official OpenAI API, and finally a local help-text fallback that requires no network access.
HTTP-based providers use httpx against OpenAI-compatible endpoints.
OllamaFreeAPI uses its own Python SDK (ollamafreeapi) for free remote
model access without any API key.
Environment Variables¶
- OLLAMA_BASE_URL : str
Base URL for a running Ollama instance. Default: http://localhost:11434.
- esmlfam : str
Override the OllamaFreeAPI model (esml free api model). Default: mistral-nemo:custom.
- GEMINI_API_KEY : str
Google AI Studio API key. Free-tier keys work for development. Model defaults to gemini-2.0-flash.
- GEMINI_MODEL : str
Override the Gemini model (e.g. gemini-1.5-pro). Optional.
- LLM_API_BASE_URL : str
Base URL for any OpenAI-compatible API (e.g., OpenRouter, Together, Groq). Use this to point at Qwen, Mistral, GPT-OSS, or any hosted model.
- LLM_API_KEY : str
API key for the endpoint at LLM_API_BASE_URL.
- OPENAI_API_KEY : str
API key for the official OpenAI API at https://api.openai.com.
- Provider priority (auto-detected at runtime):
Ollama — local, private, no API key needed
FreeAPI — OllamaFreeAPI, free remote models, no API key
Gemini — Google AI, generous free tier
API — generic OpenAI-compatible (Qwen, GPT-OSS, Groq, etc.)
OpenAI — official OpenAI API
local — static help text, no network required
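The priority order can be summarised in a toy sketch (this mirrors the documented chain only; the library's actual detection also probes servers over HTTP):

```python
import os


def pick_provider(env: dict[str, str],
                  ollama_up: bool = False,
                  freeapi_up: bool = False) -> str:
    """Toy mirror of the documented provider priority chain."""
    if ollama_up:                                    # local, private
        return "ollama"
    if freeapi_up:                                   # free remote, no key
        return "freeapi"
    if env.get("GEMINI_API_KEY"):
        return "gemini"
    if env.get("LLM_API_BASE_URL") and env.get("LLM_API_KEY"):
        return "api"
    if env.get("OPENAI_API_KEY"):
        return "openai"
    return "local"                                   # static help text


# e.g. pick_provider(dict(os.environ)) on a machine with only
# OPENAI_API_KEY set would return "openai".
```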
References
Ollama API docs: https://github.com/ollama/ollama/blob/main/docs/api.md
OllamaFreeAPI: https://pypi.org/project/ollamafreeapi/
Gemini OpenAI-compatible API: https://ai.google.dev/gemini-api/docs/openai
OpenAI Chat Completions API: https://platform.openai.com/docs/api-reference/chat
- perseus.llm.agent_available() bool[source]¶
Return True when at least one live LLM provider is available.
- Returns:
True if a live provider is detected, False if only local fallback is available.
- Return type:
bool
Examples
>>> isinstance(agent_available(), bool)
True
- perseus.llm.ask(prompt: str, context: dict[str, Any] | None = None, *, stream: bool = False, model: str | None = None, provider: str | None = None, system_prompt: str | None = None, timeout: float = 120.0) str | Iterator[str][source]¶
Send a prompt to the best available LLM provider and return the response.
The provider chain is: Ollama (local) -> OllamaFreeAPI -> Gemini -> OpenAI-compatible API -> OpenAI direct -> local fallback. Each provider is tried in order; on failure the next is attempted.
- Parameters:
prompt (str) – The user’s question or instruction.
context (dict[str, Any] | None) – Optional context dictionary (e.g., from build_esml_context()). Injected into the system prompt to give the LLM awareness of available modules, CPADS schema, and the user’s working directory.
stream (bool) – If True, return an iterator of string chunks for streaming output. If False (default), return the full response as a single string.
model (str | None) – Override the model identifier. When None, a sensible default is chosen per provider.
provider (str | None) – Force a specific provider ("ollama", "api", "openai", "local"). When None, detect_available_provider() is used to auto-detect.
system_prompt (str | None) – Override the entire system prompt. When None, the standard ESML system prompt is built from the context parameter.
timeout (float) – HTTP request timeout in seconds.
- Returns:
The LLM response text (or a streaming iterator of text chunks). When all providers fail, returns a local fallback help string.
- Return type:
str | Iterator[str]
Examples
>>> # Non-streaming (returns full text)
>>> response = ask("What is AIPW?")
>>> isinstance(response, str)
True
>>> # Streaming
>>> for chunk in ask("Explain TMLE", stream=True):
...     print(chunk, end="")
- perseus.llm.ask_multi(messages: list[dict[str, str]], *, stream: bool = False, model: str | None = None, provider: str | None = None, timeout: float = 120.0) str | Iterator[str][source]¶
Send a pre-built messages array to the best available LLM provider.
Unlike ask(), this accepts the full messages array directly, enabling multi-turn conversation support. The caller is responsible for constructing the system and user messages.
- Parameters:
messages (list[dict[str, str]]) – The chat messages array (system, user, assistant turns).
stream (bool) – If True, return an iterator of string chunks.
model (str | None) – Override the model identifier.
provider (str | None) – Force a specific provider. Auto-detected when None.
timeout (float) – HTTP request timeout in seconds.
- Returns:
The LLM response text (or a streaming iterator).
- Return type:
str | Iterator[str]
- perseus.llm.assistant_available() bool¶
Return True when at least one live LLM provider is available.
- Returns:
True if a live provider is detected, False if only local fallback is available.
- Return type:
bool
Examples
>>> isinstance(assistant_available(), bool)
True
- perseus.llm.build_esml_context(repo_root: str | Path | None = None) dict[str, Any][source]¶
Build an LLM-friendly context dictionary from the ESML package state.
The returned dictionary is designed to be injected into the system prompt so the LLM is aware of the available modules, the CPADS data contract, and the current working directory.
- Parameters:
repo_root (str | Path | None) – Path to the ESML repository root. When None, the function attempts to resolve the root from this file’s location.
- Returns:
A dictionary with keys:
module_list – list of module name/description pairs.
cpads_schema – the CPADS data contract dictionary.
cwd – the current working directory as a string.
repo_root – the resolved repository root, or "unknown".
- Return type:
dict[str, Any]
Examples
>>> ctx = build_esml_context()
>>> "module_list" in ctx and "cpads_schema" in ctx
True
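Before injecting the context into ask(), a quick sanity check on the documented keys can catch a partially built dictionary. The key names below come from the docstring above; the helper itself is illustrative:

```python
from typing import Any

EXPECTED_KEYS = {"module_list", "cpads_schema", "cwd", "repo_root"}


def validate_context(ctx: dict[str, Any]) -> dict[str, Any]:
    """Raise if any documented build_esml_context() key is missing."""
    missing = EXPECTED_KEYS - ctx.keys()
    if missing:
        raise KeyError(f"context missing keys: {sorted(missing)}")
    return ctx


# Hypothetical usage:
# reply = ask("Which module fits my data?",
#             context=validate_context(build_esml_context()))
```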
- perseus.llm.detect_available_provider() str[source]¶
Detect which LLM provider is currently available.
The detection order mirrors the provider chain priority:
ollama – a local Ollama instance is reachable (probed via HTTP).
freeapi – the ollamafreeapi package is installed and servers respond.
gemini – GEMINI_API_KEY is set.
api – LLM_API_BASE_URL and LLM_API_KEY are set.
openai – OPENAI_API_KEY is set.
local – no live provider; ESML will return static help text.
- Returns:
One of "ollama", "freeapi", "gemini", "api", "openai", or "local".
- Return type:
str
Examples
>>> provider = detect_available_provider()
>>> provider in ("ollama", "freeapi", "gemini", "api", "openai", "local")
True
- perseus.llm.detect_model_display() dict[str, str][source]¶
Return display info with inner (family:size) and outer (model name).
- perseus.llm.detect_provider_and_model() tuple[str, str][source]¶
Detect LLM provider and return (provider, human-readable model label).
- perseus.llm.get_last_traceback() str[source]¶
Return the last Python traceback, if any, for error-context injection.
- perseus.llm.list_freeapi_models() list[dict[str, str]][source]¶
List all available OllamaFreeAPI models from vendored JSONs.
- perseus.llm.pick_thinking_word(query: str) str[source]¶
Pick a context-aware thinking word based on the query, or a random one.
Local Ollama client for ESML.
Pure httpx-based client for a local Ollama instance running at
localhost:11434. Provides model management (pull, list, remove),
chat, and streaming — no external deps beyond httpx.
This module backs the ollama provider slot in esml.llm.
Environment Variables¶
- OLLAMA_BASE_URL : str
Override the Ollama endpoint. Default: http://localhost:11434.
- ESML_OLLAMA_MODEL : str
Override the default local model. Default: gemma4:e2b.
- class perseus.loc.LocalOllama(base_url: str | None = None, model: str | None = None, timeout: float = 300.0)[source]¶
Bases: object
Client for a local Ollama instance.
- Parameters:
base_url (str | None) – Ollama base URL. When None, falls back to OLLAMA_BASE_URL or http://localhost:11434.
model (str | None) – Default model name. When None, falls back to ESML_OLLAMA_MODEL or gemma4:e2b.
timeout (float) – HTTP request timeout in seconds. Default: 300.0.
Examples
>>> client = LocalOllama()
>>> client.is_running()
True
>>> models = client.list_models()
>>> response = client.chat("What is IPW?")
- chat(prompt: str, *, model: str | None = None, system: str | None = None, context: list[dict[str, str]] | None = None, temperature: float = 0.1, num_predict: int = 4096) str[source]¶
Send a chat message and return the full response.
- Parameters:
prompt (str) – The user message.
model (str | None) – Override the model for this call.
system (str | None) – Optional system prompt.
context (list[dict[str, str]] | None) – Prior chat turns for multi-turn context.
temperature (float) – Sampling temperature. Default: 0.1.
num_predict (int) – Maximum number of tokens to generate. Default: 4096.
- Returns:
The assistant’s response text.
- Return type:
str
- generate(prompt: str, *, model: str | None = None, system: str | None = None, stream: bool = False, temperature: float = 0.1, num_predict: int = 4096) str | Iterator[str][source]¶
Raw generation endpoint (non-chat). Returns full text or stream.
- pull(name: str, *, stream: bool = True, timeout: float = 600.0) Iterator[dict[str, Any]] | dict[str, Any][source]¶
Pull (download) a model.
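When pull() streams, it yields progress dictionaries; per the Ollama API docs the final event carries a "status" field (assumed to be "success" on completion). A small consumer sketch, illustrative only:

```python
from typing import Any, Iterable


def last_status(events: Iterable[dict[str, Any]]) -> str:
    """Drain a pull() event stream and return the final status string."""
    status = ""
    for event in events:
        # Progress events without a "status" key (e.g. byte counts)
        # leave the last seen status unchanged.
        status = event.get("status", status)
    return status


# Hypothetical usage:
# final = last_status(client.pull("mistral-nemo:custom"))
```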
- class perseus.loc.ModelInfo(name: str, size: int = 0, parameter_size: str = '', family: str = '', quantization: str = '', modified_at: str = '')[source]¶
Bases: object
Metadata for a locally available Ollama model.
Perseus — the ESML resident AI agent.
Delegates to the provider-chain in esml.llm. The LLM module handles
Ollama, Gemini, OpenAI-compatible APIs, direct OpenAI, and a local static
fallback.
When stream=True is passed to ask_percy(), the returned dictionary
contains an output_stream key (an iterator of string chunks) instead of
output_text.
- perseus.perseus.ask_esml_assistant(question: str, *, context: str | None = None, model: str | None = None, system_prompt: str = 'You are Perseus, the ESML agent for epidemiological semiparametric machine learning.\nHelp users understand datasets, methods, debugging steps, testing strategy, and interpretation.\nBe explicit about assumptions, limitations, missing data concerns, and reproducibility risks.\nDo not invent data access or approval status for restricted datasets.', allow_fallback: bool = True, stream: bool = False, use_agent: bool = True) dict[str, Any]¶
Query Perseus via the LLM provider chain.
When use_agent=True (default) and Ollama is available, Perseus uses the full agentic loop with 13 tools (search, execute, read/write, shell, data). Falls back to simple LLM chat or static text when tools are unavailable.
Returns a dict with mode, model, and either output_text (str) or output_stream (Iterator[str]).
- perseus.perseus.ask_percy(question: str, *, context: str | None = None, model: str | None = None, system_prompt: str = 'You are Perseus, the ESML agent for epidemiological semiparametric machine learning.\nHelp users understand datasets, methods, debugging steps, testing strategy, and interpretation.\nBe explicit about assumptions, limitations, missing data concerns, and reproducibility risks.\nDo not invent data access or approval status for restricted datasets.', allow_fallback: bool = True, stream: bool = False, use_agent: bool = True) dict[str, Any][source]¶
Query Perseus via the LLM provider chain.
When use_agent=True (default) and Ollama is available, Perseus uses the full agentic loop with 13 tools (search, execute, read/write, shell, data). Falls back to simple LLM chat or static text when tools are unavailable.
Returns a dict with mode, model, and either output_text (str) or output_stream (Iterator[str]).
- perseus.perseus.build_assistant_prompt(question: str, context: str | None = None) str¶
Build a prompt from a user question and optional context.
- perseus.perseus.build_prompt(question: str, context: str | None = None) str[source]¶
Build a prompt from a user question and optional context.
Perseus Relay — serve Perseus as a cloud API endpoint.
Run on Pi (or any machine with Ollama) to let remote users access Perseus with full tool-calling capabilities over the internet.
- Usage:
python -m esml.perseus_relay                    # default :8421
python -m esml.perseus_relay --port 9000        # custom port
python -m esml.perseus_relay --token mysecret   # require auth token
- Then from any machine:
esml percy --cloud https://your-server:8421 "What is Moran's I?"
Or set PERSEUS_CLOUD_URL in .env and it auto-connects.
Security: The relay only exposes Perseus agent capabilities (search, run functions, read files within sandbox). No shell access, no filesystem writes outside the project. Optional token auth for production use.
- class perseus.perseus_relay.PerseusCloudClient(url: str, token: str | None = None)[source]¶
Bases: object
Client for connecting to a remote Perseus relay.
- class perseus.perseus_relay.PerseusRelayHandler(request, client_address, server)[source]¶
Bases: BaseHTTPRequestHandler
- agent = None¶
- auth_token = None¶
- log_message(format, *args)[source]¶
Log an arbitrary message.
This is used by all other logging functions. Override it if you have specific logging wishes.
The first argument, FORMAT, is a format string for the message to be logged. If the format string contains any % escapes requiring parameters, they should be specified as subsequent arguments (it’s just like printf!).
The client ip and current date/time are prefixed to every message.
Unicode control characters are replaced with escaped hex before writing the output to stderr.