Source: README.md.
# AI Stack Registry

AI Stack Registry publishes dated files for package compatibility and AI model defaults. Use JSON when you want data. Use Markdown when you want to read it. Snapshots include checksums, signatures when configured, and source links.

## Status

- This repository is private for now while changes settle.
- The current public contract version is `0.1.2`.

## Why this exists

Model IDs, SDK defaults, package constraints, and docs move fast. This repo gives tools a current place to check during a run, with dated snapshots and source links.

## What you get

- **Tracked stacks** with explicit priority tiers for compatibility checks.
- **Latest compatible package sets** under the configured Python baseline (`policy/registry.yaml` -> `python_version`), with a report that explains why pins exist.
- **Model specs and recommended defaults** for Gemini, Anthropic, OpenAI, and xAI, with defaults from docs and token limits/modalities from provider APIs.
- **Stable provider paths** for public model files, for example `google` policies publish under `/models/google/`.
- **Fetchable files**: JSON, Markdown, plain text, and `llms.txt`.
- **Audit files** with checksums and cosign signatures when available.

## Assumptions

- Linux, x86_64 marker environment is used when evaluating `requires_dist` markers for the configured Python baseline (`policy/registry.yaml` -> `python_version`).
- `uv` is available in CI for dependency resolution (`uv pip compile`).
- `cosign` is available in CI for keyless (OIDC) signing; key-based signing happens only when `COSIGN_KEY` is explicitly set.
- Provider API keys (for example, `GEMINI_API_KEY`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, and `XAI_API_KEY`) are available in CI for fetching model metadata.
- The default stack is `google-ai-agents`.
- Network access is required to fetch upstream sources (PyPI, model APIs/docs, raw GitHub overlays).
- Repo overlays are available via raw GitHub URLs; CI can export `GITHUB_OVERLAY_TOKEN` from a configured GitHub App for raw GitHub overlay fetches.
- Local secrets, when needed for human-only bootstrap or debugging, are stored in `data/secrets/.env` (gitignored). They are not valid validation or publishing evidence.
- Snapshot timestamps use ISO 8601 UTC, for example `2025-01-01T00:00:00+00:00`.

## Sources

Outputs come from listed upstream sources, with citations embedded in artifacts and docs:

- PyPI JSON API (e.g., https://pypi.org/pypi/google-adk/json)
- pip constraints files semantics: https://pip.pypa.io/en/stable/user_guide/#constraints-files
- uv resolver (pip compile): https://docs.astral.sh/uv/pip/compile/
- Python release index (baseline drift source): https://www.python.org/downloads/
- Model API endpoints and docs listed in `policy/models/*.yaml` (for example, Gemini API models list/get: https://generativelanguage.googleapis.com/v1beta/models).
- Provider model docs (for example, Gemini model cards, thinking defaults, media resolution).
- SDK release sources listed in `policy/models/*.yaml` (for example, python-genai releases).
- llms.txt spec: https://llmstxt.org/
- Sigstore/cosign docs: https://docs.sigstore.dev/

## Repo structure

- `policy/`: tracked stacks, repos, models, and registry configuration
- `scripts/`: fetch, resolve, build, sign, publish
- `schemas/`: JSON schemas for artifacts
- `site/`: landing page source (static HTML/CSS)
- `templates/`: source templates for rendered repo-facing docs
- `public/`: static output for the Cloudflare Pages direct-upload deploy
- `examples/`: example repo overlay file
- `tests/`: unit tests

Retained snapshots and public doc evidence live in Cloudflare R2 under the explicit `retained_state` contract in `policy/registry.yaml`, including manifest-backed bundle artifacts for deterministic restore.

Cloudflare Pages is the active production host.
`main` uploads `public/` to the production Pages project `aistackregistry`, and GitHub Actions uses the dispatch-only `cloudflare-pages-production-cutover.yml` workflow for read-only checks of the production custom-domain contract. Any one-time bootstrap or repair of Pages domains or DNS remains a local manual step. Non-main validation uploads `public/` to the staging Pages project `aistackregistry-staging` on the exact branch name, which keeps previews out of the production Pages project and production custom-domain path.

## How it works

### Dependency compatibility

- Stacks define **priority tiers** (highest to lowest).
- Curated package entries accept PEP 508 requirement strings (e.g., `google-adk[a2a]`); pins and reports are keyed by the project name.
- For each tier, `uv pip compile` resolves the latest compatible versions under the configured Python baseline (`policy/registry.yaml` → `python_version`), while **pinning higher tiers** to previously resolved versions.
- The final output includes `constraints.txt`, `constraints.json`, and `compat_report.json`.
- `compat_report.json` explains why a package is not latest (e.g., `google-adk` pinning `fastapi<0.124`) using pinned-version `requires_dist` from PyPI.

### Python baseline drift guard

- `scripts/check_python_version_drift.py` is a fail-fast guard that enforces baseline consistency across `policy/registry.yaml`, workflow runtime pins, and stack policy pins.
- The guard fetches `https://www.python.org/downloads/` (configured as `source_urls.python_releases`) and compares the latest `3.14.x` patch against the pinned registry baseline.
- Workflow: `.github/workflows/python-version-drift.yml` runs on a daily schedule and `workflow_dispatch`.
- On drift or pin mismatch, the workflow fails with an explicit error; no fallback behavior is allowed.
- Python maintenance releases are a freshness input, not an optional cleanup task. Treat a new `3.14.x` patch like other upstream updates.
- Python baseline cutovers must update the registry baseline, stack pins, every `actions/setup-python` pin enforced by the guard, and the published constraints/site surfaces that embed the baseline version.
- Because Python baseline changes alter published artifact paths, validation evidence must include SHA-matched `ci.yml` and `daily.yml` runs on the PR head commit and again on the merge commit.

### Ecosystem packages (metadata only)

- Stacks may also list `ecosystem_packages` (for example, `npm` packages or `go` modules) for discovery.
- These entries are surfaced in `index.json` under each stack and are **not** part of the Python constraints resolution pipeline.

### Repo overlays

- Repos can add dependencies or adjust tier placement with an overlay file (see `examples/ai-stack.yaml`).
- `policy/repos.yaml` lists repos and their raw overlay URLs.
- `daily.yml` exports `GITHUB_OVERLAY_TOKEN` from a repo-scoped, contents-read GitHub App token when `OVERLAY_APP_ID` and `OVERLAY_APP_PRIVATE_KEY` are configured.
- The current `policy/repos.yaml` entry points at a public raw GitHub overlay in this repository, so this repo's GitHub Actions evidence does not yet prove private external overlay retrieval end to end.

### Model registry

- `scripts/fetch_models.py` pulls provider model metadata for every policy in `policy/models/*.yaml` (uses provider-specific API keys).
- `scripts/snapshot_docs.py` captures docs from policy URLs, recording both raw snapshot hashes and normalized content hashes for auditability.
- `scripts/build_models.py` merges API data + policy-backed control metadata into `spec.json` and `recommended_defaults.json`, emitting `/models/<provider>/<model_id>/...`.
- Public model paths use canonical provider names (for example, `google` policies publish under `/models/google/`); the `provider` field in payloads remains canonical.
- Docs-backed control verification must parse provider-authored model names, tables, and prose from policy-listed sources.
  Do not key extractors to transient documentation UI component names; if provider docs drift, keep the failing evidence and update the policy extractor or policy value explicitly.

### Files tools can fetch

- Root `llms.txt` and per-snapshot `llms.txt` list stable artifact URLs.
- `index.json` enumerates stacks, repos, models, and latest snapshot metadata.
- Model lookup endpoints include explicit `lookup` URL maps and URI templates. Fetch the emitted URLs instead of guessing sibling paths.
- Every published HTML page emits `index.md`, `index.txt`, and `index.json` variants (discoverable via `rel="alternate"`, `llms.txt`, and `sitemap.xml`).
- Model lookup indexes (`/models/index.json`, `/models/providers/index.json`, `/models/<provider>/index.json`) also emit `index.md` and `index.txt`.
- `/latest/` is an alias to the most recent snapshot for stable links.

## Validation and publishing

Validation and publishing happen only in GitHub Actions. Use `ci.yml` for tests and fixture validation. Use `daily.yml` for snapshot generation, provider verification, artifact validation, retained-state sync, signing, and Cloudflare Pages upload. Local script execution is debug-only and is not accepted as validation or publishing evidence.

Missing required provider keys or secrets must fail in CI. Do not use missing-key skips or partial local output as evidence for a PR.

## Checking the files

- Checksums and signatures are published for audit checks.
- Public-safe doc evidence is published with each snapshot.

## What you can check

- Dated snapshots with checksums and source citations.
- Compatibility constraints with clear `blocked_by` evidence.
- Model defaults/specs derived from provider APIs and docs (no inference from training data).

## License

Apache-2.0. See `LICENSE`.
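
## Example: checking a downloaded file against a checksum

The audit checks mentioned under "Checking the files" could look roughly like the sketch below. This README does not state which digest algorithm or checksum file layout the registry uses, so the sketch assumes SHA-256 and a `sha256sum`-style line (`<hex digest>  <file name>`). The function names and the `constraints.json` payload are hypothetical and are not part of this repository's scripts.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of `data`."""
    return hashlib.sha256(data).hexdigest()


def parse_checksum_line(line: str) -> tuple[str, str]:
    """Split a `sha256sum`-style line into (hex digest, file name).

    Assumption: the layout is `<digest><whitespace><name>`; the registry's
    real checksum format may differ.
    """
    digest, name = line.strip().split(maxsplit=1)
    # `sha256sum` prefixes the name with `*` in binary mode; drop it.
    return digest.lower(), name.lstrip("*")


def matches_checksum(data: bytes, checksum_line: str) -> bool:
    """Return True when `data` hashes to the digest on `checksum_line`."""
    expected, _name = parse_checksum_line(checksum_line)
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(sha256_hex(data), expected)


if __name__ == "__main__":
    # Stand-in bytes for a fetched artifact (hypothetical content).
    payload = b'{"example": true}\n'
    line = f"{sha256_hex(payload)}  constraints.json"
    print(matches_checksum(payload, line))         # True
    print(matches_checksum(payload + b" ", line))  # False
```

A digest match only shows that the bytes equal what the checksum file lists. Confirming that the checksum file itself is authentic is a separate step that relies on the published cosign signatures.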