# AI Stack Registry (aistackregistry)

A public, multi-tenant registry for dependency compatibility and AI model defaults. Dated snapshots with checksums and LLM-first artifacts.

## Status

- This repository is private for now (next few weeks) while changes continue.
- The version is held at `0.1.0` during this period.

## Rationale

LLM training cutoffs make “latest version” and “current model defaults” stale. Frontier providers and package ecosystems change frequently (new model IDs, updated limits, renamed SDKs, shifting dependency constraints). This registry publishes dated, verifiable snapshots so builders and agents can fetch current data with checksums and embedded source citations.

## What this provides

- **Curated, multi-tenant stacks** with explicit priority tiers for compatibility resolution.
- **Latest compatible sets** under Python 3.14.2, with a transparent blocking report that explains why pins exist.
- **Model registry artifacts** driven by multi-provider policies (currently Gemini, Anthropic, OpenAI, and xAI), with defaults sourced from docs and token limits/modalities from provider APIs.
- **Provider alias normalization** for public model paths (for example, `google` policies publish under `/models/gemini/`).
- **LLM-first outputs**: JSON artifacts + `llms.txt` + a concise landing page.
- **Signed provenance** with checksums (and cosign signatures when available).

## Assumptions

- A Linux x86_64 marker environment is used when evaluating `requires_dist` markers for Python 3.14.2.
- `uv` is available in CI for dependency resolution (`uv pip compile`).
- `cosign` is available in CI for keyless (OIDC) signing; key-based signing happens only when `COSIGN_KEY` is explicitly set.
- Local runs can skip cosign and emit only checksums, unless `COSIGN_KEY` is explicitly set.
- Provider API keys (for example, `GEMINI_API_KEY`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, and `XAI_API_KEY`) are available in CI for fetching model metadata.
- The default stack is `google-ai-agents`.
- Network access is required to fetch authoritative sources (PyPI, model APIs/docs, raw GitHub overlays).
- Repo overlays are available via raw GitHub URLs; private overlays use a GitHub App token in CI (minted from `OVERLAY_APP_ID` + `OVERLAY_APP_PRIVATE_KEY`, exported as `GITHUB_OVERLAY_TOKEN`), and local runs can set `GITHUB_OVERLAY_TOKEN` explicitly.
- Local secrets are stored in `data/secrets/.env` (gitignored) and exported before running scripts.
- Snapshot-level `as_of` timestamps are provided in ISO 8601 UTC (e.g., `2025-01-01T00:00:00+00:00`).

## Authoritative sources

All outputs are derived from authoritative sources only, with citations embedded in artifacts and docs:

- PyPI JSON API (e.g., https://pypi.org/pypi/google-adk/json)
- pip constraints files semantics: https://pip.pypa.io/en/stable/user_guide/#constraints-files
- uv resolver (pip compile): https://docs.astral.sh/uv/pip/compile/
- Model API endpoints and docs listed in `policy/models/*.yaml` (for example, Gemini API models list/get: https://generativelanguage.googleapis.com/v1beta/models).
- Provider model docs (for example, Gemini model cards, thinking defaults, media resolution).
- SDK release sources listed in `policy/models/*.yaml` (for example, python-genai releases).
- llms.txt spec: https://llmstxt.org/
- Sigstore/cosign docs: https://docs.sigstore.dev/

## Repo structure

- `policy/`: curated stacks, repos, models, and registry configuration
- `scripts/`: fetch, resolve, build, sign, publish
- `schemas/`: JSON schemas for artifacts
- `site/`: landing page source (static HTML/CSS)
- `public/`: build output (GitHub Pages)
- `examples/`: example repo overlay file
- `tests/`: unit tests

GitHub Pages is published from the `gh-pages` branch with the `CNAME` set to `aistackregistry.com`.

## How it works

### Dependency compatibility

- Stacks define **priority tiers** (highest to lowest).
- Curated package entries accept PEP 508 requirement strings (e.g., `google-adk[a2a]`); pins and reports are keyed by the project name.
- For each tier, `uv pip compile` resolves the latest compatible versions under Python 3.14.2, while **pinning higher tiers** to previously resolved versions.
- The final output includes `constraints.txt`, `constraints.json`, and `compat_report.json`.
- `compat_report.json` explains why a package is not latest (e.g., `google-adk` pinning `fastapi<0.124`) using pinned-version `requires_dist` from PyPI.

### Ecosystem packages (metadata only)

- Stacks may also list `ecosystem_packages` (for example, `npm` packages or `go` modules) for discovery.
- These entries are surfaced in `index.json` under each stack and are **not** part of the Python constraints resolution pipeline.

### Repo overlays

- Repos can add dependencies or adjust tier placement with an overlay file (see `examples/ai-stack.yaml`).
- `policy/repos.yaml` lists repos and their raw overlay URLs.
- CI uses `actions/create-github-app-token` to export `GITHUB_OVERLAY_TOKEN` for private overlay fetches; local runs can set `GITHUB_OVERLAY_TOKEN` explicitly.

### Model registry

- `scripts/fetch_models.py` pulls provider model metadata for every policy in `policy/models/*.yaml` (uses provider-specific API keys).
- `scripts/snapshot_docs.py` captures doc HTML from policy URLs and computes sha256 for auditability.
- `scripts/build_models.py` merges API data + curated defaults from policy into `spec.json` and `recommended_defaults.json`, emitting `/models///...`.
- Public model paths normalize provider aliases (for example, `google` policies publish under `/models/gemini/`); the `provider` field in payloads remains canonical.

### LLM-first outputs

- Root `llms.txt` and per-snapshot `llms.txt` list stable artifact URLs.
- `index.json` enumerates stacks, repos, models, and latest snapshot metadata.
- Every published HTML page emits `index.md`, `index.txt`, and `index.json` variants (discoverable via `rel="alternate"`, `llms.txt`, and `sitemap.xml`).
- `/latest/` is an alias to the most recent snapshot for stable links.

## Local debugging (non-authoritative)

Official validation/publishing happens via GitHub Actions only; local runs are debug-only and not accepted for PR validation. See `docs/OPERATIONS.md`.

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

AS_OF=$(date -u +"%Y-%m-%dT%H:%M:%S+00:00")

# Fetch authoritative sources
python scripts/fetch_pypi.py --all-stacks
python scripts/fetch_models.py --allow-missing-key
python scripts/snapshot_docs.py --as-of "$AS_OF"
python scripts/diff_docs.py

# Build and publish a snapshot
python scripts/publish.py --as-of "$AS_OF"
```

## Verifying provenance

- Checksums: `public/provenance/checksums.json`
- Cosign signatures (if configured): `public/provenance/signatures/`
- Cosign bundle (keyless): `public/provenance/signatures/checksums.json.bundle`
- Bundle verification (keyless CI): `cosign verify-blob --bundle public/provenance/signatures/checksums.json.bundle --certificate-identity https://github.com/hey-jj/ai-stack-registry/.github/workflows/daily.yml@refs/heads/main --certificate-oidc-issuer https://token.actions.githubusercontent.com public/provenance/checksums.json`
- Doc snapshots: `public/provenance/docs/`

## What you can verify

- Dated snapshots with checksums and embedded source citations.
- Compatibility constraints with clear `blocked_by` evidence.
- Model defaults/specs derived from provider APIs and docs (no inference from training data).

## License

Apache-2.0. See `LICENSE`.
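
The checksum entry listed under Verifying provenance can be exercised locally with a few lines of Python. The following is a minimal sketch, not the registry's own tooling: it assumes `checksums.json` is a flat JSON object mapping paths relative to `public/` to hex SHA-256 digests, which this README does not confirm, and the helper names `sha256_file` and `verify_checksums` are hypothetical.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in 64 KiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(manifest: dict[str, str], root: Path) -> list[str]:
    """Return relative paths that are missing or whose digest does not match.

    ASSUMPTION: `manifest` maps a path relative to `root` to a hex SHA-256
    digest. The real layout of checksums.json may differ.
    """
    failures = []
    for rel_path, expected in sorted(manifest.items()):
        target = root / rel_path
        if not target.is_file() or sha256_file(target) != expected.lower():
            failures.append(rel_path)
    return failures


if __name__ == "__main__":
    root = Path("public")
    manifest_path = root / "provenance" / "checksums.json"
    if manifest_path.is_file():
        # ASSUMPTION: the manifest is a flat {"path": "hexdigest"} object.
        manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
        bad = verify_checksums(manifest, root)
        print("all checksums match" if not bad else f"mismatched or missing: {bad}")
```

If the real manifest nests its entries or uses a different digest encoding, adapt the loader before relying on the result. A digest match only shows that the files agree with the manifest; the cosign bundle check is what ties the manifest itself to the CI workflow.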