Icarus Data Catalog
Two components: the extraction pipeline (icarus/data-catalog, ID 5) that generates
game data from data.pak, and the catalog viewer (icarus/catalog, ID 11) that
provides the browsable web UI at /catalog.
Project Details
Data Pipeline (icarus/data-catalog)
| Field | Value |
|---|---|
| GitLab Project | icarus/data-catalog |
| GitLab Project ID | 5 |
| Repository | https://git.eurekaendeavors.com/icarus/data-catalog |
| Visibility | private |
| Deployment | Sibling repo on server (NOT a submodule) |
| Status | ✅ Active — 287 tables extracted |
Catalog Viewer (icarus/catalog)
| Field | Value |
|---|---|
| GitLab Project | icarus/catalog |
| GitLab Project ID | 11 |
| Repository | https://git.eurekaendeavors.com/icarus/catalog |
| Visibility | private |
| Package name | icarus-catalog |
| Portal route | /catalog |
| Deployment | Submodule in portal at icarus-catalog/ |
| Status | ✅ Live |
What It Does
Extraction Pipeline
The data catalog extracts every data table from the game's data.pak file and generates
structured JSON output plus game asset icons.
Pipeline scripts (in icarus/data-catalog):
- pak_parse.py — Extract raw data from data.pak
- pak_decompress.py — Decompress extracted blobs
- catalog_builder.py — Build structured JSON tables + summary index
- pak_diff.py — Compare versions, generate patch changelogs
- icon_extract_pipeline.py — Extract and convert game icons to WebP
Output:
- catalog/tables/ — 287 individual table JSON files
- catalog/catalog_summary.json — Category index with field metadata (~1.5 MB)
- catalog/changelog_latest.json — Patch diff data
- icons/web/ — Game asset icons (64px WebP for items, 128px for talents)
- wiki/ — Auto-generated per-table markdown docs (pushed to GitLab wiki)
- extracted/ — Organized by category (AI, Bestiary, Talents, Items, etc.)
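The output layout above can be consumed with a few lines of Python. The sketch below enumerates the per-table files under catalog/tables/ and loads one by name. The helper names `list_tables` and `load_table` are hypothetical, not part of the pipeline, and nothing is assumed about the JSON shape inside each table file.

```python
import json
from pathlib import Path
from typing import Any, Dict


def list_tables(catalog_root: Path) -> Dict[str, Path]:
    """Map each table name (the file stem) to its JSON path under catalog/tables/."""
    tables_dir = catalog_root / "catalog" / "tables"
    return {p.stem: p for p in sorted(tables_dir.glob("*.json"))}


def load_table(catalog_root: Path, name: str) -> Any:
    """Load one table by name. The internal JSON structure is not assumed here."""
    path = list_tables(catalog_root)[name]
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Calling `len(list_tables(root))` against a fresh extraction is a quick way to check the table count against the number reported in the status field.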
Catalog Viewer
The web UI that lets users browse the extracted data:
- Browse all 287 game data tables across 76+ categories
- Search and filter entries within categories
- View patch diffs — what changed between game versions
- Read-only — no save editing, no sessions
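A patch diff boils down to comparing two versions of the same table. The sketch below is one minimal way to do that, assuming each table has been reduced to a dict keyed by row name. That shape, the function name `diff_rows`, and the sample rows are illustrative assumptions. This is not the logic in pak_diff.py, nor the schema of changelog_latest.json.

```python
from typing import Dict, List


def diff_rows(old: Dict[str, dict], new: Dict[str, dict]) -> Dict[str, List[str]]:
    """Classify row names as added, removed, or changed between two versions."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    # A row counts as changed when it exists in both versions with different content.
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return {"added": added, "removed": removed, "changed": changed}


# Invented sample rows, not real game data.
old_version = {"Axe": {"Weight": 5}, "Bow": {"Weight": 2}}
new_version = {"Axe": {"Weight": 4}, "Spear": {"Weight": 3}}
print(diff_rows(old_version, new_version))
# {'added': ['Spear'], 'removed': ['Bow'], 'changed': ['Axe']}
```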
Key Data Tables
| Table | Used By | Contents |
|---|---|---|
| D_Talents.json | Prospector | Talent definitions, prerequisites, stats |
| D_TalentTrees.json | Prospector | Tree structure, archetypes |
| D_TalentRanks.json | Prospector | Rank requirements, costs |
| D_ItemsStatic.json | Prospector | Item database (names, weights, stack sizes) |
| D_ItemTemplate.json | Prospector | Item crafting templates |
| D_Mounts.json | Pets | Mount saddle data (NOT phenotype data despite the name) |
| D_Tames.json | Pets | Tameable creature definitions |
| D_MissionNPC.json | Pets | Creature variations/phenotypes (authoritative source) |
| D_BestiaryData.json | Pets | Creature type metadata |
| D_GeneticLineages.json | Pets | Breeding/genetics data |
| D_Quests.json | Prospects | Mission definitions |
| D_GreatHunts.json | Prospects | Great Hunts chain data |
| D_WorkshopItems.json | Prospector | Workshop item categories (297 items, 22 categories) |
| D_MapIcons.json | Cartographer (future) | 126 map icon definitions |
Icon Pipeline
Game asset icons are extracted and converted for web display:
| Type | Size | Format | Coverage |
|---|---|---|---|
| Item icons | 64px | WebP | 97% (multi-stage mapping) |
| Talent icons | 128px | WebP | 100% |
| Nav icons | Various | WebP | Staged (#120) |
Icon mapping uses a multi-stage strategy: direct prefix → display name → stripped key → fuzzy match.
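The four stages can be read as a fallback chain: each stage runs only if the previous one found nothing. The sketch below illustrates that idea and is not the code in icon_extract_pipeline.py. The `ITEM_` prefix, the stripping rule, the 0.8 fuzzy cutoff, and the function name `resolve_icon` are all assumptions made for the example.

```python
import difflib
import re
from typing import List, Optional, Tuple


def _strip(key: str) -> str:
    """Lowercase and drop non-alphanumerics so near-identical keys compare equal."""
    return re.sub(r"[^a-z0-9]", "", key.lower())


def resolve_icon(item_key: str, display_name: str, icon_names: List[str],
                 prefix: str = "ITEM_", cutoff: float = 0.8) -> Optional[Tuple[str, str]]:
    """Return (icon_name, stage) for the first stage that matches, else None."""
    icons = set(icon_names)

    # Stage 1: direct prefix, e.g. "Stone_Axe" -> "ITEM_Stone_Axe".
    candidate = prefix + item_key
    if candidate in icons:
        return candidate, "direct_prefix"

    # Stage 2: display name with spaces turned into underscores.
    candidate = prefix + display_name.replace(" ", "_")
    if candidate in icons:
        return candidate, "display_name"

    # Stage 3: compare keys after stripping case and punctuation.
    stripped = {_strip(name): name for name in icon_names}
    target = _strip(prefix + item_key)
    if target in stripped:
        return stripped[target], "stripped_key"

    # Stage 4: fuzzy match on the stripped forms as a last resort.
    close = difflib.get_close_matches(target, list(stripped), n=1, cutoff=cutoff)
    if close:
        return stripped[close[0]], "fuzzy"
    return None
```

Returning the stage name alongside the icon makes it easy to audit how much of the coverage depends on fuzzy matches, which are the most likely to be wrong.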
Why Two Repos?
The data catalog (icarus/data-catalog) contains generated data that is gitignored —
tables and icons are produced by the extraction pipeline and not committed to git. This
means it cannot work as a submodule (submodules carry only committed content).
The catalog viewer (icarus/catalog) is a normal pip package with committed code, so it
follows the standard submodule pattern.
On the server, icarus/data-catalog is deployed as a sibling directory alongside the
portal, and the catalog viewer reads from it at runtime.
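As a rough illustration of the sibling-directory lookup, the sketch below resolves the data catalog path relative to the portal checkout, with an optional environment override. The environment variable `ICARUS_DATA_CATALOG_DIR`, the default directory name `data-catalog`, and the function name `resolve_data_catalog` are hypothetical. The viewer's actual configuration may differ.

```python
import os
from pathlib import Path


def resolve_data_catalog(portal_root: Path,
                         env_var: str = "ICARUS_DATA_CATALOG_DIR",
                         sibling_name: str = "data-catalog") -> Path:
    """Locate the data catalog checkout: env override first, then the portal's sibling."""
    override = os.environ.get(env_var)
    if override:
        candidate = Path(override)
    else:
        candidate = portal_root.resolve().parent / sibling_name
    # The generated tables are gitignored, so a fresh clone has none until the pipeline runs.
    if not (candidate / "catalog" / "tables").is_dir():
        raise FileNotFoundError(f"No generated tables under {candidate}; has the pipeline run?")
    return candidate
```

Failing fast when catalog/tables/ is missing surfaces the most likely deployment mistake: the sibling repo was cloned but the extraction pipeline was never run.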
Tools
- repak binary at tools/repak/repak.exe — can unpack Icarus content paks
- Old pak_variation_extract.py (regex against raw bins) superseded by reading D_MissionNPC.json directly