
Metrics & Analytics

The portal captures every meaningful user action with full fidelity and zero third-party data sharing. Analytics is split into three layers:

Custom SQLite metrics — structured event data, fully under our control
Umami web analytics — self-hosted page-level and custom event analytics
Internal dashboard — aggregated view of all of the above at /metrics

M1 — Download Counts

Every file download is proxied through the portal. Before streaming begins, the event is recorded in SQLite.

Schema (downloads table):

Column	Type	Description
`id`	INTEGER PK	Auto-increment
`project`	TEXT	Project slug (`pets`)
`version`	TEXT	Release tag (`v2.1.0`)
`asset`	TEXT	Filename
`timestamp`	DATETIME	UTC
`ip_hash`	TEXT	SHA-256 of client IP (anonymised)
`user_agent`	TEXT	Client User-Agent string
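
The record-before-stream step could look roughly like the sketch below. It uses only the standard library; the function name `record_download`, the in-memory database, and the sample asset name and User-Agent are illustrative assumptions, not the portal's actual code.

```python
import hashlib
import sqlite3
from datetime import datetime, timezone

# Columns mirror the downloads table above; constraints are left out
# because the source does not specify any.
DOWNLOADS_DDL = """
CREATE TABLE IF NOT EXISTS downloads (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    project    TEXT,
    version    TEXT,
    asset      TEXT,
    timestamp  DATETIME,
    ip_hash    TEXT,
    user_agent TEXT
)
"""


def record_download(conn, project, version, asset, client_ip, user_agent):
    """Insert one download event. Intended to run before streaming starts."""
    # Only the SHA-256 digest of the IP is stored, never the raw address.
    ip_hash = hashlib.sha256(client_ip.encode()).hexdigest()
    now_utc = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO downloads "
            "(project, version, asset, timestamp, ip_hash, user_agent) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (project, version, asset, now_utc, ip_hash, user_agent),
        )
    return cur.lastrowid


# Hypothetical usage with an in-memory database and made-up request values.
conn = sqlite3.connect(":memory:")
conn.execute(DOWNLOADS_DDL)
row_id = record_download(
    conn, "pets", "v2.1.0", "pets-v2.1.0.zip", "203.0.113.7", "curl/8.5.0"
)
```

Writing the row inside a transaction before the first byte is sent means an aborted transfer is still counted as a download attempt, which matches the "before streaming begins" wording above.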

M2 — Clone Activity

Every git clone or git fetch passes through the portal's git smart HTTP proxy. The info/refs request is logged before proxying.

Schema (clones table):

Column	Type	Description
`id`	INTEGER PK	Auto-increment
`project`	TEXT	Project slug
`timestamp`	DATETIME	UTC
`ip_hash`	TEXT	SHA-256 of client IP (anonymised)
`user_agent`	TEXT	Git client User-Agent (usually `git/2.x.x`)

See Git Clone Proxy for how this works.
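
As a rough illustration of the detection step: in git's smart HTTP protocol, both `git clone` and `git fetch` begin with a request to `info/refs?service=git-upload-pack`, while pushes use `service=git-receive-pack`. The sketch below assumes a URL layout of `/<project>.git/info/refs` and a helper named `clone_project`; both are hypothetical, and filtering on `git-upload-pack` is an assumption about what the proxy chooses to log.

```python
from urllib.parse import parse_qs

INFO_REFS_SUFFIX = "/info/refs"


def clone_project(path, query_string):
    """Return the project slug if the request starts a clone/fetch, else None."""
    if not path.endswith(INFO_REFS_SUFFIX):
        return None
    # Clone and fetch advertise refs via the upload-pack service.
    service = parse_qs(query_string).get("service", [""])[0]
    if service != "git-upload-pack":
        return None
    # Assumed layout: the last path segment before /info/refs is "<slug>.git".
    repo = path[: -len(INFO_REFS_SUFFIX)].rsplit("/", 1)[-1]
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return repo or None


# Hypothetical requests as the proxy might see them.
clone_hit = clone_project("/pets.git/info/refs", "service=git-upload-pack")
push_hit = clone_project("/pets.git/info/refs", "service=git-receive-pack")
```

A non-`None` result would be the point at which a row is written to the clones table, before the request is forwarded upstream.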

M3 — Tool Usage Events

Every save file upload and download handled by the pets and prospects Blueprints is logged.

Schema (tool_usage table):

Column	Type	Description
`id`	INTEGER PK	Auto-increment
`tool`	TEXT	Blueprint slug (`pets`, `prospects`)
`event`	TEXT	`upload` or `download`
`timestamp`	DATETIME	UTC
`ip_hash`	TEXT	SHA-256 of client IP (anonymised)
`user_agent`	TEXT	Client User-Agent string

M4 — Catalog Interactions

Catalog usage is tracked to understand which data categories are most useful and what users are searching for.

Schema (catalog_interactions table):

Column	Type	Description
`id`	INTEGER PK	Auto-increment
`event`	TEXT	`category_view`, `search`, `diff_view`
`detail`	TEXT	Category name, search query, or diff target
`result_count`	INTEGER	Search result count (null for non-search events)
`timestamp`	DATETIME	UTC
`ip_hash`	TEXT	SHA-256 of client IP (anonymised)
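
A minimal sketch of how these rows might be written and read back, assuming the table layout above. The helper names `record_catalog_event` and `top_searches`, the sample queries, and the placeholder `ip_hash` values are illustrative only.

```python
import sqlite3

VALID_EVENTS = {"category_view", "search", "diff_view"}


def record_catalog_event(conn, event, detail, ip_hash, result_count=None):
    """Insert one catalog interaction; result_count is kept only for searches."""
    if event not in VALID_EVENTS:
        raise ValueError(f"unknown catalog event: {event!r}")
    if event != "search":
        result_count = None  # stored as NULL for non-search events
    with conn:
        conn.execute(
            "INSERT INTO catalog_interactions "
            "(event, detail, result_count, timestamp, ip_hash) "
            "VALUES (?, ?, ?, datetime('now'), ?)",  # datetime('now') is UTC
            (event, detail, result_count, ip_hash),
        )


def top_searches(conn, limit=10):
    """Most frequent search queries, ties broken alphabetically."""
    return conn.execute(
        "SELECT detail, COUNT(*) AS n FROM catalog_interactions "
        "WHERE event = 'search' GROUP BY detail "
        "ORDER BY n DESC, detail LIMIT ?",
        (limit,),
    ).fetchall()


# Hypothetical usage against an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE catalog_interactions ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, event TEXT, detail TEXT, "
    "result_count INTEGER, timestamp DATETIME, ip_hash TEXT)"
)
record_catalog_event(conn, "search", "dragon", "a" * 64, result_count=12)
record_catalog_event(conn, "search", "dragon", "b" * 64, result_count=12)
record_catalog_event(conn, "search", "egg", "a" * 64, result_count=0)
record_catalog_event(conn, "category_view", "Creatures", "a" * 64, result_count=5)
popular = top_searches(conn)
```

Keeping `result_count` alongside the query also makes it easy to spot searches that returned nothing, which is one way to learn what users are looking for but not finding.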

M5 — Umami Web Analytics

Umami provides self-hosted, privacy-respecting page-level analytics and custom event tracking.

Page analytics (automatic):

- Page views and unique visitors
- Referrer tracking
- Session duration
- Browser, OS, and device breakdown
- GDPR-clean: no cookies, no consent banners needed
- No data sent to third parties: runs entirely on our server

Custom events (via umami.track()):

Event name	Trigger	Payload
`save_uploaded`	File submitted to pets/prospects	`{tool}`
`save_downloaded`	Modified file sent to client	`{tool}`
`catalog_search`	Search form submitted	`{query, result_count}`
`catalog_category`	Category page viewed	`{category}`
`git_clone`	info/refs proxy hit	`{project}`

Umami runs as a systemd service on the same server (Node + SQLite, port 3001). The tracking script is served from our own server, so clients make no external requests. The Umami admin interface is exposed at a restricted path via Caddy.

M6 — Internal Metrics Dashboard (`/metrics`)

A server-rendered summary page for operator use. It is not publicly linked, and it is not accessible without a METRICS_TOKEN credential, supplied either as a query parameter or as a header.
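
The token check might be sketched as follows. The query parameter name `token` and the header name `X-Metrics-Token` are assumptions made for illustration; the source only says the credential arrives as a query param or header. The function takes plain dictionaries so it stays independent of any web framework.

```python
import hmac


def is_authorized(query_params, headers, expected_token):
    """Return True only if the request carries the expected metrics token."""
    # Hypothetical names: ?token=... or an X-Metrics-Token header.
    supplied = query_params.get("token") or headers.get("X-Metrics-Token")
    if not supplied or not expected_token:
        return False  # missing credential, or the token is not configured
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


# Hypothetical usage; in practice the expected value would come from
# the METRICS_TOKEN setting.
expected = "s3cret-operator-token"
ok_query = is_authorized({"token": "s3cret-operator-token"}, {}, expected)
ok_header = is_authorized({}, {"X-Metrics-Token": "s3cret-operator-token"}, expected)
rejected = is_authorized({"token": "wrong"}, {}, expected)
```

Refusing access when the expected token itself is empty keeps a misconfigured deployment from silently exposing the dashboard.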

Sections:

- Downloads: total, per-project, and per-version counts (last 30 days, last 90 days, all time)
- Clones: total and per-project counts (last 30 days, last 90 days, all time)
- Tool usage: save uploads and downloads per Blueprint
- Catalog: top categories viewed, top search queries
- Umami embed: iframe showing the Umami dashboard for page-level stats

The page uses simple HTML tables and no charting library. It is rendered server-side by a metrics Blueprint that queries SQLite directly.
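
One way the time-windowed download counts could be queried is sketched below. It assumes timestamps are stored in SQLite's `YYYY-MM-DD HH:MM:SS` UTC text format so that string comparison matches chronological order; the function name `download_counts` and the sample rows are hypothetical.

```python
import sqlite3


def download_counts(conn, days=None):
    """Per-project download counts for the last `days` days (None = all time)."""
    sql = "SELECT project, COUNT(*) FROM downloads"
    params = ()
    if days is not None:
        # SQLite accepts the date modifier as a bound parameter.
        sql += " WHERE timestamp >= datetime('now', ?)"
        params = (f"-{int(days)} days",)
    sql += " GROUP BY project ORDER BY project"
    return dict(conn.execute(sql, params).fetchall())


# Hypothetical data: one recent, one two-month-old, one year-old download.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE downloads ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, project TEXT, version TEXT, "
    "asset TEXT, timestamp DATETIME, ip_hash TEXT, user_agent TEXT)"
)
for project, age in [("pets", "-5 days"), ("prospects", "-60 days"), ("pets", "-400 days")]:
    conn.execute(
        "INSERT INTO downloads (project, timestamp) VALUES (?, datetime('now', ?))",
        (project, age),
    )

last_30 = download_counts(conn, 30)
last_90 = download_counts(conn, 90)
all_time = download_counts(conn)
```

The same pattern would apply to the clones table, and adding `version` to the `GROUP BY` clause would give the per-version breakdown.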

Privacy

All client IP addresses are anonymised before storage:

import hashlib
ip_hash = hashlib.sha256(client_ip.encode()).hexdigest()

Raw IP addresses are never written to SQLite. Umami handles its own anonymisation internally (HASH_SALT configured at deployment). No personal data leaves the server.

Why Not Just Use GitLab Statistics?

GitLab EE's built-in statistics are limited:

No package registry download counts exposed via API
Clone counts only available as aggregate totals, no per-client detail
No release asset download counts in the API response
No tool usage or catalog interaction data at all

By owning the proxy and instrumentation layer, we record each download, clone, tool usage event, and catalog interaction ourselves, at per-request granularity and under our own control.