Metrics & Analytics
The portal captures every meaningful user action with full fidelity and zero third-party data sharing. Analytics is split into three layers:
- Custom SQLite metrics: structured event data, fully under our control
- Umami web analytics: self-hosted page-level and custom event analytics
- Internal dashboard: aggregated view of all of the above at /metrics
M1 — Download Counts
Every file download is proxied through the portal. Before streaming begins, the event is recorded in SQLite.
Schema (downloads table):
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| project | TEXT | Project slug (pets) |
| version | TEXT | Release tag (v2.1.0) |
| asset | TEXT | Filename |
| timestamp | DATETIME | UTC |
| ip_hash | TEXT | SHA-256 of client IP (anonymised) |
| user_agent | TEXT | Client User-Agent string |
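
The record-then-stream order described above can be sketched as follows. This is a minimal illustration, not the portal's actual code: the DDL constraints, `record_download`, and `serve_download` are hypothetical names and choices derived only from the column table.

```python
import sqlite3
from datetime import datetime, timezone

# Assumed DDL derived from the column table; NOT NULL constraints are illustrative.
DOWNLOADS_DDL = """
CREATE TABLE IF NOT EXISTS downloads (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    project    TEXT NOT NULL,
    version    TEXT NOT NULL,
    asset      TEXT NOT NULL,
    timestamp  DATETIME NOT NULL,
    ip_hash    TEXT NOT NULL,
    user_agent TEXT
)
"""


def record_download(conn, project, version, asset, ip_hash, user_agent):
    """Insert one download event with a UTC timestamp and commit it."""
    now_utc = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute(
        "INSERT INTO downloads "
        "(project, version, asset, timestamp, ip_hash, user_agent) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (project, version, asset, now_utc, ip_hash, user_agent),
    )
    conn.commit()
    return cur.lastrowid


def serve_download(conn, project, version, asset, ip_hash, user_agent, chunks):
    """Hypothetical handler: the event is committed when the generator is
    first advanced, before the first upstream chunk is yielded."""
    record_download(conn, project, version, asset, ip_hash, user_agent)
    yield from chunks
```

Committing before the first chunk means an interrupted transfer still counts as a download attempt, which matches the "before streaming begins" wording.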
M2 — Clone Activity
Every git clone or git fetch passes through the portal's git smart HTTP proxy.
The info/refs request is logged before proxying.
Schema (clones table):
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| project | TEXT | Project slug |
| timestamp | DATETIME | UTC |
| ip_hash | TEXT | SHA-256 of client IP (anonymised) |
| user_agent | TEXT | Git client User-Agent (usually git/2.x.x) |
See Git Clone Proxy for how this works.
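
In git's smart HTTP protocol, a clone or fetch begins with a `GET .../info/refs?service=git-upload-pack` request, so that is the request to recognise before logging. The sketch below shows one way to pick it out and derive a project slug. It assumes repository URLs end in `<slug>.git`; the function name `parse_info_refs` is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def parse_info_refs(url):
    """Return the project slug if `url` is a clone/fetch discovery request,
    otherwise None. Assumes paths shaped like /<prefix>/<slug>.git/info/refs."""
    parts = urlsplit(url)
    suffix = ".git/info/refs"
    if not parts.path.endswith(suffix):
        return None
    # git-upload-pack serves clone and fetch; git-receive-pack (push) is ignored.
    if parse_qs(parts.query).get("service", []) != ["git-upload-pack"]:
        return None
    repo_path = parts.path[: -len(suffix)]
    slug = repo_path.rstrip("/").rsplit("/", 1)[-1]
    return slug or None
```

A proxy handler could call this first and write a `clones` row only when a slug is returned.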
M3 — Tool Usage Events
Every save file processed by the pets and prospects Blueprints is logged.
Schema (tool_usage table):
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| tool | TEXT | Blueprint slug (pets, prospects) |
| event | TEXT | upload or download |
| timestamp | DATETIME | UTC |
| ip_hash | TEXT | SHA-256 of client IP (anonymised) |
| user_agent | TEXT | Client User-Agent string |
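
Because `tool` and `event` each take a small fixed set of values, the table could enforce them itself. The sketch below assumes `CHECK` constraints and a `CURRENT_TIMESTAMP` default (which SQLite evaluates in UTC); neither is stated in the schema above, and `log_tool_event` is a hypothetical helper.

```python
import sqlite3

# Assumed DDL: the CHECK constraints and timestamp default are illustrative.
TOOL_USAGE_DDL = """
CREATE TABLE IF NOT EXISTS tool_usage (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    tool       TEXT NOT NULL CHECK (tool IN ('pets', 'prospects')),
    event      TEXT NOT NULL CHECK (event IN ('upload', 'download')),
    timestamp  DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    ip_hash    TEXT NOT NULL,
    user_agent TEXT
)
"""


def log_tool_event(conn, tool, event, ip_hash, user_agent):
    """Insert one tool usage row; SQLite rejects unknown tool or event values."""
    conn.execute(
        "INSERT INTO tool_usage (tool, event, ip_hash, user_agent) "
        "VALUES (?, ?, ?, ?)",
        (tool, event, ip_hash, user_agent),
    )
    conn.commit()
```

With these constraints, a typo such as `event="uplaod"` raises `sqlite3.IntegrityError` instead of silently skewing the counts.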
M4 — Catalog Interactions
Catalog usage is tracked to understand which data categories are most useful and what users are searching for.
Schema (catalog_interactions table):
| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| event | TEXT | category_view, search, diff_view |
| detail | TEXT | Category name, search query, or diff target |
| result_count | INTEGER | Search result count (null for non-search events) |
| timestamp | DATETIME | UTC |
| ip_hash | TEXT | SHA-256 of client IP (anonymised) |
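
As a rough illustration of how this table answers "what are users searching for", the sketch below logs events and ranks search queries by frequency. The DDL details and the helper names `log_catalog_event` and `top_searches` are assumptions, not part of the documented design.

```python
import sqlite3

# Assumed DDL derived from the column table; defaults are illustrative.
CATALOG_DDL = """
CREATE TABLE IF NOT EXISTS catalog_interactions (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    event        TEXT NOT NULL,
    detail       TEXT NOT NULL,
    result_count INTEGER,
    timestamp    DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    ip_hash      TEXT NOT NULL
)
"""


def log_catalog_event(conn, event, detail, ip_hash, result_count=None):
    """Insert one interaction; result_count is stored only for searches."""
    if event != "search":
        result_count = None
    conn.execute(
        "INSERT INTO catalog_interactions (event, detail, result_count, ip_hash) "
        "VALUES (?, ?, ?, ?)",
        (event, detail, result_count, ip_hash),
    )
    conn.commit()


def top_searches(conn, limit=10):
    """Return (query, hits) pairs, most frequent first, ties broken by query."""
    return conn.execute(
        "SELECT detail, COUNT(*) AS hits FROM catalog_interactions "
        "WHERE event = 'search' "
        "GROUP BY detail ORDER BY hits DESC, detail ASC LIMIT ?",
        (limit,),
    ).fetchall()
```

The same `GROUP BY detail` shape, filtered on `event = 'category_view'`, would give the most viewed categories.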
M5 — Umami Web Analytics
Umami provides self-hosted, privacy-respecting page-level analytics and custom event tracking.
Page analytics (automatic):
- Page views and unique visitors
- Referrer tracking
- Session duration
- Browser, OS, device breakdown
- GDPR-clean: no cookies, no consent banners needed
- No data sent to third parties: runs entirely on our server
Custom events (via umami.track()):
| Event name | Trigger | Payload |
|---|---|---|
| save_uploaded | File submitted to pets/prospects | {tool} |
| save_downloaded | Modified file sent to client | {tool} |
| catalog_search | Search form submitted | {query, result_count} |
| catalog_category | Category page viewed | {category} |
| git_clone | info/refs proxy hit | {project} |
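
`umami.track()` runs in the browser, but a git client never loads a page, so an event like `git_clone` would presumably have to be reported from the server. The sketch below builds a request body in the shape used by Umami v2's `POST /api/send` collection endpoint. Treat the field names as an assumption to verify against the deployed Umami version; `build_umami_event` and the placeholder website ID are hypothetical.

```python
import json


def build_umami_event(website_id, hostname, name, data, url="/"):
    """Build a JSON-serialisable body for a server-side custom event.
    Field names follow Umami v2's /api/send format (assumed, not verified
    against this deployment)."""
    return {
        "type": "event",
        "payload": {
            "website": website_id,  # website ID from the Umami admin UI
            "hostname": hostname,
            "url": url,
            "name": name,           # event name from the table above
            "data": data,           # event payload from the table above
        },
    }


# Example: the git_clone event with its {project} payload.
body = json.dumps(
    build_umami_event("WEBSITE-ID-PLACEHOLDER", "portal.example", "git_clone",
                      {"project": "pets"})
)
```

The resulting JSON would then be posted to the local Umami instance with a `Content-Type: application/json` header and a realistic `User-Agent`, since Umami tends to discard requests it classifies as bots.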
Umami runs as a systemd service on the same server (Node + SQLite, port 3001). The tracking script is served from our own server, so clients make no external requests. The Umami admin interface is exposed at a restricted path via Caddy.
M6 — Internal Metrics Dashboard (/metrics)
A server-rendered summary page for operator use. Not publicly linked or accessible
without a METRICS_TOKEN credential (query param or header).
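
A minimal sketch of the credential check, assuming the token arrives either as a `token` query parameter or an `X-Metrics-Token` header. Both names are hypothetical, since the text only says "query param or header".

```python
import hmac


def is_authorised(expected_token, query_params, headers):
    """Return True only if the supplied token matches METRICS_TOKEN.
    `query_params` and `headers` are plain dicts in this sketch."""
    supplied = query_params.get("token") or headers.get("X-Metrics-Token")
    if not expected_token or not supplied:
        # Refuse access when the token is unset on either side.
        return False
    # Constant-time comparison avoids leaking the token via response timing.
    return hmac.compare_digest(supplied.encode(), expected_token.encode())
```

Passing the token in a header keeps it out of access logs and browser history, which a query parameter does not.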
Sections:
- Downloads: total, per-project, and per-version (last 30 days / last 90 days / all time)
- Clones: total and per-project (last 30 days / last 90 days / all time)
- Tool usage: save uploads and downloads per Blueprint
- Catalog: top categories viewed, top search queries
- Umami embed: iframe showing the Umami dashboard for page-level stats
Simple HTML tables — no charting library. Rendered server-side by a metrics
Blueprint that queries SQLite directly.
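
The windowed counts and plain HTML tables could be produced along these lines. This is a sketch assuming timestamps are stored as `YYYY-MM-DD HH:MM:SS` UTC text; `downloads_per_project` and `render_table` are hypothetical helpers.

```python
import html
import sqlite3


def downloads_per_project(conn, days=None):
    """Count downloads per project over the last `days` days (None = all time)."""
    sql = "SELECT project, COUNT(*) FROM downloads"
    params = ()
    if days is not None:
        # SQLite's datetime('now', '-30 days') yields a UTC cutoff string.
        sql += " WHERE timestamp >= datetime('now', ?)"
        params = (f"-{int(days)} days",)
    sql += " GROUP BY project ORDER BY COUNT(*) DESC, project"
    return conn.execute(sql, params).fetchall()


def render_table(headers, rows):
    """Render rows as a bare HTML table, escaping every cell."""
    head = "".join(f"<th>{html.escape(str(h))}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{html.escape(str(c))}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"
```

Escaping matters here because values such as search queries and User-Agent strings are user-supplied and end up in operator-facing HTML.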
Privacy
All client IP addresses are anonymised before storage:

    import hashlib
    ip_hash = hashlib.sha256(client_ip.encode()).hexdigest()

Raw IP addresses are never written to SQLite. Umami handles its own anonymisation
internally (HASH_SALT configured at deployment). No personal data leaves the server.
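
One caveat about the hashing line above: the IPv4 address space is small enough that an unkeyed SHA-256 digest can be reversed by hashing every possible address. A possible hardening, shown here as an alternative and not as the portal's current method, is a keyed hash (HMAC-SHA-256) with a server-side secret. The `secret_key` parameter and the `hash_ip` name are assumptions.

```python
import hashlib
import hmac


def hash_ip(client_ip: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 digest of the client IP as 64 hex characters.
    `secret_key` is a hypothetical deployment secret kept out of the database."""
    return hmac.new(secret_key, client_ip.encode("utf-8"), hashlib.sha256).hexdigest()
```

The output is still a 64-character hex string, so it fits the existing `ip_hash TEXT` columns, and the same IP still maps to the same value for unique-client counts as long as the key is unchanged.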
Why Not Just Use GitLab Statistics?
GitLab EE's built-in statistics are limited:
- No package registry download counts exposed via API
- Clone counts only available as aggregate totals, no per-client detail
- No release asset download counts in the API response
- No tool usage or catalog interaction data at all
By owning the proxy and instrumentation layer, we capture everything with full fidelity and full control.