Metrics & Analytics

The portal records each tracked user action (downloads, clones, tool usage, and catalog interactions) in full detail and shares no data with third parties. Analytics is split into three layers:

  1. Custom SQLite metrics — structured event data, fully under our control
  2. Umami web analytics — self-hosted page-level and custom event analytics
  3. Internal dashboard — aggregated view of all of the above at /metrics

M1 — Download Counts

Every file download is proxied through the portal. Before streaming begins, the event is recorded in SQLite.

Schema (downloads table):

| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| project | TEXT | Project slug (e.g. pets) |
| version | TEXT | Release tag (e.g. v2.1.0) |
| asset | TEXT | Filename |
| timestamp | DATETIME | UTC |
| ip_hash | TEXT | SHA-256 of client IP (anonymised) |
| user_agent | TEXT | Client User-Agent string |
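
As a rough illustration of the table above, the following sketch creates it and inserts one row before any bytes would be streamed. The NOT NULL constraints, the record_download helper name, and the sample values are assumptions for the example, not the portal's actual code.

```python
import sqlite3
from datetime import datetime, timezone

# DDL mirroring the documented columns; the NOT NULL constraints are assumed.
DOWNLOADS_DDL = """
CREATE TABLE IF NOT EXISTS downloads (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    project    TEXT NOT NULL,      -- project slug, e.g. 'pets'
    version    TEXT NOT NULL,      -- release tag, e.g. 'v2.1.0'
    asset      TEXT NOT NULL,      -- filename
    timestamp  DATETIME NOT NULL,  -- UTC, 'YYYY-MM-DD HH:MM:SS'
    ip_hash    TEXT NOT NULL,      -- SHA-256 hex digest of the client IP
    user_agent TEXT
)
"""


def record_download(conn, project, version, asset, ip_hash, user_agent):
    """Insert one download event; meant to run before streaming starts."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S")
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "INSERT INTO downloads "
            "(project, version, asset, timestamp, ip_hash, user_agent) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (project, version, asset, now, ip_hash, user_agent),
        )
    return cur.lastrowid


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(DOWNLOADS_DDL)
    record_download(conn, "pets", "v2.1.0", "pets-v2.1.0.zip", "ab" * 32, "curl/8.5.0")
    print(conn.execute("SELECT project, version, asset FROM downloads").fetchall())
```

Storing the timestamp as UTC text in the same format SQLite's own datetime() function produces keeps later time-window comparisons simple.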

M2 — Clone Activity

Every git clone or git fetch passes through the portal's git smart HTTP proxy. The info/refs request is logged before proxying.

Schema (clones table):

| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| project | TEXT | Project slug |
| timestamp | DATETIME | UTC |
| ip_hash | TEXT | SHA-256 of client IP (anonymised) |
| user_agent | TEXT | Git client User-Agent (usually git/2.x.x) |

See Git Clone Proxy for how this works.
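
In Git's smart HTTP protocol, a clone or fetch starts with a GET to .../info/refs?service=git-upload-pack, while a push asks for git-receive-pack. The sketch below shows one way a proxy might recognise the request worth logging. The /git/<project>.git/ URL layout and the function name are hypothetical; the portal's real routing may differ.

```python
from urllib.parse import parse_qs, urlsplit


def clone_project_from_url(url):
    """Return the project slug for a clone/fetch discovery request, else None.

    Assumes a hypothetical URL layout of /git/<project>.git/info/refs.
    """
    parts = urlsplit(url)
    suffix = "/info/refs"
    if not parts.path.endswith(suffix):
        return None  # not the ref-discovery step
    service = parse_qs(parts.query).get("service", [""])[0]
    if service != "git-upload-pack":
        return None  # e.g. git-receive-pack (push) or a dumb-HTTP request
    repo = parts.path[: -len(suffix)].rsplit("/", 1)[-1]
    name = repo[:-4] if repo.endswith(".git") else repo
    return name or None


if __name__ == "__main__":
    print(clone_project_from_url("/git/pets.git/info/refs?service=git-upload-pack"))
    print(clone_project_from_url("/git/pets.git/info/refs?service=git-receive-pack"))
```

Because both git clone and git fetch begin with this same request, a row in the clones table counts either operation.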


M3 — Tool Usage Events

Every save file processed by the pets and prospects Blueprints is logged.

Schema (tool_usage table):

| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| tool | TEXT | Blueprint slug (pets, prospects) |
| event | TEXT | upload or download |
| timestamp | DATETIME | UTC |
| ip_hash | TEXT | SHA-256 of client IP (anonymised) |
| user_agent | TEXT | Client User-Agent string |
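
Since tool and event each take a small fixed set of values, one option is to let SQLite enforce them with CHECK constraints. This is a sketch under that assumption; the constraints, the CURRENT_TIMESTAMP default, and the record_tool_event helper are illustrative rather than taken from the portal.

```python
import sqlite3

# CHECK constraints and the timestamp default are assumptions for this example.
TOOL_USAGE_DDL = """
CREATE TABLE IF NOT EXISTS tool_usage (
    id         INTEGER PRIMARY KEY AUTOINCREMENT,
    tool       TEXT NOT NULL CHECK (tool IN ('pets', 'prospects')),
    event      TEXT NOT NULL CHECK (event IN ('upload', 'download')),
    timestamp  DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,  -- UTC in SQLite
    ip_hash    TEXT NOT NULL,
    user_agent TEXT
)
"""


def record_tool_event(conn, tool, event, ip_hash, user_agent):
    """Insert one event; return False if a CHECK constraint rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO tool_usage (tool, event, ip_hash, user_agent) "
                "VALUES (?, ?, ?, ?)",
                (tool, event, ip_hash, user_agent),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(TOOL_USAGE_DDL)
    print(record_tool_event(conn, "pets", "upload", "ab" * 32, "Mozilla/5.0"))
    print(record_tool_event(conn, "pets", "delete", "ab" * 32, "Mozilla/5.0"))
```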

M4 — Catalog Interactions

Catalog usage is tracked to understand which data categories are most useful and what users are searching for.

Schema (catalog_interactions table):

| Column | Type | Description |
|---|---|---|
| id | INTEGER PK | Auto-increment |
| event | TEXT | category_view, search, diff_view |
| detail | TEXT | Category name, search query, or diff target |
| result_count | INTEGER | Search result count (null for non-search events) |
| timestamp | DATETIME | UTC |
| ip_hash | TEXT | SHA-256 of client IP (anonymised) |
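
To show how this table could answer "what are users searching for", here is a minimal query sketch that ranks search queries by frequency and surfaces the lowest result count seen for each, so zero-result searches stand out. The top_searches helper and the column constraints are assumptions made for the example.

```python
import sqlite3

# Column constraints and the timestamp default are assumed for this sketch.
CATALOG_DDL = """
CREATE TABLE IF NOT EXISTS catalog_interactions (
    id           INTEGER PRIMARY KEY AUTOINCREMENT,
    event        TEXT NOT NULL,     -- category_view, search, diff_view
    detail       TEXT NOT NULL,     -- category name, query, or diff target
    result_count INTEGER,           -- NULL for non-search events
    timestamp    DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    ip_hash      TEXT NOT NULL
)
"""


def top_searches(conn, limit=10):
    """Return (query, times_searched, fewest_results) for the top queries."""
    return conn.execute(
        """
        SELECT detail, COUNT(*) AS n, MIN(result_count) AS fewest_results
        FROM catalog_interactions
        WHERE event = 'search'
        GROUP BY detail
        ORDER BY n DESC, detail ASC
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(CATALOG_DDL)
    conn.executemany(
        "INSERT INTO catalog_interactions (event, detail, result_count, ip_hash) "
        "VALUES (?, ?, ?, ?)",
        [
            ("search", "dragon", 12, "ab" * 32),
            ("search", "dragon", 12, "cd" * 32),
            ("search", "unicorn", 0, "ab" * 32),
            ("category_view", "mounts", None, "ab" * 32),
        ],
    )
    print(top_searches(conn))
```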

M5 — Umami Web Analytics

Umami provides self-hosted, privacy-respecting page-level analytics and custom event tracking.

Page analytics (automatic):

  • Page views and unique visitors
  • Referrer tracking
  • Session duration
  • Browser, OS, and device breakdown
  • GDPR-clean: no cookies, so no consent banners are needed
  • No data sent to third parties; Umami runs entirely on our server

Custom events (via umami.track()):

| Event name | Trigger | Payload |
|---|---|---|
| save_uploaded | File submitted to pets/prospects | {tool} |
| save_downloaded | Modified file sent to client | {tool} |
| catalog_search | Search form submitted | {query, result_count} |
| catalog_category | Category page viewed | {category} |
| git_clone | info/refs proxy hit | {project} |

Umami runs as a systemd service on the same server (Node + SQLite, port 3001). The tracking script is served from our own server, so clients make no external requests. The Umami admin interface is exposed at a restricted path via Caddy.


M6 — Internal Metrics Dashboard (/metrics)

A server-rendered summary page for operator use. It is not publicly linked, and it cannot be accessed without a METRICS_TOKEN credential, supplied either as a query parameter or as a request header.
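
A token check of this kind might look like the sketch below. The header name X-Metrics-Token and the query parameter name token are hypothetical, since the names are not specified here, and the function takes plain dictionaries so it stays independent of any web framework.

```python
import hmac


def is_authorized(headers, query_args, expected_token):
    """Return True only if the request carries the expected metrics token.

    headers and query_args are plain dicts; the key names used here
    ('X-Metrics-Token' and 'token') are assumptions for this sketch.
    """
    if not expected_token:
        return False  # an unset METRICS_TOKEN should never grant access
    supplied = headers.get("X-Metrics-Token") or query_args.get("token") or ""
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(supplied.encode(), expected_token.encode())


if __name__ == "__main__":
    print(is_authorized({"X-Metrics-Token": "s3cret"}, {}, "s3cret"))
    print(is_authorized({}, {"token": "wrong"}, "s3cret"))
```

Passing the token in a header rather than a query parameter keeps it out of access logs and browser history, which may matter for an operator-only page.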

Sections:

  • Downloads: total, per project, and per version (last 30 days, last 90 days, all time)
  • Clones: total and per project (last 30 days, last 90 days, all time)
  • Tool usage: save uploads and downloads per Blueprint
  • Catalog: top categories viewed, top search queries
  • Umami embed: iframe showing the Umami dashboard for page-level stats

The page uses simple HTML tables and no charting library. It is rendered server-side by a metrics Blueprint that queries SQLite directly.
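
The 30-day, 90-day, and all-time figures can be computed in a single pass per table. The sketch below does this for downloads per project, assuming timestamps are stored as UTC text in SQLite's 'YYYY-MM-DD HH:MM:SS' format so that string comparison against datetime('now', ...) is valid. The downloads_by_window helper is a hypothetical name, and the table is reduced to the two columns the query needs.

```python
import sqlite3

# A comparison in SQLite evaluates to 0 or 1, so SUM() counts matching rows.
WINDOWED_DOWNLOADS_SQL = """
SELECT project,
       SUM(timestamp >= datetime('now', '-30 days')) AS last_30,
       SUM(timestamp >= datetime('now', '-90 days')) AS last_90,
       COUNT(*)                                      AS all_time
FROM downloads
GROUP BY project
ORDER BY all_time DESC, project
"""


def downloads_by_window(conn):
    """Return (project, last_30, last_90, all_time) rows for the dashboard."""
    return conn.execute(WINDOWED_DOWNLOADS_SQL).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Reduced table for the demo; the real one has more columns.
    conn.execute("CREATE TABLE downloads (project TEXT, timestamp DATETIME)")
    conn.executemany(
        "INSERT INTO downloads VALUES (?, datetime('now', ?))",
        [("pets", "-1 days"), ("pets", "-60 days"), ("pets", "-400 days")],
    )
    for row in downloads_by_window(conn):
        print(row)
```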


Privacy

All client IP addresses are anonymised before storage:

import hashlib
ip_hash = hashlib.sha256(client_ip.encode()).hexdigest()

Raw IP addresses are never written to SQLite. Umami handles its own anonymisation internally, using the HASH_SALT value configured at deployment. No personal data leaves the server.
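
One point worth weighing: the IPv4 address space holds only about 4.3 billion values, so an unkeyed SHA-256 digest of an address can be reversed by hashing every possible address. A keyed hash is a common way to harden this. The sketch below uses HMAC-SHA-256 with a server-side secret; the hash_ip helper and the idea of a dedicated secret are assumptions, not part of the current design.

```python
import hashlib
import hmac


def hash_ip(client_ip: str, secret: bytes) -> str:
    """Return a keyed SHA-256 hex digest of the client IP.

    The secret is a hypothetical server-side value; without it, the digests
    cannot be matched against a precomputed table of hashed addresses.
    """
    return hmac.new(secret, client_ip.encode(), hashlib.sha256).hexdigest()


if __name__ == "__main__":
    secret = b"example-secret-loaded-from-config"  # placeholder value
    print(hash_ip("203.0.113.7", secret))
```

The output is still a 64-character hex string, so it fits the existing ip_hash TEXT columns, and the same IP still maps to the same digest for as long as the secret is unchanged.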


Why Not Just Use GitLab Statistics?

GitLab EE's built-in statistics are limited:

  • No package registry download counts exposed via API
  • Clone counts only available as aggregate totals, no per-client detail
  • No release asset download counts in the API response
  • No tool usage or catalog interaction data at all

By owning the proxy and instrumentation layer, we capture per-request download, clone, tool usage, and catalog data that GitLab does not expose, and we keep full control over it.
