Hybrid On-prem

Infra BOM Milestone Gatherers

The Git Activity Gatherer connects to Git providers (GitHub, GitLab, Bitbucket, Azure Repos), discovers and/or iterates repositories, optionally clones repositories, extracts analytics/metadata, and exports results to object storage (e.g., S3) or SFTP. The service runs as a single containerized application on a Linux host.

To integrate with our platform, you will need the following:

  • Instructions for creating a dedicated On-prem machine, including SSH access for setup and ongoing maintenance.
  • Access to your Git provider and project management (PM) systems.

Dedicated On-prem machine specs

1. Hardware requirements (by deployment size)

These tiers are guidelines. Actual sizing depends on repo count, repo size/history depth, and concurrency.

Recommended

CPU | MEMORY | STORAGE
16 vCPU | 32 GB RAM | 1 TB+ SSD

Disk breakdown

  • Application & images: ~10 GB
  • Cache/working clones: 100–800 GB (dominant, depends on repo sizes & concurrency)
  • Logs: 10–20 GB (rotate/retain as per policy)
  • Buffer/working headroom: +20–30% of the above
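
The disk breakdown above can be turned into a quick sizing estimate. The sketch below is only an illustration of the arithmetic: the function name estimate_disk_gb is hypothetical, and the inputs are the low and high ends of the ranges listed above.

```python
def estimate_disk_gb(app_gb: float, cache_gb: float, logs_gb: float, headroom: float) -> float:
    """Sum the disk components and add fractional headroom on top (0.20 means +20%)."""
    base = app_gb + cache_gb + logs_gb
    return base * (1 + headroom)


# Low end: 10 GB app, 100 GB cache, 10 GB logs, +20% headroom.
low = estimate_disk_gb(10, 100, 10, 0.20)
# High end: 10 GB app, 800 GB cache, 20 GB logs, +30% headroom.
high = estimate_disk_gb(10, 800, 20, 0.30)

print(f"Estimated disk: {low:.0f} GB to {high:.0f} GB")
```

The high end comes out slightly above 1 TB, which is consistent with the 1 TB+ SSD recommendation.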

2. Operating system requirements (64‑bit only)

Supported

  • Ubuntu 22.04 LTS, 24.04 LTS

Required OS features

  • systemd (for service management)
  • 64‑bit kernel (≥ 3.10)
  • Working DNS & NTP
  • SSD storage recommended (high I/O)
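
A host can be checked against these OS requirements before installation. The following is a minimal sketch, not an official pre-flight tool: the helper names are hypothetical, the 64-bit test is a heuristic based on the machine string, and systemd is detected by the presence of /run/systemd/system.

```python
import os
import platform
import re

MIN_KERNEL = (3, 10)  # minimum kernel version listed above


def parse_kernel(release: str) -> tuple:
    """Extract (major, minor) from a release string such as '5.15.0-91-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_ok(release: str) -> bool:
    """True when the kernel release is at least MIN_KERNEL."""
    return parse_kernel(release) >= MIN_KERNEL


def systemd_running() -> bool:
    """True when the host was booted with systemd (directory exists only then)."""
    return os.path.isdir("/run/systemd/system")


def looks_64bit() -> bool:
    """Heuristic: machine strings such as 'x86_64' or 'aarch64' contain '64'."""
    return "64" in platform.machine()


if __name__ == "__main__" and platform.system() == "Linux":
    print("kernel >= 3.10:", kernel_ok(platform.release()))
    print("systemd running:", systemd_running())
    print("64-bit machine:", looks_64bit())
```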

3. Runtime requirements (Docker only)

  • Docker Engine: 24.x+ (latest stable recommended)
  • Docker Compose Plugin: latest stable
  • User permissions: service user in docker group or sudo for Docker
  • Socket: /var/run/docker.sock accessible to the service user
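
The runtime requirements above can be verified from the service user's account. The sketch below assumes the docker CLI is on the PATH and reads the engine version with docker version --format; the helper names are hypothetical and the script is illustrative rather than part of the product.

```python
import os
import shutil
import subprocess
from typing import Optional

MIN_DOCKER_MAJOR = 24                    # "24.x+" from the list above
DOCKER_SOCKET = "/var/run/docker.sock"   # socket path from the list above


def docker_major(version: str) -> int:
    """Return the major component of a version string such as '24.0.7'."""
    return int(version.strip().split(".")[0])


def meets_minimum(version: str) -> bool:
    """True when the engine major version is at least MIN_DOCKER_MAJOR."""
    return docker_major(version) >= MIN_DOCKER_MAJOR


def socket_accessible(path: str = DOCKER_SOCKET) -> bool:
    """True when the current user can read and write the Docker socket."""
    return os.path.exists(path) and os.access(path, os.R_OK | os.W_OK)


def server_version() -> Optional[str]:
    """Ask the Docker daemon for its version; None if the CLI or daemon is unavailable."""
    if shutil.which("docker") is None:
        return None
    result = subprocess.run(
        ["docker", "version", "--format", "{{.Server.Version}}"],
        capture_output=True,
        text=True,
    )
    version = result.stdout.strip()
    return version if result.returncode == 0 and version else None


if __name__ == "__main__":
    version = server_version()
    print("Docker Engine:", version or "not reachable")
    if version:
        print("Meets 24.x+:", meets_minimum(version))
    print("Socket accessible:", socket_accessible())
```

Run it as the service user, not as root, so that the socket check reflects the permissions the service will actually have.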

4. Network & firewall requirements

The service is outbound‑only. No public inbound ports are required.

If a custom CA certificate is required to connect to any external services, it should be pre-installed on the node.
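
One way to confirm that a custom CA is installed correctly is to attempt a verified TLS handshake against the internal service from the node. The sketch below uses Python's default SSL context, which on a typical Ubuntu install relies on the system trust store. That is an assumption about the local Python build, and the function names are hypothetical.

```python
import socket
import ssl
import sys


def system_context() -> ssl.SSLContext:
    """Default context: certificate verification and hostname checking enabled."""
    return ssl.create_default_context()


def tls_verifies(host: str, port: int = 443, timeout: float = 5.0) -> tuple:
    """Return (ok, detail) for a verified TLS handshake with host:port."""
    ctx = system_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return True, tls.version() or "unknown"
    except ssl.SSLCertVerificationError as exc:
        # The server certificate does not chain to a CA in the trust store.
        return False, f"certificate not trusted: {exc.verify_message}"
    except OSError as exc:
        # DNS failure, refused connection, timeout, or another TLS error.
        return False, f"connection failed: {exc}"


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("usage: check_ca.py HOST [PORT]")
    else:
        port = int(sys.argv[2]) if len(sys.argv) > 2 else 443
        ok, detail = tls_verifies(sys.argv[1], port)
        print("OK" if ok else "FAILED", "-", detail)
```

A "certificate not trusted" result for an internal host usually means the CA has not yet been added to the node's trust store.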

4.1 Outbound (required)
  • 443/TCP — HTTPS to the following destinations:
    • Milestone ingestion endpoints (results upload and analytics):
      • upload.mstone.ai — Milestone upload endpoint.
      • milestone-data-collection.s3.us-east-1.amazonaws.com — Milestone-managed S3 bucket for collected data.
    • Git provider APIs and Git-over-HTTPS — whichever applies to your environment:
      • GitHub / GitHub Enterprise: api.github.com, github.com, or your GitHub Enterprise Server host (e.g. github.yourcompany.com).
      • GitLab / GitLab Self-Managed: gitlab.com, or your self-managed host (e.g. gitlab.yourcompany.com).
      • Bitbucket / Bitbucket Data Center: api.bitbucket.org, bitbucket.org, or your Bitbucket Data Center host (e.g. bitbucket.yourcompany.com).
      • Azure DevOps / Azure Repos: dev.azure.com, *.visualstudio.com, vssps.dev.azure.com (auth), and *.dev.azure.com for org-scoped endpoints.
    • Project management platforms — e.g. *.atlassian.net, or your Jira Enterprise / Data Center host if applicable.
    • GenAI tool endpoints you want telemetry from — any subset of:
      • cursor.com
      • api.anthropic.com and claude.ai
      • api.github.com/copilot (path on api.github.com — already covered if GitHub is allow-listed above)
      • api.openai.com
    • Container registry — docker.io (Docker Hub) for pulling Milestone images, or a customer-owned/internal registry acting as a pull-through proxy or intermediary mirror to Docker Hub.
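
Firewall rules for the destinations above can be spot-checked with a direct TCP connection test from the node. This is a minimal sketch under a few assumptions: it lists only some of the concrete hostnames (wildcard entries cannot be probed directly), it bypasses any HTTP proxy, and a successful TCP connection does not prove that the HTTPS request itself would be allowed. The helper names are hypothetical.

```python
import socket

# A subset of the concrete hosts listed above; extend it to match your environment.
REQUIRED_TARGETS = [
    ("upload.mstone.ai", 443),
    ("milestone-data-collection.s3.us-east-1.amazonaws.com", 443),
    ("api.github.com", 443),
    ("github.com", 443),
]


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """True when a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


def report(targets, timeout: float = 3.0) -> dict:
    """Map 'host:port' to reachability for every target."""
    return {f"{host}:{port}": can_connect(host, port, timeout) for host, port in targets}


if __name__ == "__main__":
    for target, ok in report(REQUIRED_TARGETS).items():
        print(f"{'reachable  ' if ok else 'UNREACHABLE'}  {target}")
```

In environments where all egress goes through a corporate proxy, direct connections are expected to fail. Test through the proxy instead.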
4.2 Outbound (optional)
  • 22/TCP — SFTP upload to private SFTP storage (if used).
  • 80/TCP — HTTP to on-prem/legacy endpoints (if applicable).
  • 8080/TCP or 3128/TCP — Proxy egress (corporate environments).
  • 123/UDP — NTP for time sync (recommended).
  • 53/TCP+UDP — DNS resolution (typically already permitted at the host/OS level).
4.3 Destinations
  • Git providers: API + Git over HTTPS (FQDNs per organization policy)
  • Object storage: S3 or S3‑compatible endpoint over HTTPS
  • Container registry
  • Proxy: corporate egress proxy where applicable
  • DNS & NTP: organization‑approved resolvers/time sources

Granting PM tool & Git access

We need access to your PM tool and Git provider to start the integration process.

PM tool & Git access

Please provide us with the following information for PM provider & Git Provider access:

  1. URL of your PM tool and of your Git provider.
  2. Username for each service.
  3. Password or access token for each service.

Note: We only require read-only access.
If you have questions about which specific permissions to grant, please contact us.