Integration

Connect Matcha to the infrastructure signals you already use.

Bring together GPU telemetry, workload traces, cluster metadata, and cost data without replacing your existing observability stack.

GPU Telemetry

Workload Traces

Cluster Metadata

Observability

Cost & Export

NVIDIA DCGM

Collect GPU power, utilization, memory, temperature, and health metrics.

Kubernetes

Map pods, jobs, namespaces, nodes, and scheduling context to GPU energy.

Slurm

Connect training jobs, allocations, users, and cluster scheduling metadata.

PyTorch

Attach training runs, steps, duration, model metadata, and experiment context.

vLLM

Track inference requests, batches, latency, tokens, and serving behavior.

OpenTelemetry

Use traces, spans, and service events to connect workloads with infrastructure signals.

Prometheus

Scrape, store, and export attributed energy metrics.

Grafana

Visualize energy, cost, utilization, and workload attribution in dashboards.

Datadog

Send energy insights to your existing infrastructure monitoring and logging.

Hugging Face

Connect model, fine-tuning, and experiment metadata to energy usage.

Amazon S3

Store telemetry, traces, and reports for downstream workflows.

NVML

Access low-level NVIDIA GPU telemetry for per-device energy monitoring.

Bring workload-level energy visibility to your AI infrastructure.

We’re working with early AI infrastructure teams, GPU operators, and enterprises running training or inference workloads.
