# Reliability Reports

> On-demand OTel collector health reports for your clusters.

Reliability Reports give you a snapshot of your OTel collector's health, pipeline efficiency, and telemetry quality at a point in time. They're useful for:

* Weekly engineering health reviews
* Investigating whether the collector is dropping data
* Sharing collector status with stakeholders who don't have Obsy access

To open them, go to **Reliability Reports** in the sidebar.

***

## Generating a report

1. Click **Generate report**.
2. Select the **cluster** and the **observability platform** to read metrics from.
3. Click **Generate**.

Obsy generates the report by querying the selected platform (Datadog or Grafana) for OTel collector metrics. Each report covers the **last 24 hours**.

***

## Report sections

### Collector health summary

An overall status (Healthy / Degraded / Critical) based on error rates and queue depth.

### Ingestion metrics

| Metric                      | Description                             |
| --------------------------- | --------------------------------------- |
| Spans accepted              | Total spans received by the gateway     |
| Spans rejected              | Spans dropped by the memory limiter     |
| Spans exported              | Spans successfully sent to the platform |
| Metrics accepted / exported | Metric data points received / exported  |
| Logs accepted / exported    | Log records received / exported         |

### Export health

Shows the ratio of sent to failed exports for each signal type (spans, metrics, logs). A high failure ratio indicates a problem with the exporter config or the upstream platform's ingestion endpoint.
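
To make the ratio concrete, here is a minimal sketch of how a failure ratio could be computed from sent and failed counts. The function name and the sample counts are hypothetical. In a typical collector the inputs would come from metrics such as `otelcol_exporter_sent_spans` and `otelcol_exporter_send_failed_spans`, but the exact queries Obsy runs are an assumption here.

```python
def export_failure_ratio(sent: int, failed: int) -> float:
    """Return the fraction of export attempts that failed.

    Returns 0.0 when nothing was exported, so idle signals
    are not reported as failing.
    """
    total = sent + failed
    if total == 0:
        return 0.0
    return failed / total


# Hypothetical 24-hour counts per signal type (not real report data).
counts = {
    "spans": {"sent": 9_900, "failed": 100},
    "metrics": {"sent": 5_000, "failed": 0},
    "logs": {"sent": 0, "failed": 0},
}

ratios = {
    signal: export_failure_ratio(c["sent"], c["failed"])
    for signal, c in counts.items()
}

for signal, ratio in ratios.items():
    print(f"{signal}: {ratio:.2%} of exports failed")
```

Dividing by the total number of attempts (sent plus failed) keeps the result between 0 and 1, which makes it easy to compare across signal types with very different volumes.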

### Resource usage

| Metric      | Description                                         |
| ----------- | --------------------------------------------------- |
| Memory RSS  | Gateway process memory usage                        |
| CPU seconds | Gateway CPU time                                    |
| Queue size  | Current number of items waiting in the export queue |

A consistently full queue (near the configured queue capacity) means the exporter can't keep up with ingestion. Consider scaling the gateway or lowering the sampling rate.
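
As a rough illustration of what "consistently full" could mean, the sketch below compares a series of queue size samples against the configured capacity. The function names, the sample values, and the 90% threshold are all hypothetical and are not taken from Obsy. In a typical collector the inputs would correspond to `otelcol_exporter_queue_size` and `otelcol_exporter_queue_capacity`.

```python
from typing import Sequence


def queue_utilization(queue_size: int, queue_capacity: int) -> float:
    """Return the queue fill level as a fraction of its capacity."""
    if queue_capacity <= 0:
        raise ValueError("queue_capacity must be positive")
    return queue_size / queue_capacity


def is_consistently_full(
    samples: Sequence[int],
    queue_capacity: int,
    threshold: float = 0.9,  # hypothetical cutoff, not an Obsy default
) -> bool:
    """Return True if every sample is at or above the threshold."""
    if not samples:
        return False
    return all(
        queue_utilization(size, queue_capacity) >= threshold
        for size in samples
    )


# Hypothetical queue size samples for a queue with capacity 1000.
print(is_consistently_full([950, 980, 1000], 1000))  # every sample >= 90%
print(is_consistently_full([950, 100, 1000], 1000))  # one sample dips low
```

Requiring every sample to exceed the threshold avoids flagging a short burst as a capacity problem. A stricter or looser rule may suit your traffic pattern better.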

### Golden signal coverage

For each golden signal (latency, traffic, errors, saturation), this section shows whether the collector is receiving and forwarding data for it. A missing signal usually means your services aren't instrumented for it.

***

## Viewing past reports

Generated reports are saved to the **Reliability Reports** list. Click any report to view it, or copy its permalink to share it with stakeholders.

***

## Datadog metric naming

Obsy queries Datadog using the native OTel metric names (without the `_total` suffix):

```
otelcol_receiver_accepted_spans        ✓ correct
otelcol_receiver_accepted_spans_total  ✗ not used (Prometheus naming)
```

The unsuffixed names are the right ones because the Datadog OTel exporter strips the `_total` suffix when converting OTLP monotonic sums to Datadog counters. If you're building your own Datadog queries for OTel metrics, use the names without `_total`.
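
If you maintain dashboards or scripts that still reference Prometheus-style names, a small helper can convert them before you query Datadog. This is a minimal sketch: the function name is hypothetical, and it only handles the `_total` suffix described above.

```python
def to_datadog_metric_name(prometheus_name: str) -> str:
    """Drop a trailing `_total` suffix, leaving other names unchanged."""
    suffix = "_total"
    if prometheus_name.endswith(suffix):
        return prometheus_name[: -len(suffix)]
    return prometheus_name


print(to_datadog_metric_name("otelcol_receiver_accepted_spans_total"))
print(to_datadog_metric_name("otelcol_receiver_accepted_spans"))
```

Both calls print `otelcol_receiver_accepted_spans`, since a name without the suffix passes through untouched.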
