Metrics Monitoring

Track and visualize numeric metrics alongside your check-ins

Overview

Telemetry.host allows you to send numeric metrics alongside your check-ins. These metrics are stored, tracked over time, and displayed in interactive timeline charts on your monitor’s detail page.

Use metrics to track:

  • System health (CPU load, memory usage, disk space)
  • Application performance (response times, queue sizes)
  • Business metrics (processed items, error counts)
  • Custom values specific to your use case

Sending Metrics

Include a metrics object in your check-in JSON payload. Each key-value pair should have a string key and a numeric value (integer or float).

Basic Example

curl -X POST https://telemetry.host/ping/{YOUR_MONITOR_ID} \
  -H "Content-Type: application/json" \
  -d '{
    "status": "success",
    "metrics": {
      "cpu_load": 45.2,
      "memory_mb": 2048,
      "disk_free_percent": 78.5
    }
  }'

Complete Example with All Fields

curl -X POST https://telemetry.host/ping/{YOUR_MONITOR_ID} \
  -H "Content-Type: application/json" \
  -d '{
    "status": "success",
    "message": "Health check completed",
    "duration": 150,
    "metrics": {
      "cpu_percent": 65.5,
      "memory_used_mb": 4096,
      "memory_free_mb": 12288,
      "disk_used_gb": 250,
      "disk_free_gb": 500,
      "load_average_1m": 2.5,
      "active_connections": 150,
      "requests_per_second": 1250.5
    }
  }'

Metrics Format

Supported Value Types

  • Integers: 42, 1024, 0
  • Floats: 45.2, 99.99, 0.001
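
Since only integers and floats are listed as supported, it can help to check a metrics dictionary on the client before sending it. The following is a minimal sketch, not part of any Telemetry.host SDK: the name validate_metrics is hypothetical, and rejecting booleans, NaN, and infinity is an assumption (Python treats bool as a subclass of int, and standard JSON cannot encode NaN or infinity).

```python
import math


def validate_metrics(metrics):
    """Return a list of problems found in a metrics dict (empty if none).

    Hypothetical client-side helper; the service's own validation rules
    may differ.
    """
    problems = []
    for name, value in metrics.items():
        if not isinstance(name, str):
            problems.append(f"{name!r}: key must be a string")
            continue
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: value must be an integer or float")
        elif isinstance(value, float) and not math.isfinite(value):
            # NaN and infinity cannot be encoded as standard JSON numbers.
            problems.append(f"{name}: value must be finite")
    return problems
```

Calling validate_metrics({"cpu_load": 45.2, "memory_mb": 2048}) returns an empty list, while a string value such as "high" is reported as a problem.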

Naming Conventions

Use descriptive, snake_case names for your metrics:

{
  "metrics": {
    "cpu_load_percent": 45.2,
    "memory_used_mb": 2048,
    "disk_free_gb": 150,
    "http_requests_total": 125000,
    "error_count": 0,
    "queue_depth": 42
  }
}
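
If metric names come from several scripts, a small check can keep them in snake_case. This is a sketch under an assumed definition of snake_case (lowercase letters and digits in underscore-separated words, starting with a letter); the helper names is_snake_case and to_snake_case are hypothetical.

```python
import re

# Assumed definition of snake_case: lowercase letters and digits in
# underscore-separated words, starting with a letter.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(?:_[a-z0-9]+)*")


def is_snake_case(name):
    """Return True if name matches the assumed snake_case pattern."""
    return SNAKE_CASE.fullmatch(name) is not None


def to_snake_case(name):
    """Best-effort conversion of names like 'cpuLoad' or 'Disk Free GB'."""
    # Insert an underscore at lower-to-upper boundaries (cpuLoad -> cpu_Load).
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Replace runs of non-alphanumeric characters with one underscore.
    name = re.sub(r"[^A-Za-z0-9]+", "_", name)
    return name.strip("_").lower()
```

For example, to_snake_case("cpuLoad") gives "cpu_load", and is_snake_case("http_requests_total") is True.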

Maximum Metrics per Check-in

You can send multiple metrics in a single check-in. For best performance and readable charts, we recommend limiting each check-in to 10-15 metrics.
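
A script that builds its metrics dynamically could guard against exceeding this recommendation. The sketch below is illustrative only: the constant and function names are hypothetical, and 15 is simply the upper end of the recommended range, not a limit known to be enforced by the service.

```python
RECOMMENDED_MAX_METRICS = 15  # upper end of the recommended 10-15 range


def metric_count_warning(metrics, limit=RECOMMENDED_MAX_METRICS):
    """Return a warning string if metrics has more than limit entries, else None."""
    count = len(metrics)
    if count > limit:
        return f"{count} metrics in one check-in exceeds the recommended {limit}"
    return None
```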

Viewing Metrics

Metrics are displayed on the monitor detail page in an interactive timeline chart. Features include:

  • Synchronized hover: Hovering over any point shows values across all metrics at that time
  • Color-coded lines: Each metric has a distinct color for easy identification
  • Summary statistics: Shows latest value, min, max, and average for each metric
  • Collapsible panel: Hide/show the metrics section as needed
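
To sanity-check the summary statistics against your own data, the same four figures can be computed locally. This is only a sketch of the arithmetic; how the chart itself calculates them is not documented here, and the function name summarize is hypothetical.

```python
def summarize(values):
    """Return latest, min, max, and average for a chronological list of values."""
    if not values:
        raise ValueError("values must not be empty")
    return {
        "latest": values[-1],            # most recent reading
        "min": min(values),
        "max": max(values),
        "average": sum(values) / len(values),
    }
```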

Common Use Cases

System Health Monitoring

#!/bin/bash
# system-metrics.sh - Send system metrics to Telemetry.host

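# Field 2 of top's "Cpu(s)" line is user-space CPU on current procps-ng;
# the layout varies between top versions, so verify the output on your system.
CPU=$(top -bn1 | grep "Cpu(s)" | awk '{print $2}')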
MEM_USED=$(free -m | awk 'NR==2{print $3}')
MEM_FREE=$(free -m | awk 'NR==2{print $4}')
DISK_USED=$(df -BG / | awk 'NR==2{print $3}' | tr -d 'G')
DISK_FREE=$(df -BG / | awk 'NR==2{print $4}' | tr -d 'G')
LOAD=$(uptime | awk -F'load average:' '{print $2}' | cut -d, -f1 | tr -d ' ')

curl -X POST https://telemetry.host/ping/{YOUR_MONITOR_ID} \
  -H "Content-Type: application/json" \
  -d "{
    \"status\": \"success\",
    \"metrics\": {
      \"cpu_percent\": $CPU,
      \"memory_used_mb\": $MEM_USED,
      \"memory_free_mb\": $MEM_FREE,
      \"disk_used_gb\": $DISK_USED,
      \"disk_free_gb\": $DISK_FREE,
      \"load_average\": $LOAD
    }
  }"

Database Backup with Size Tracking

#!/bin/bash
# backup-with-metrics.sh

START_TIME=$(date +%s)
BACKUP_FILE="/backups/db-$(date +%Y%m%d).sql.gz"

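# Run backup. With pipefail, the pipeline's exit status reflects a
# pg_dump failure; without it, $? would only report gzip's status.
set -o pipefail
pg_dump mydb | gzip > "$BACKUP_FILE"
BACKUP_STATUS=$?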

END_TIME=$(date +%s)
DURATION=$((END_TIME - START_TIME))
BACKUP_SIZE=$(stat -f%z "$BACKUP_FILE" 2>/dev/null || stat -c%s "$BACKUP_FILE")
BACKUP_SIZE_MB=$((BACKUP_SIZE / 1024 / 1024))

if [ $BACKUP_STATUS -eq 0 ]; then
  STATUS="success"
else
  STATUS="error"
fi

curl -X POST https://telemetry.host/ping/{YOUR_MONITOR_ID} \
  -H "Content-Type: application/json" \
  -d "{
    \"status\": \"$STATUS\",
    \"duration\": $DURATION,
    \"metrics\": {
      \"backup_size_mb\": $BACKUP_SIZE_MB,
      \"duration_seconds\": $DURATION
    }
  }"

Python Application Metrics

# Third-party packages: pip install requests psutil
import requests
import psutil

def send_metrics():
    """Send application metrics to Telemetry.host."""
    url = "https://telemetry.host/ping/{YOUR_MONITOR_ID}"
    
    # Gather metrics
    cpu = psutil.cpu_percent(interval=1)
    memory = psutil.virtual_memory()
    disk = psutil.disk_usage('/')
    
    data = {
        "status": "success",
        "metrics": {
            "cpu_percent": cpu,
            "memory_used_mb": memory.used // (1024 * 1024),
            "memory_available_mb": memory.available // (1024 * 1024),
            "memory_percent": memory.percent,
            "disk_used_gb": disk.used // (1024 ** 3),
            "disk_free_gb": disk.free // (1024 ** 3),
            "disk_percent": disk.percent
        }
    }
    
    try:
        response = requests.post(url, json=data, timeout=10)
        response.raise_for_status()
    except requests.RequestException as e:
        print(f"Failed to send metrics: {e}")

if __name__ == "__main__":
    send_metrics()

Docker Container Metrics

#!/bin/bash
# docker-metrics.sh - Monitor Docker container stats

CONTAINER_NAME="my-app"

# Get container stats (one snapshot)
STATS=$(docker stats --no-stream --format "{{.CPUPerc}},{{.MemUsage}}" "$CONTAINER_NAME")
CPU=$(echo "$STATS" | cut -d',' -f1 | tr -d '%')
MEM_RAW=$(echo "$STATS" | cut -d',' -f2 | cut -d'/' -f1)
# Convert to MiB; only the MiB and GiB units are handled here
MEM_MB=$(echo "$MEM_RAW" | sed 's/MiB//' | sed 's/GiB/*1024/' | bc)

# Get container count
RUNNING_CONTAINERS=$(docker ps -q | wc -l)

curl -X POST https://telemetry.host/ping/{YOUR_MONITOR_ID} \
  -H "Content-Type: application/json" \
  -d "{
    \"status\": \"success\",
    \"metrics\": {
      \"container_cpu_percent\": $CPU,
      \"container_memory_mb\": $MEM_MB,
      \"running_containers\": $RUNNING_CONTAINERS
    }
  }"

Queue Processing Metrics

import requests
import redis

def send_queue_metrics():
    """Monitor Redis queue depth and processing rate."""
    r = redis.Redis(host='localhost', port=6379)
    
    queue_size = r.llen('task_queue')
    processed_today = int(r.get('processed_today') or 0)
    failed_today = int(r.get('failed_today') or 0)
    
    data = {
        "status": "success",
        "metrics": {
            "queue_depth": queue_size,
            "processed_today": processed_today,
            "failed_today": failed_today,
            "success_rate": (processed_today / (processed_today + failed_today) * 100) 
                           if (processed_today + failed_today) > 0 else 100
        }
    }
    
    # A timeout keeps the job from hanging if the endpoint is unreachable
    requests.post("https://telemetry.host/ping/{YOUR_MONITOR_ID}", json=data, timeout=10)

Best Practices

1. Consistent Metric Names

Use the same metric names across all check-ins for proper timeline visualization:

// Good - consistent naming
{"metrics": {"cpu_percent": 45}}
{"metrics": {"cpu_percent": 52}}

// Bad - inconsistent naming (creates separate metrics)
{"metrics": {"cpu_percent": 45}}
{"metrics": {"cpu": 52}}
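
One way to keep names consistent when several scripts report to the same monitor is to rename known variants before sending. The sketch below assumes a hand-maintained alias table; METRIC_ALIASES, its entries, and canonicalize are hypothetical names for illustration.

```python
# Hypothetical alias table: maps names used by older scripts to the one
# canonical name that every check-in should use.
METRIC_ALIASES = {
    "cpu": "cpu_percent",
    "cpu_pct": "cpu_percent",
    "mem_mb": "memory_used_mb",
}


def canonicalize(metrics, aliases=METRIC_ALIASES):
    """Return a copy of metrics with aliased keys renamed to canonical names."""
    result = {}
    for name, value in metrics.items():
        canonical = aliases.get(name, name)
        if canonical in result:
            # Two input keys map to the same name; refuse to pick one silently.
            raise ValueError(f"duplicate metric after renaming: {canonical}")
        result[canonical] = value
    return result
```

With this table, canonicalize({"cpu": 52}) yields {"cpu_percent": 52}, so both check-ins in the example above would land on the same chart line.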

2. Use Appropriate Units

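Include units in metric names for clarity. The // annotations below are explanatory only; JSON has no comment syntax, so omit them from real payloads: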

{
  "metrics": {
    "memory_mb": 2048,        // Clear it's megabytes
    "disk_gb": 500,           // Clear it's gigabytes
    "duration_seconds": 120,   // Clear it's seconds
    "rate_per_second": 1500   // Clear it's a rate
  }
}

3. Track Deltas for Counters

A counter that only ever increases draws a steadily rising line that hides changes in activity, so consider sending deltas or rates instead:

{
  "metrics": {
    "requests_per_minute": 1500,  // Rate (better for visualization)
    "errors_last_hour": 5         // Delta (easier to spot issues)
  }
}
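
If your application only exposes a cumulative counter, the delta and rate can be derived between check-ins. The class below is a minimal sketch: CounterDelta is a hypothetical name, state is kept in memory (a cron-style script would need to persist the last reading itself), and treating a drop in the counter as a reset is an assumption.

```python
class CounterDelta:
    """Convert a cumulative counter into per-interval deltas and rates."""

    def __init__(self):
        self._last_value = None
        self._last_time = None

    def update(self, value, timestamp):
        """Record a reading; return (delta, rate_per_second), or None on first call.

        timestamp is in seconds (for example, from time.time()).
        """
        previous_value, previous_time = self._last_value, self._last_time
        self._last_value, self._last_time = value, timestamp
        if previous_value is None:
            return None  # no earlier reading to compare against
        # Assumption: a drop in the counter means it was reset (for example,
        # the process restarted), so the new value is the whole delta.
        delta = value if value < previous_value else value - previous_value
        elapsed = timestamp - previous_time
        rate = delta / elapsed if elapsed > 0 else 0.0
        return delta, rate
```

Multiplying the returned per-second rate by 60 gives a value suitable for a metric such as requests_per_minute.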

4. Set Up Alerts

The metrics charts are read-only and do not raise alerts on their own, but alerts based on check-in status still apply. To alert on a metric, have your script set status to "error" when the value exceeds a threshold:

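# get_cpu_percent is a placeholder: substitute your own command or
# function that prints the current CPU usage as a number
CPU=$(get_cpu_percent)
if [ "$(echo "$CPU > 90" | bc)" -eq 1 ]; then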
  STATUS="error"
  MESSAGE="High CPU usage: ${CPU}%"
else
  STATUS="success"
  MESSAGE="Normal operation"
fi

curl -X POST https://telemetry.host/ping/{YOUR_MONITOR_ID} \
  -H "Content-Type: application/json" \
  -d "{\"status\": \"$STATUS\", \"message\": \"$MESSAGE\", \"metrics\": {\"cpu_percent\": $CPU}}"
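
The same threshold logic can be written in Python when several metrics need checking at once. This is a sketch under stated assumptions: the THRESHOLDS table and its values are hypothetical, build_checkin is an illustrative name, and the payload uses only the status, message, and metrics fields shown earlier on this page.

```python
# Hypothetical thresholds: metric name -> maximum acceptable value.
THRESHOLDS = {"cpu_percent": 90, "disk_used_percent": 95}


def build_checkin(metrics, thresholds=THRESHOLDS):
    """Build a check-in payload whose status reflects threshold breaches."""
    breaches = [
        f"{name}={metrics[name]} exceeds {limit}"
        for name, limit in thresholds.items()
        if name in metrics and metrics[name] > limit
    ]
    return {
        "status": "error" if breaches else "success",
        "message": "; ".join(breaches) if breaches else "Normal operation",
        "metrics": metrics,
    }
```

The returned dictionary can be passed as the json argument of requests.post, in the same way as the Python examples above.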

Next Steps