Cron Job Monitoring: Best Practices for 2025
Learn how to monitor cron jobs effectively and avoid the most common pitfalls that lead to silent failures
Cron jobs are the backbone of automated system maintenance, but they’re also notoriously prone to silent failures. A backup script that hasn’t run in weeks, a cleanup job that’s been failing for months, a critical report that never gets generated: these scenarios are all too common.
In this post, we’ll explore best practices for monitoring cron jobs in 2025 and show you how to catch failures before they become disasters.
The Problem with Cron
Cron is great at scheduling tasks, but terrible at telling you when things go wrong. By default:
- Failed jobs send email (that nobody reads)
- Output goes to /dev/null (lost forever)
- Exit codes are ignored
- No centralized visibility
This leads to silent failures: your cron jobs stop working, but you don’t find out until it’s too late.
Best Practice #1: Always Capture Output
Never discard your script output:
# ❌ Bad: Output lost
0 2 * * * /path/to/backup.sh > /dev/null 2>&1
# ✅ Good: Output captured for debugging
0 2 * * * /path/to/backup.sh 2>&1 | curl -X POST https://telemetry.host/ping/YOUR_ID \
-H "Content-Type: text/plain" --data-binary @-
When something fails, you’ll have the full context to debug it.
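One caveat with piping a job straight into curl: the pipeline's exit status is curl's, not the script's, so a failed job can look like a success to cron. The sketch below, assuming bash, saves the output to a log file and returns the job's own exit status. `run_captured` is a hypothetical helper name, and the paths in the usage comment are placeholders.

```shell
#!/bin/bash
# Hypothetical helper: run a command, save stdout and stderr to a log file,
# and return the command's own exit status.
run_captured() {
    local log_file=$1
    shift
    local status=0
    "$@" >"$log_file" 2>&1 || status=$?
    return "$status"
}

# Usage sketch (paths are placeholders):
#   run_captured /tmp/backup.log /path/to/backup.sh
#   echo "backup.sh exited with $?"
```

The log file can then be sent to the monitor in a separate step, with the exit status still available to decide what to report.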
Best Practice #2: Check Exit Codes
Always verify your script succeeded:
#!/bin/bash
set -e # Exit on any error
# Your commands here
pg_dump mydb > backup.sql
gzip backup.sql
# If we reach here, everything succeeded
echo "Backup completed successfully"
exit 0
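The set -e option makes the script exit as soon as a command fails, so cron sees a non-zero exit code instead of a script that carries on after an error. It does not catch failures inside a pipeline unless you also add set -o pipefail.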
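Pipelines are the usual blind spot here: by default bash reports only the exit status of the last command in a pipeline. The sketch below, assuming bash, shows the difference `pipefail` makes. `pipeline_status` is a hypothetical demo function, not part of any tool.

```shell
#!/bin/bash
# Print the exit status of the pipeline `false | true` with pipefail on or off.
# The body runs in a subshell so the option change does not leak out.
pipeline_status() {
    local mode=$1   # "on" or "off"
    (
        set +e
        if [ "$mode" = "on" ]; then
            set -o pipefail
        else
            set +o pipefail
        fi
        false | true
        echo $?
    )
}

echo "without pipefail: $(pipeline_status off)"   # → without pipefail: 0
echo "with pipefail: $(pipeline_status on)"       # → with pipefail: 1
```

This matters for commands like `pg_dump mydb | gzip > backup.sql.gz`, where a failed dump would otherwise be hidden by a successful gzip.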
Best Practice #3: Use Meaningful Timeouts
Set realistic timeouts that account for occasional delays:
# Daily backup at 2 AM
# Use a 25-26 hour timeout (not exactly 24h)
# This allows for occasional delays without false alarms
https://telemetry.host/ping/PROJECT_KEY/timeout/26h/daily-backup?create=1
Too aggressive? False positives. Too loose? Real failures go unnoticed.
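One way to keep timeouts consistent is to derive them from the job's interval plus an explicit grace period. This is a minimal sketch; `timeout_with_grace` is a hypothetical helper. Only the `26h` form appears in the example URL above, so whether the service accepts other values such as `174h` is an assumption.

```shell
#!/bin/bash
# Hypothetical helper: expected interval plus a grace period, formatted like
# the "26h" segment in the example URL above.
timeout_with_grace() {
    local interval_hours=$1
    local grace_hours=$2
    echo "$((interval_hours + grace_hours))h"
}

timeout_with_grace 24 2    # daily job, 2h grace  → 26h
timeout_with_grace 168 6   # weekly job, 6h grace → 174h
```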
Best Practice #4: Test Your Monitoring
Before deploying, test both success and failure scenarios:
# Test success
./backup.sh && echo "✅ Success case works"
# Test failure (|| runs the echo only when the script exits non-zero)
./backup.sh --force-error || echo "✅ Failure case detected"
Verify you receive notifications for failures.
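A small pair of helpers can make both checks explicit and easy to repeat. This is a sketch assuming bash; `expect_success` and `expect_failure` are hypothetical names.

```shell
#!/bin/bash
# Hypothetical helpers: assert that a command succeeds or fails.
expect_success() {
    if "$@" >/dev/null 2>&1; then
        echo "PASS: '$*' succeeded"
    else
        echo "FAIL: '$*' exited with $?"
        return 1
    fi
}

expect_failure() {
    if "$@" >/dev/null 2>&1; then
        echo "FAIL: '$*' succeeded but should have failed"
        return 1
    else
        echo "PASS: '$*' failed with exit code $?"
    fi
}

expect_success true     # → PASS: 'true' succeeded
expect_failure false    # → PASS: 'false' failed with exit code 1
```

In practice you would call them as `expect_success ./backup.sh` and `expect_failure ./backup.sh --force-error`, then confirm that the failure also produced a notification.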
Best Practice #5: Include Context in Reports
Don’t just report “success” or “failure”; include actionable information:
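# ❌ Bad: No context
echo "Done" | curl -X POST "$MONITOR_URL" --data-binary @-
# ✅ Good: Actionable information
# (--data-binary @- makes curl read the request body from stdin)
echo "Backup completed: 2.5GB in 120 seconds, 15 tables backed up" | \
curl -X POST "$MONITOR_URL" -H "Content-Type: text/plain" --data-binary @-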
When debugging at 3 AM, you’ll thank yourself for the extra details.
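Rather than hard-coding the numbers, the report can be assembled from measured values. This is a minimal sketch; `build_report` is a hypothetical helper, and the fields it prints are only a suggestion.

```shell
#!/bin/bash
# Hypothetical helper: build a multi-line report for a finished backup.
# Arguments: backup file path, start time and end time in epoch seconds.
build_report() {
    local file=$1 start=$2 end=$3
    local bytes
    bytes=$(wc -c <"$file" | tr -d ' ')
    echo "Host: $(uname -n)"
    echo "File: $file"
    echo "Size: ${bytes} bytes"
    echo "Duration: $((end - start))s"
}

# Usage sketch, sending the report as the request body:
#   build_report "$BACKUP_FILE" "$START_TIME" "$END_TIME" |
#       curl -X POST "$MONITOR_URL" -H "Content-Type: text/plain" --data-binary @-
```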
Best Practice #6: Monitor Critical Dependencies
If your cron job depends on external services, monitor those too:
#!/bin/bash
# Check prerequisites
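if ! pg_isready -q; then
    # --data-binary @- makes curl send the piped message as the request body
    echo "Database not available" | curl -X POST "$MONITOR_URL" --data-binary @-
    exit 1
fi
if [ ! -d "/backups" ]; then
    echo "Backup directory missing" | curl -X POST "$MONITOR_URL" --data-binary @-
    exit 1
fi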
# Proceed with backup...
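When a job has several prerequisites, it can help to report all of the missing ones in a single run instead of stopping at the first. This is a sketch assuming bash; `check_prereqs` is a hypothetical helper that takes the backup directory followed by the commands the job needs.

```shell
#!/bin/bash
# Hypothetical helper: check several prerequisites and list every problem
# found. Returns non-zero if anything is missing.
check_prereqs() {
    local backup_dir=$1
    shift
    local problems=0 cmd
    if [ ! -d "$backup_dir" ]; then
        echo "Backup directory missing: $backup_dir"
        problems=$((problems + 1))
    fi
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "Required command not found: $cmd"
            problems=$((problems + 1))
        fi
    done
    [ "$problems" -eq 0 ]
}

# Usage sketch:
#   check_prereqs /backups pg_dump gzip || exit 1
```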
Best Practice #7: Use Auto-Provisioning
Define monitors in your scripts using auto-provisioning URLs:
# This URL will create the monitor if it doesn't exist
MONITOR_URL="https://telemetry.host/ping/PROJECT_KEY/timeout/26h/db-backup?create=1"
# Now deploy this script anywhere; monitoring is automatic
./backup.sh 2>&1 | curl -X POST "$MONITOR_URL" \
-H "Content-Type: text/plain" --data-binary @-
Perfect for infrastructure-as-code and dynamic environments.
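If many scripts share the same project key, building the URL in one place avoids typos. The sketch below follows the URL pattern used in this post's examples; `monitor_url` is a hypothetical helper, and the pattern itself is taken from those examples rather than from any API reference.

```shell
#!/bin/bash
# Build an auto-provisioning URL following the pattern used in this post:
#   https://telemetry.host/ping/<key>/timeout/<timeout>/<name>?create=1
monitor_url() {
    local key=$1 timeout=$2 name=$3
    echo "https://telemetry.host/ping/${key}/timeout/${timeout}/${name}?create=1"
}

monitor_url PROJECT_KEY 26h db-backup
# → https://telemetry.host/ping/PROJECT_KEY/timeout/26h/db-backup?create=1
```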
Best Practice #8: Separate Concerns
For complex jobs, monitor each critical step separately:
# Stop at the first failing step so no false "success" is reported;
# pipefail keeps a pg_dump failure from being masked by gzip
set -eo pipefail
# Monitor backup creation
pg_dump mydb | gzip > backup.sql.gz
curl -X POST "$BACKUP_MONITOR" -d '{"status":"success"}'
# Monitor backup upload
aws s3 cp backup.sql.gz s3://backups/
curl -X POST "$UPLOAD_MONITOR" -d '{"status":"success"}'
# Monitor backup verification
gunzip -t backup.sql.gz
curl -X POST "$VERIFY_MONITOR" -d '{"status":"success"}'
This helps pinpoint exactly where failures occur.
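The same idea can be wrapped in a helper that names the failing step in the job's own output. This is a sketch assuming bash; `run_step` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: run one named step; on failure, say which step failed
# and return that step's exit status.
run_step() {
    local name=$1
    shift
    local status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "Step '$name' failed with exit code $status" >&2
    fi
    return "$status"
}

# Usage sketch: chain steps with && so a failure stops the job.
#   run_step verify gunzip -t backup.sql.gz &&
#   run_step upload aws s3 cp backup.sql.gz s3://backups/
```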
Best Practice #9: Plan for Maintenance
Account for scheduled downtime in your monitoring:
# Use wider timeout during maintenance windows
# Normal: 25 hours
# During maintenance: 50 hours
if [ -f /etc/maintenance-mode ]; then
TIMEOUT="50h"
else
TIMEOUT="25h"
fi
curl -X POST "https://telemetry.host/ping/KEY/timeout/$TIMEOUT/backup"
Best Practice #10: Review Monitoring Regularly
Set a recurring task to review your monitors:
- Are timeouts still appropriate?
- Are false positives occurring?
- Are notifications reaching the right people?
- Are old monitors still needed?
Real-World Example
Here’s a production-ready backup script incorporating these practices:
#!/bin/bash
# production-backup.sh
set -euo pipefail
# Configuration
DB_NAME="production"
BACKUP_DIR="/backups"
MONITOR_URL="https://telemetry.host/ping/KEY/timeout/26h/prod-backup?create=1"
RETENTION_DAYS=30
# Validate prerequisites
[[ -d "$BACKUP_DIR" ]] || { echo "Backup dir missing"; exit 1; }
pg_isready -q || { echo "DB not ready"; exit 1; }
# Perform backup
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
BACKUP_FILE="$BACKUP_DIR/${DB_NAME}_${TIMESTAMP}.sql.gz"
START_TIME=$(date +%s)
echo "Starting backup..."
pg_dump "$DB_NAME" | gzip > "$BACKUP_FILE"
# Verify
gunzip -t "$BACKUP_FILE" || { echo "Verification failed"; exit 1; }
# Cleanup old backups
find "$BACKUP_DIR" -name "*.sql.gz" -mtime "+$RETENTION_DAYS" -delete
# Report success
END_TIME=$(date +%s)
DURATION=$((END_TIME - START_TIME))
SIZE=$(du -h "$BACKUP_FILE" | cut -f1)
REMAINING=$(find "$BACKUP_DIR" -name "*.sql.gz" | wc -l)
{
echo "✅ Backup completed successfully"
echo "Duration: ${DURATION}s"
echo "Size: $SIZE"
echo "Backups retained: $REMAINING"
} | curl -X POST "$MONITOR_URL" \
-H "Content-Type: text/plain" --data-binary @-
echo "Done"
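With `set -e`, the script above stops at the first error and never reaches the success report, so the monitor only notices once the timeout expires. An EXIT trap can at least leave a clear line in the job's output. This is a sketch; `report_exit` is a hypothetical handler and the message format is an assumption.

```shell
#!/bin/bash
# Hypothetical EXIT-trap handler: print a summary to stderr when the script
# exits with a non-zero status; stay silent on success.
report_exit() {
    local status=$1 name=$2
    if [ "$status" -ne 0 ]; then
        echo "FAILED: $name exited with status $status at $(date -u +%Y-%m-%dT%H:%M:%SZ)" >&2
    fi
}

# Usage sketch, placed near the top of a script:
#   trap 'report_exit "$?" "$0"' EXIT
```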
Conclusion
Monitoring cron jobs doesn’t have to be complicated. By following these best practices, you can:
- Catch failures early before they impact users
- Debug faster with full context logs
- Sleep better knowing your critical jobs are monitored
The key is to treat monitoring as a first-class concern, not an afterthought. Build it into your scripts from day one, and you’ll save yourself countless hours of debugging and firefighting.
Get Started
Ready to implement these practices? Check out our quickstart guide to set up your first monitor in 5 minutes.
For more examples, see our cron monitoring guide and backup monitoring examples.