
🚀 Future Improvements


This document outlines what to add next, in order of priority, to make your SigNoz setup more robust, secure, and useful.


1. 🔔 Alerting Setup

SigNoz supports alerts that notify you when something breaks, before your users report it.

What to set up:

  • Alert when error rate > 5%
  • Alert when API response time > 2000ms
  • Alert when a service goes down

How:

  1. Open SigNoz dashboard → Alerts
  2. Click New Alert
  3. Choose metric (e.g., signoz_calls_total with filter status_code = STATUS_CODE_ERROR)
  4. Set threshold and notification channel

Notification channels you can add:

  • Slack webhook
  • PagerDuty
  • Email (SMTP)
  • OpsGenie

Future task: Set up at least a Slack channel for production error alerts.
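
Before wiring a Slack channel into SigNoz, it can help to confirm that the webhook itself accepts messages. The following is a minimal sketch, assuming a standard Slack incoming webhook that takes a JSON body with a `text` field; the function names, the message format, and the 5% default threshold (taken from the list above) are illustrative and not part of SigNoz.

```python
import json
import urllib.request


def build_slack_payload(service: str, error_rate_pct: float, threshold_pct: float = 5.0) -> dict:
    """Build the JSON body for a Slack incoming-webhook test message."""
    status = "ALERT" if error_rate_pct > threshold_pct else "OK"
    return {
        "text": f"[{status}] {service}: error rate {error_rate_pct:.1f}% "
                f"(threshold {threshold_pct:.1f}%)"
    }


def post_to_slack(webhook_url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST the payload to the webhook URL and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Calling `post_to_slack(url, build_slack_payload("healthtune", 7.5))` with your own webhook URL should post one test line to the channel. Once that works, the same URL can be registered as a notification channel in SigNoz.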


2. 🐳 Kubernetes Integration

If your apps move to Kubernetes (K8s), the setup changes slightly:

  • OTEL Collector runs as a DaemonSet (one per node)
  • Apps auto-discover the collector via K8s DNS
  • Logs are collected from pod stdout instead of PM2 files

What to do:

# Install cert-manager (required)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.0/cert-manager.yaml

# Install SigNoz on K8s via Helm
helm repo add signoz https://charts.signoz.io
helm install my-release signoz/signoz --namespace platform --create-namespace

Future task: Migrate SigNoz from Docker Compose to Helm when moving to K8s.
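
To illustrate the DNS-based discovery mentioned above: a Kubernetes Service is reachable inside the cluster at `<service>.<namespace>.svc.cluster.local`. The sketch below builds an OTLP endpoint from that pattern. It assumes the default `cluster.local` cluster domain and the OTLP gRPC port 4317; the Service name is a hypothetical placeholder, so check `kubectl get svc -n platform` for the real one.

```python
import os


def collector_endpoint(service: str, namespace: str, port: int = 4317) -> str:
    """Return the cluster-internal URL of a Kubernetes Service."""
    return f"http://{service}.{namespace}.svc.cluster.local:{port}"


def resolve_otlp_endpoint(env=os.environ) -> str:
    """Prefer an explicit OTEL_EXPORTER_OTLP_ENDPOINT, else fall back to cluster DNS."""
    explicit = env.get("OTEL_EXPORTER_OTLP_ENDPOINT")
    if explicit:
        return explicit
    # Hypothetical Service name for a release called "my-release" in namespace "platform".
    return collector_endpoint("my-release-signoz-otel-collector", "platform")
```

Keeping the environment variable as the first choice means the same application image can run under Docker Compose today and on Kubernetes later without code changes.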


3. 🖥️ Advanced Dashboards

The default SigNoz dashboards cover basics. Build custom dashboards for your services:

Ideas for HealthTune API dashboard:

  • Requests per second by endpoint
  • Top 10 slowest API endpoints
  • DB query time distribution
  • Error rate by endpoint

Ideas for TrackX API dashboard:

  • Active users (request count per hour)
  • Authentication success / failure rate
  • Slow queries heatmap

How to create a custom dashboard:

  1. SigNoz → Dashboards → New Dashboard
  2. Add panels using PromQL or SigNoz query builder
  3. Save and pin to your team's home view
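
As a starting point for the PromQL panels, the sketch below generates two of the queries from the ideas above: requests per second by endpoint and error rate by endpoint. It assumes the `signoz_calls_total` metric carries `service_name`, `operation`, and `status_code` labels; verify the label names in your own SigNoz instance before relying on them.

```python
def _rate(selector: str, window: str) -> str:
    """Per-operation request rate for the given label selector."""
    return "sum by (operation) (rate(signoz_calls_total{" + selector + "}[" + window + "]))"


def requests_per_second(service: str, window: str = "5m") -> str:
    """PromQL for requests per second, grouped by operation (endpoint)."""
    return _rate(f'service_name="{service}"', window)


def error_rate_percent(service: str, window: str = "5m") -> str:
    """PromQL for the percentage of failed calls, grouped by operation."""
    errors = _rate(f'service_name="{service}", status_code="STATUS_CODE_ERROR"', window)
    total = _rate(f'service_name="{service}"', window)
    return f"100 * {errors} / {total}"
```

Printing `requests_per_second("healthtune")` gives a query string that can be pasted into a PromQL panel and adjusted from there.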

4. 📝 Log Pipelines & Parsing

Currently logs are collected raw. Add log parsing to extract structured fields:

For example, if your app logs JSON:

{"level":"error","message":"DB timeout","duration":5000,"service":"healthtune"}

Add a log processor in the collector config:

processors:
  logstransform:
    operators:
      - type: json_parser
        parse_from: body
        parse_to: attributes

This lets you filter and search logs by level, service, and duration in SigNoz.
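
To make the effect of JSON parsing concrete, here is a small Python sketch of the same idea: the raw log body is parsed and its fields become attributes that can be filtered on. This only illustrates the concept, it is not the collector's implementation, and the function names are hypothetical.

```python
import json
from typing import Optional


def parse_log_body(body: str) -> dict:
    """Lift JSON fields from the body into attributes; keep the raw body if parsing fails."""
    record = {"body": body, "attributes": {}}
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return record
    if isinstance(parsed, dict):
        record["attributes"] = parsed
    return record


def matches(record: dict, level: Optional[str] = None, min_duration: Optional[float] = None) -> bool:
    """Return True if the record passes the optional level and duration filters."""
    attrs = record["attributes"]
    if level is not None and attrs.get("level") != level:
        return False
    if min_duration is not None and attrs.get("duration", 0) < min_duration:
        return False
    return True
```

With the sample line above, `matches(parse_log_body(line), level="error", min_duration=1000)` is true, which is the kind of query structured fields make possible. A non-JSON line is kept with empty attributes rather than dropped.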


5. 🔒 TLS / HTTPS for the Dashboard

Currently SigNoz is accessed over plain HTTP (http://server:3301). For production, add HTTPS.

Options:

Option A (Nginx reverse proxy with Let's Encrypt):

server {
    listen 443 ssl;
    server_name signoz.yourcompany.com;

    ssl_certificate /etc/letsencrypt/live/signoz.yourcompany.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/signoz.yourcompany.com/privkey.pem;

    location / {
        proxy_pass http://localhost:3301;
        proxy_set_header Host $host;
    }
}

Option B (Cloudflare proxy, simplest for GCP): Point your domain at the VM and enable the Cloudflare proxy (orange cloud). Cloudflare then handles SSL for visitors automatically.


6. 💾 Data Retention Policy

ClickHouse stores all telemetry data forever by default. Set retention to avoid running out of disk:

Edit /home/signoz/deploy/docker/clickhouse-config.xml, or set a TTL via SQL:

-- Keep traces for 30 days (the logs table needs its own TTL statement)
ALTER TABLE signoz_traces.distributed_signoz_index_v3
MODIFY TTL toDateTime(timestamp) + INTERVAL 30 DAY;

Or configure in the SigNoz UI under Settings → Retention.

Recommended retention:

  • Traces: 15–30 days
  • Logs: 7–14 days
  • Metrics: 30–90 days
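
The retention choice can be turned into SQL and a rough disk estimate with a few lines of Python. This is a sketch under stated assumptions: the statement shape mirrors the trace example above, the `timestamp` column default comes from that example, and the daily ingest figure is something you would measure yourself.

```python
def ttl_statement(table: str, days: int, time_column: str = "timestamp") -> str:
    """Build an ALTER TABLE ... MODIFY TTL statement for the given retention."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (
        f"ALTER TABLE {table} "
        f"MODIFY TTL toDateTime({time_column}) + INTERVAL {days} DAY;"
    )


def estimated_disk_gb(daily_ingest_gb: float, retention_days: int) -> float:
    """Steady-state disk usage: on-disk volume written per day times days kept."""
    return daily_ingest_gb * retention_days
```

For instance, at a hypothetical 2 GB of trace data per day, 30 days of retention settles at about 60 GB, while 15 days halves that. Comparing this number with the free space on the VM is a quick way to pick a value within the recommended ranges.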

7. 📊 More Services to Instrument

Add OTEL tracing to additional services on the VM:

| Service | How |
|---|---|
| Redis | Use @opentelemetry/instrumentation-ioredis |
| Elasticsearch | Use @opentelemetry/instrumentation-elasticsearch |
| n8n workflows | Add OTEL env variables to n8n Docker config |
| Jenkins | Jenkins OTEL plugin available |

8. 📦 Backup Strategy for ClickHouse

SigNoz data is stored in ClickHouse volumes. Set up backups:

# Back up the signoz_traces database (assumes a disk named 'backups' is configured in ClickHouse)
docker exec clickhouse clickhouse-client --query "BACKUP DATABASE signoz_traces TO Disk('backups', 'signoz_backup.zip')"

# Or simply archive the Docker volume
docker run --rm \
  -v signoz_clickhouse-data:/data \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/clickhouse-backup.tar.gz /data

Future task: Set up a cron job for weekly backups to GCP Cloud Storage.
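
A weekly backup job needs three small pieces: a dated file name, an upload command, and a rule for pruning old archives. The sketch below covers those pieces only. It assumes `gsutil` is installed and authenticated on the VM; the bucket name, the `signoz/` prefix, and the retention count are hypothetical.

```python
from datetime import date
from typing import List


def backup_filename(day: date) -> str:
    """Name the archive after its date so names sort chronologically."""
    return f"clickhouse-backup-{day.isoformat()}.tar.gz"


def upload_command(local_path: str, bucket: str) -> List[str]:
    """Build the gsutil command that copies one archive to Cloud Storage."""
    return ["gsutil", "cp", local_path, f"gs://{bucket}/signoz/"]


def backups_to_delete(filenames: List[str], keep: int = 4) -> List[str]:
    """Return the oldest archives beyond the newest `keep`."""
    ordered = sorted(filenames)
    return ordered[: max(0, len(ordered) - keep)]
```

The command list can be passed to `subprocess.run(..., check=True)` after the archive is created. A crontab entry such as `0 3 * * 0 /usr/bin/python3 /opt/signoz/backup.py` (script path hypothetical) would then run the job every Sunday at 03:00.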


9. 👥 Multi-User Access Control

SigNoz supports team accounts. Add your team members:

  1. SigNoz → Settings → Organization
  2. Invite Member → enter email
  3. Assign role: Admin, Editor, or Viewer

Recommended structure:

  • Dev team: Editor (can see everything, create dashboards)
  • Manager: Viewer (read-only)
  • DevOps lead: Admin

๐Ÿ—“๏ธ Suggested Roadmapโ€‹

| Priority | Task | Effort |
|---|---|---|
| 🔴 High | Set up Slack alerting for errors | 30 mins |
| 🔴 High | Set data retention to 30 days | 15 mins |
| 🟡 Medium | Add HTTPS via Cloudflare or Nginx | 1–2 hours |
| 🟡 Medium | Build custom dashboards per service | 2–3 hours |
| 🟢 Low | Structured log parsing | 1 hour |
| 🟢 Low | Kubernetes migration | When ready |
| 🟢 Low | ClickHouse backup automation | 1–2 hours |

