🚀 Future Improvements
This document outlines what to add next — in order of priority — to make your SigNoz setup more robust, secure, and useful.
1. 🔔 Alerting Setup
SigNoz supports alerts that notify you when something breaks — before your users report it.
What to set up:
- Alert when error rate > 5%
- Alert when API response time > 2000ms
- Alert when a service goes down
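To make the first threshold concrete: an error rate is the number of failed requests divided by the total over the same time window. The sketch below uses a hypothetical helper name (`error_rate_exceeds`) and made-up counts to show the comparison an alert rule performs. SigNoz evaluates this itself from its own metrics, so this only illustrates the arithmetic.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when errors/total is strictly above threshold_pct.
# Arguments: error count, total request count, threshold in percent.
error_rate_exceeds() {
  errors=$1
  total=$2
  threshold_pct=$3
  # No traffic means no measurable error rate, so never fire.
  [ "$total" -gt 0 ] || return 1
  # Compare errors*100 > threshold*total to avoid floating-point division.
  awk -v e="$errors" -v t="$total" -v th="$threshold_pct" \
    'BEGIN { exit !(e * 100 > th * t) }'
}

# Example with made-up numbers: 12 errors out of 200 requests is 6%.
if error_rate_exceeds 12 200 5; then
  echo "error rate above 5%: alert would fire"
fi
```

Guarding against zero traffic is a deliberate choice here: a quiet service should be caught by the "service goes down" alert rather than by the error-rate rule.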
How:
- Open SigNoz dashboard → Alerts
- Click New Alert
- Choose metric (e.g., `signoz_calls_total` with filter `status_code = STATUS_CODE_ERROR`)
- Set threshold and notification channel
Notification channels you can add:
- Slack webhook
- PagerDuty
- Email (SMTP)
- OpsGenie
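Before adding a Slack webhook as a channel, it can help to confirm the webhook itself works. The following is a minimal sketch, assuming you have exported `SLACK_WEBHOOK_URL` yourself (a placeholder for the URL Slack shows when you create an incoming webhook). The function names are illustrative and not part of SigNoz.

```shell
#!/bin/sh
# Build the minimal JSON body that Slack incoming webhooks accept.
# Assumption: the message contains no double quotes or backslashes,
# because this sketch does no JSON escaping.
slack_payload() {
  printf '{"text":"%s"}' "$1"
}

# Post a test message to the webhook in SLACK_WEBHOOK_URL.
send_test_alert() {
  curl -sS -X POST \
    -H 'Content-Type: application/json' \
    -d "$(slack_payload "$1")" \
    "$SLACK_WEBHOOK_URL"
}

# Only send when a webhook URL is configured.
if [ -n "${SLACK_WEBHOOK_URL:-}" ]; then
  send_test_alert "SigNoz test alert from $(hostname)"
fi
```

If the message appears in the Slack channel, the same URL can be pasted into the SigNoz notification channel form.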
Future task: Set up at least a Slack channel for production error alerts.
2. 🐳 Kubernetes Integration
If your apps move to Kubernetes (K8s), the setup changes slightly:
- OTEL Collector runs as a DaemonSet (one per node)
- Apps reach the collector through its Kubernetes Service DNS name
- Logs are collected from pod stdout instead of PM2 files
What to do:
# Install cert-manager (needed by the OpenTelemetry Operator; the SigNoz chart itself does not require it)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.0/cert-manager.yaml
# Install SigNoz on K8s via Helm
helm repo add signoz https://charts.signoz.io
helm install my-release signoz/signoz --namespace platform --create-namespace
Future task: Migrate SigNoz from Docker Compose to Helm when moving to K8s.
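Once SigNoz runs in the cluster, apps typically point at the collector through an in-cluster DNS name of the form `<service>.<namespace>.svc.cluster.local`. The sketch below builds that address and exports it through the standard `OTEL_EXPORTER_OTLP_ENDPOINT` variable. The Service name is an assumption derived from the release name `my-release`, and `cluster.local` assumes the default cluster domain; confirm both with `kubectl get svc -n platform`.

```shell
#!/bin/sh
# Build the in-cluster address of a Service:
#   http://<service>.<namespace>.svc.cluster.local:<port>
# Arguments: service name, namespace, port.
collector_endpoint() {
  printf 'http://%s.%s.svc.cluster.local:%s' "$1" "$2" "$3"
}

# Assumption: the Service name below is a guess based on the Helm release
# name "my-release"; verify it with: kubectl get svc -n platform
# Port 4317 is the standard OTLP gRPC port.
OTEL_EXPORTER_OTLP_ENDPOINT=$(collector_endpoint my-release-signoz-otel-collector platform 4317)
export OTEL_EXPORTER_OTLP_ENDPOINT
echo "$OTEL_EXPORTER_OTLP_ENDPOINT"
```

In a real deployment the same value would normally be set as an `env` entry in the app's Deployment manifest rather than exported in a shell.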
3. 🖥️ Advanced Dashboards
The default SigNoz dashboards cover basics. Build custom dashboards for your services:
Ideas for HealthTune API dashboard:
- Requests per second by endpoint
- Top 10 slowest API endpoints
- DB query time distribution
- Error rate by endpoint
Ideas for TrackX API dashboard:
- Active users (request count per hour)
- Authentication success / failure rate
- Slow queries heatmap
How to create a custom dashboard:
- SigNoz → Dashboards → New Dashboard
- Add panels using PromQL or the SigNoz query builder
- Save and pin to your team's home view
4. 📝 Log Pipelines & Parsing
Currently logs are collected raw. Add log parsing to extract structured fields:
For example, if your app logs JSON:
{"level":"error","message":"DB timeout","duration":5000,"service":"healthtune"}
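Add a log processor to the collector config, and list it under the logs pipeline's processors so it takes effect:
processors:
  logstransform:
    operators:
      # Parse the JSON log body into attributes so the fields become searchable
      - type: json_parser
        parse_from: body
        parse_to: attributes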
This lets you filter and search logs by level, service, and duration in SigNoz.
5. 🔒 TLS / HTTPS for the Dashboard
Currently SigNoz is accessed over plain HTTP (http://server:3301). For production, add HTTPS.
Options:
Option A — Nginx reverse proxy with Let's Encrypt:
server {
    listen 443 ssl;
    server_name signoz.yourcompany.com;
    ssl_certificate /etc/letsencrypt/live/signoz.yourcompany.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/signoz.yourcompany.com/privkey.pem;
    location / {
        proxy_pass http://localhost:3301;
        proxy_set_header Host $host;
    }
}
Option B - Cloudflare proxy (simplest for GCP): Point your domain to the VM and enable the Cloudflare proxy (orange cloud). Cloudflare then terminates TLS for visitors automatically.
6. 💾 Data Retention Policy
SigNoz applies default retention periods that vary by version, and telemetry volume grows quickly. Set explicit retention to avoid running out of disk:
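Either edit /home/signoz/deploy/docker/clickhouse-config.xml or set a TTL via SQL:
-- Keep traces for 30 days (logs and metrics live in other tables and need their own TTL)
-- TTL must be set on the local MergeTree table; the distributed_* table holds no data
ALTER TABLE signoz_traces.signoz_index_v3
    MODIFY TTL toDateTime(timestamp) + INTERVAL 30 DAY;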
Or configure it in the SigNoz UI under Settings → Retention.
Recommended retention:
- Traces: 15–30 days
- Logs: 7–14 days
- Metrics: 30–90 days
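To pick values within these ranges, a rough steady-state estimate is daily ingest multiplied by retention days. The sketch below uses a hypothetical helper (`estimate_disk_gb`) and made-up daily volumes; replace them with figures measured on your own VM. It ignores compression changes and merge overhead.

```shell
#!/bin/sh
# Rough steady-state disk estimate in GB:
#   daily ingest (GB/day) * retention (days)
estimate_disk_gb() {
  daily_gb=$1
  retention_days=$2
  echo $((daily_gb * retention_days))
}

# Hypothetical daily volumes; replace with your own measurements.
traces_gb=$(estimate_disk_gb 4 30)    # 4 GB/day kept for 30 days
logs_gb=$(estimate_disk_gb 6 14)      # 6 GB/day kept for 14 days
metrics_gb=$(estimate_disk_gb 1 90)   # 1 GB/day kept for 90 days
total_gb=$((traces_gb + logs_gb + metrics_gb))
echo "traces=${traces_gb}GB logs=${logs_gb}GB metrics=${metrics_gb}GB total=${total_gb}GB"
```

If the total comes close to the VM's disk size, shorten log retention first, since logs are usually the largest and least valuable after a few days.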
7. 📊 More Services to Instrument
Add OTEL tracing to additional services on the VM:
| Service | How |
|---|---|
| Redis | Use `@opentelemetry/instrumentation-ioredis` (for the ioredis client) |
| Elasticsearch | Use the OpenTelemetry support built into recent `@elastic/elasticsearch` clients, or a community instrumentation package |
| n8n workflows | Add OTEL env variables to n8n Docker config |
| Jenkins | Jenkins OTEL plugin available |
8. 📦 Backup Strategy for ClickHouse
SigNoz data is stored in ClickHouse volumes. Set up backups:
# Native ClickHouse backup (requires a disk named 'backups' allowed in the ClickHouse config)
docker exec clickhouse clickhouse-client --query "BACKUP DATABASE signoz_traces TO Disk('backups', 'signoz_backup.zip')"
# Or archive the Docker volume (stop ClickHouse first for a consistent copy)
docker run --rm \
  -v signoz_clickhouse-data:/data \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/clickhouse-backup.tar.gz /data
Future task: Set up a cron job for weekly backups to GCP Cloud Storage.
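One way the weekly cron job could look is sketched below. It builds on the volume archive command above and makes several assumptions that are not from this guide: the bucket name `gs://my-signoz-backups` and the script path in the crontab comment are placeholders, and `gsutil` must already be installed and authenticated on the VM.

```shell
#!/bin/sh
# Sketch of a weekly backup script. Placeholders (adjust before use):
#   BUCKET - hypothetical GCS bucket name
#   VOLUME - Docker volume name, as used in the archive command above
BUCKET="gs://my-signoz-backups"
VOLUME="signoz_clickhouse-data"

# Name the archive after the given date, e.g. clickhouse-backup-2025-01-05.tar.gz
backup_name() {
  printf 'clickhouse-backup-%s.tar.gz' "$1"
}

run_backup() {
  name=$(backup_name "$(date +%F)")
  workdir=$(mktemp -d)
  # Archive the volume contents read-only, then upload to the bucket.
  docker run --rm -v "$VOLUME":/data:ro -v "$workdir":/backup \
    alpine tar czf "/backup/$name" -C /data . &&
    gsutil cp "$workdir/$name" "$BUCKET/$name"
  status=$?
  # Remove the local copy whether or not the upload succeeded.
  rm -rf "$workdir"
  return "$status"
}

# Run only when invoked with --run, so the file can be sourced safely.
# Example crontab entry (Sundays at 03:00, hypothetical path):
#   0 3 * * 0 /opt/scripts/signoz-backup.sh --run
if [ "${1:-}" = "--run" ]; then
  run_backup
fi
```

Dated file names keep older backups side by side in the bucket; a bucket lifecycle rule can then expire them after a chosen number of weeks.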
9. 👥 Multi-User Access Control
SigNoz supports team accounts. Add your team members:
- SigNoz → Settings → Organization
- Invite Member → enter email
- Assign role: `Admin`, `Editor`, or `Viewer`
Recommended structure:
- Dev team: `Editor` (can see everything, create dashboards)
- Manager: `Viewer` (read-only)
- DevOps lead: `Admin`
🗓️ Suggested Roadmap
| Priority | Task | Effort |
|---|---|---|
| 🔴 High | Set up Slack alerting for errors | 30 mins |
| 🔴 High | Set data retention to 30 days | 15 mins |
| 🟡 Medium | Add HTTPS via Cloudflare or Nginx | 1–2 hours |
| 🟡 Medium | Build custom dashboards per service | 2–3 hours |
| 🟢 Low | Structured log parsing | 1 hour |
| 🟢 Low | Kubernetes migration | When ready |
| 🟢 Low | ClickHouse backup automation | 1–2 hours |
Official Documentation Links
- SigNoz alerts and notifications
- SigNoz dashboards
- SigNoz retention and storage
- SigNoz Kubernetes install (Helm)
- OpenTelemetry Kubernetes deployment
- OpenTelemetry collector processors
Read in Sequence
- Previous: 7-troubleshooting.md
- Back to start: README.md
- Restart sequence: 1-introduction.md