Datadog will charge you $50K a year before you notice. That's fine when you're a Series B with a real SRE team. It's a punch in the face at seed.
Here's the stack we use for most early-stage SaaS, on AWS, with a small team.
Metrics: Prometheus + Grafana
Self-host on a single t3.medium, or use Amazon Managed Service for Prometheus if you don't want to babysit it. Grafana Cloud's free tier comfortably handles dashboards for most small teams.
Instrument with the OpenTelemetry Go/Python SDKs. Resist the urge to invent metrics—stick to RED (Rate, Errors, Duration) for services and USE (Utilization, Saturation, Errors) for infrastructure.
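To make RED concrete, here is a minimal sketch of what a RED-instrumented handler records. It deliberately uses only the Python standard library rather than the OpenTelemetry SDK, so `RedRecorder`, `track`, and the `checkout` handler are hypothetical names for illustration. In a real service the three dictionaries would be an OpenTelemetry counter, a second counter, and a histogram.

```python
import time
from collections import defaultdict
from functools import wraps


class RedRecorder:
    """In-memory stand-in for a metrics backend (hypothetical helper)."""

    def __init__(self):
        self.requests = defaultdict(int)    # Rate: total calls per endpoint
        self.errors = defaultdict(int)      # Errors: calls that raised
        self.durations = defaultdict(list)  # Duration: seconds per call

    def track(self, endpoint):
        """Decorator that records all three RED signals for one endpoint."""
        def decorator(fn):
            @wraps(fn)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    self.errors[endpoint] += 1
                    raise
                finally:
                    # Runs on success and failure, so every call is counted.
                    self.requests[endpoint] += 1
                    self.durations[endpoint].append(time.perf_counter() - start)
            return wrapper
        return decorator


red = RedRecorder()


@red.track("checkout")
def checkout(cart_total):
    # Hypothetical handler: fails on an empty cart.
    if cart_total <= 0:
        raise ValueError("empty cart")
    return "ok"
```

Three signals per endpoint is the whole point: if a proposed metric isn't a rate, an error count, or a duration, it probably belongs in a log line or a trace attribute instead.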
Logs: Loki, or CloudWatch with discipline
Loki indexes labels, not log content. That makes it dramatically cheaper than Elastic-based stacks. Grafana renders it cleanly. The trade-off is you can't full-text search arbitrarily—you query by label and grep within results.
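A toy model of that trade-off, as a sketch only: this is not Loki's storage engine, and `LabelIndexedLogs` is a made-up class. It indexes streams by label set and falls back to a brute-force substring scan inside the matching streams, which is roughly what a LogQL query like `{app="api"} |= "timeout"` asks for.

```python
from collections import defaultdict


class LabelIndexedLogs:
    """Toy log store: indexes label sets only, never log content."""

    def __init__(self):
        # Label set -> raw lines. This mapping is the only index.
        self.streams = defaultdict(list)

    def push(self, labels, line):
        self.streams[frozenset(labels.items())].append(line)

    def query(self, selector, contains=""):
        wanted = set(selector.items())
        hits = []
        for stream_labels, lines in self.streams.items():
            if wanted <= stream_labels:  # cheap: match on labels
                # Expensive part: scan every line in the selected streams.
                hits.extend(line for line in lines if contains in line)
        return hits


logs = LabelIndexedLogs()
logs.push({"app": "api", "env": "prod"}, "GET /health 200")
logs.push({"app": "api", "env": "prod"}, "POST /checkout 500 timeout")
logs.push({"app": "worker", "env": "prod"}, "job 42 timeout")

matches = logs.query({"app": "api"}, contains="timeout")
# matches == ["POST /checkout 500 timeout"]
```

The index stays tiny because it only grows with the number of distinct label sets, not with log volume. The flip side is that a query with a broad selector scans a lot of lines, so narrow by label first.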
Alternative: CloudWatch Logs with subscription filters that ship to S3 by way of Amazon Data Firehose (subscription filters can't write to S3 directly). Athena queries the S3 archive cheaply. Live debugging stays in CloudWatch Logs Insights.
Traces: OpenTelemetry to SigNoz or Tempo
OpenTelemetry is the right abstraction—instrument once, swap backends without code changes. SigNoz (self-hosted or cloud) is a complete o11y product that's a real Datadog alternative at a fraction of the price. Tempo is the Grafana-native trace store and pairs naturally if you're already on Grafana.
Alerting: keep it spare
The fastest way to ruin observability is to ship 200 alerts on day one. Start with five: error budget exhausted, p99 latency over SLO, instance down, queue depth growing unbounded, error rate spike. Add others only when an incident teaches you one is missing.
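The "error budget exhausted" alert is just arithmetic on the SLO. A minimal sketch, assuming a request-based SLO over a fixed window; the function names and the 99.9% target in the usage lines are illustrative, not something this stack prescribes.

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Fraction of the error budget still unspent (negative once overspent)."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    if total_requests == 0:
        return 1.0  # no traffic in the window, nothing spent
    # A 99.9% SLO over 1,000,000 requests allows about 1,000 failures.
    allowed_failures = (1 - slo_target) * total_requests
    return 1 - failed_requests / allowed_failures


def should_page(slo_target, total_requests, failed_requests):
    """Fire the alert once the budget is fully spent."""
    return error_budget_remaining(slo_target, total_requests, failed_requests) <= 0


# Hypothetical window: 1M requests against a 99.9% SLO.
healthy = should_page(0.999, 1_000_000, 250)     # False, ~75% of budget left
exhausted = should_page(0.999, 1_000_000, 1200)  # True, budget overspent
```

In practice you would evaluate this as a Prometheus rule over a rolling window rather than in application code, but the numbers behind the alert are the same.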
Route to a single Slack channel and to PagerDuty (or Grafana OnCall, the open-source alternative). Anyone awake should be able to glance at the channel and know if anything's on fire.
What we'd actually spend
- Amazon Managed Prometheus + Grafana Cloud free: ~$50/mo at small scale.
- SigNoz self-hosted on EC2: ~$200/mo, including storage.
- Plus a few hours a month of upkeep.
That's $250–500/month for what would be $4,000+ on Datadog at the same scale. The trade is your time. Until you're past Series A, your time is cheaper than your tooling bill.
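If you want to sanity-check that trade for your own numbers, a rough break-even sketch follows. The dollar figures come from the estimates above; the five hours of monthly upkeep is an assumption standing in for "a few hours", and `breakeven_hourly_rate` is a hypothetical helper.

```python
def breakeven_hourly_rate(saas_monthly, self_hosted_monthly, upkeep_hours):
    """Hourly value of your time above which self-hosting stops saving money."""
    if upkeep_hours <= 0:
        raise ValueError("upkeep_hours must be positive")
    return (saas_monthly - self_hosted_monthly) / upkeep_hours


# Assumed: 5 upkeep hours per month (the text only says "a few").
low = breakeven_hourly_rate(4000, 500, 5)   # 700.0
high = breakeven_hourly_rate(4000, 250, 5)  # 750.0
```

Under those assumptions, self-hosting only loses if an hour of your time is worth more than roughly $700. The picture changes quickly if upkeep creeps from five hours to twenty-five, so it's worth tracking the hours honestly.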