Jordan Lewis
@largedatabank.com
Sr Director of engineering @ Cockroach Labs.

Brooklyn

Check me out: https://largedatabank.com

The old site: https://twitter.com/largedatabank
lol rip
November 26, 2024 at 9:46 PM
Result: Clean latency metrics that automatically filter out low-traffic noise. Your dashboards stay focused on clusters that matter.

The earlier graph is now trimmed, showing just the series that matter.

Shoutout to Datadog's query engine team for enabling this kind of creative solution! 🙌

5/5
November 26, 2024 at 2:07 AM
The actual query looks like this:

cockroachdb.sql.conn.latency * (cutoff_min(cockroachdb.sql.new_conns, 5) / cutoff_min(cockroachdb.sql.new_conns, 5))

When new_conns < 5, cutoff_min() drops the point, so the ratio becomes null and nullifies the latency graph. Otherwise the ratio is 1 and latency passes through unchanged.
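
A rough Python sketch of what that expression does per data point. This is a local model only, not Datadog's implementation: `cutoff_min` and `mask_latency` are hypothetical helpers written for illustration, and `None` stands in for a null point.

```python
from typing import List, Optional, Sequence

Point = Optional[float]


def cutoff_min(series: Sequence[Point], threshold: float) -> List[Point]:
    """Assumed behavior: points below the threshold are removed (None here)."""
    return [v if v is not None and v >= threshold else None for v in series]


def mask_latency(
    latency: Sequence[Point],
    new_conns: Sequence[Point],
    threshold: float = 5.0,
) -> List[Point]:
    """Compute latency * (gated / gated), where gated = cutoff_min(new_conns)."""
    gated = cutoff_min(new_conns, threshold)
    out: List[Point] = []
    for lat, g in zip(latency, gated):
        if lat is None or g is None or g == 0:
            # Null in any operand (or an undefined 0/0) yields a null point.
            out.append(None)
        else:
            # g / g is exactly 1.0, so the latency value is kept as-is.
            out.append(lat * (g / g))
    return out


if __name__ == "__main__":
    latency = [10.0, 20.0, 30.0]
    new_conns = [2.0, 5.0, 9.0]
    print(mask_latency(latency, new_conns))
```

The multiplier never changes a latency value; it only decides whether the point exists at all.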

4/n
November 26, 2024 at 2:07 AM
The solution:
1. Use cutoff_min() on your modulating metric (the connection rate, in the example)
2. Multiply your target metric by modulating_metric/modulating_metric

This creates a 1 (keep) or null (filter) multiplier! 🪄

3/n
November 26, 2024 at 2:07 AM
An example problem: Monitoring SQL connection latency across a fleet of CockroachDB clusters gets noisy when some clusters have quiet workloads. Low-traffic clusters skew dashboards 📊

How do we monitor SQL connection latency, but ONLY when connection rates are high enough to be meaningful?

2/n
November 26, 2024 at 2:07 AM
Probably, but you'd have to do everything yourself, which may or may not appeal or ultimately be cost-effective to maintain.
November 18, 2024 at 10:48 PM
The "ish" is that the cheap log tier (flex) doesn't support arbitrary metric creation. The more expensive one (standard, which used to be the only tier) supports this in full. In practice you can use a mixture of the two tiers, but I do hope they improve this behavior.
November 18, 2024 at 4:09 AM
I'm not sure what you mean by visualizing them over time as a trace, but I think so? For example, you could write a filter on a "trace id" attribute and show log events with a given trace id.

No need to select dimensions, no.

Deriving SLOs and metrics - yes, ish...
November 18, 2024 at 4:08 AM
Datadog's flex logs system provides a reasonable method in my experience: filter / group by / aggregations across logs with arbitrary attributes, plus graphing on top of that output. Or did you mean something more specific?
November 18, 2024 at 3:59 AM
bullet journal of best bullet journal illustrations type beat
November 15, 2024 at 4:46 PM
once a scene starts, it's hard to de-scene it! I'm clearly part of the problem.

goodhang.org/p/scene-me-u...
Scene Me Up
Harry on the good part about scenes
goodhang.org
November 15, 2024 at 4:45 PM
yes why is that
May 1, 2023 at 3:12 PM
rip
April 26, 2023 at 12:15 AM
your what
April 25, 2023 at 11:52 PM