Column monitors
A column monitor in Soda tracks a specific statistical metric for a given column over time. It helps detect unusual patterns or unexpected changes in column behavior, such as spikes in missing values or shifts in averages.
You can find column monitors by opening the Metric Monitors tab on any dataset and scrolling to the bottom of the page. This section lists all active column monitors in a structured, searchable view. The list can be sorted by recency or by the number of detected anomalies, allowing you to quickly focus on the most relevant issues.
Unlike dataset-level monitors, which can be applied at the data source level, column monitors are configured at the dataset level and are tailored to specific use cases. It is recommended to add column monitors only to columns where changes are likely to reflect actual data quality issues. Adding too many monitors may increase false positives and create unnecessary noise.
For column monitors to work, a time partition column must be defined. Soda uses this column to divide the data into time-based partitions, typically by day, and calculates the selected metrics within each partition. The column must be a timestamp and should reflect when records arrive in the database to ensure accurate and meaningful results.
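To make the partitioning idea concrete, here is a minimal sketch in plain Python. It is not Soda's implementation; the `partition_by_day` helper, the `created_at` column, and the sample rows are all hypothetical and only illustrate how a timestamp column can split records into daily partitions.

```python
from collections import defaultdict
from datetime import datetime


def partition_by_day(rows, time_column):
    """Group rows (dicts) into daily partitions keyed by calendar date.

    Hypothetical helper: it only illustrates the concept of a time
    partition column and does not reflect how Soda works internally.
    """
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[time_column].date()].append(row)
    return dict(partitions)


# Made-up sample records; "created_at" plays the role of the time partition column.
rows = [
    {"created_at": datetime(2024, 5, 1, 9, 30), "amount": 10.0},
    {"created_at": datetime(2024, 5, 1, 17, 5), "amount": None},
    {"created_at": datetime(2024, 5, 2, 8, 0), "amount": 14.0},
]

partitions = partition_by_day(rows, "created_at")
row_counts = {day.isoformat(): len(group) for day, group in sorted(partitions.items())}
print(row_counts)  # {'2024-05-01': 2, '2024-05-02': 1}
```

Each metric is then calculated once per partition, which is why the timestamp should reflect when records arrive rather than a business date that may be backfilled.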
For each dataset, you’ll see a scrollable list that includes:
Result of the anomaly detection: Anomaly, Expected, or Unknown (not evaluated yet)
Column name
Metric name (e.g. Missing values percentage, Average)
Column being tracked
Latest value
Trend sparkline
At the bottom of the list, you can load more monitors. Each monitor can be deleted or configured with opt-in notifications.
Missing values percentage: number of missing values relative to the number of rows in the partition, expressed as a percentage.
Average: average of all values in the partition. Only supported for numeric columns.
Most recent timestamp: time difference between the scan time and the maximum timestamp in the column (within the partition). Only supported for date, datetime, and time columns.
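The three metric definitions can be sketched in plain Python as follows. This is an illustration under stated assumptions, not Soda's code: the function names are hypothetical, a missing value is represented as `None`, the average skips missing values (as SQL `AVG` does), and an empty partition yields `0.0` or `None`.

```python
from datetime import datetime, timedelta


def missing_values_percentage(values):
    """Missing values relative to the row count of the partition, in percent."""
    if not values:
        return 0.0  # assumption: an empty partition reports 0%
    missing = sum(1 for v in values if v is None)
    return 100.0 * missing / len(values)


def average(values):
    """Average of the non-missing numeric values (assumption: None is skipped)."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def most_recent_timestamp_lag(timestamps, scan_time):
    """Time between the scan and the newest timestamp in the partition."""
    present = [t for t in timestamps if t is not None]
    return scan_time - max(present) if present else None


# Made-up values for a single partition.
amounts = [10.0, None, 14.0, 12.0]
timestamps = [datetime(2024, 5, 1, 9, 30), datetime(2024, 5, 1, 17, 5)]
scan_time = datetime(2024, 5, 2, 0, 5)

missing_pct = missing_values_percentage(amounts)
avg = average(amounts)
lag = most_recent_timestamp_lag(timestamps, scan_time)
print(missing_pct, avg, lag)  # 25.0 12.0 7:00:00
```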
Stay tuned as we are releasing many more metrics in the coming weeks!
Column monitors can be added one by one or in bulk. When multiple columns are selected, only metrics that are applicable to all of them are shown.
Open the column monitor wizard
In the Metric Monitors dashboard, click Add Column Monitors.
Select columns
Search or scroll your table’s columns.
Check one or many boxes to select columns in bulk.
Column monitors are typed: a metric is offered only when the column's data type supports it. For example, if a column type is str (text-based), numeric metrics cannot be enabled.
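The typing rule, combined with bulk selection, amounts to intersecting the metrics each selected column supports. A minimal sketch, assuming a hypothetical `METRICS_BY_TYPE` mapping derived from the metric descriptions on this page (the type labels and the function name are illustrative, not Soda identifiers):

```python
# Hypothetical mapping from a coarse column type to the metrics it supports.
METRICS_BY_TYPE = {
    "numeric": {"Missing values percentage", "Average"},
    "text": {"Missing values percentage"},
    "timestamp": {"Missing values percentage", "Most recent timestamp"},
}


def applicable_metrics(column_types):
    """Return the metrics that every selected column type supports."""
    metric_sets = [METRICS_BY_TYPE[column_type] for column_type in column_types]
    if not metric_sets:
        return set()  # nothing selected, nothing to offer
    return set.intersection(*metric_sets)


# A numeric column on its own exposes both of its metrics.
single = applicable_metrics(["numeric"])
# Selecting a numeric and a text column together leaves only the shared metric.
mixed = applicable_metrics(["numeric", "text"])
print(sorted(single))  # ['Average', 'Missing values percentage']
print(sorted(mixed))   # ['Missing values percentage']
```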
Pick metrics
Select the metrics of interest.
Search or expand metrics for further configuration:
Valid Range: define MIN and MAX values the metric can take (defaults to –∞/∞ or 0–∞ for time-based metrics).
Threshold Strategy: choose whether to alert on the Upper range, the Lower range, or both.
Exclusion Values: specify literal values or ranges to ignore when marking anomalies.
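One way to picture how these three settings could interact is sketched below. Everything here is an assumption made for illustration: the `MonitorConfig` class, the `classify` function, the strategy names, and in particular the choice to clamp the expected bounds to the valid range. Soda's actual anomaly detection may combine the settings differently.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MonitorConfig:
    """Hypothetical container for the settings described above."""
    valid_min: float = -math.inf
    valid_max: float = math.inf
    strategy: str = "both"  # "upper", "lower", or "both"
    exclusions: list = field(default_factory=list)  # literals or (low, high) tuples


def is_excluded(value, exclusions):
    """True if the value matches an excluded literal or falls in an excluded range."""
    for item in exclusions:
        if isinstance(item, tuple):
            low, high = item
            if low <= value <= high:
                return True
        elif value == item:
            return True
    return False


def classify(value, expected_low, expected_high, config):
    """Label a metric value, given expected bounds from some anomaly model."""
    if is_excluded(value, config.exclusions):
        return "Expected"
    # Assumption: the expected bounds never extend beyond the valid range.
    low = max(expected_low, config.valid_min)
    high = min(expected_high, config.valid_max)
    too_high = value > high and config.strategy in ("upper", "both")
    too_low = value < low and config.strategy in ("lower", "both")
    return "Anomaly" if (too_high or too_low) else "Expected"


config = MonitorConfig(valid_min=0, valid_max=100, strategy="upper",
                       exclusions=[0.0, (95, 100)])
results = [
    classify(40.0, 1.0, 20.0, config),  # above the upper bound
    classify(0.5, 1.0, 20.0, config),   # below the lower bound, which is not alerted on
    classify(97.0, 1.0, 20.0, config),  # inside an excluded range
]
print(results)  # ['Anomaly', 'Expected', 'Expected']
```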
Add monitors
Once you’ve selected your columns and toggled the desired metrics on, click Add Monitors.
Empty monitors will be added to the list, and at the top of the page you will be prompted to run a Historical Metric Collection Scan.
Tip: add all your column monitors first, then run the historical scan in one go. This saves time and computing costs and ensures that every monitor shares the same look-back window.
Column Monitors can be configured when setting them up and while they're in production. To fine-tune the monitor to your specific needs, go to the page for each specific metric.