Data source reference for Soda Core

This page lists the supported data source types and their required connection parameters for use with Soda Core.

Soda uses the official Python drivers for each supported data source. The configuration examples below include the default required fields, but you can extend them with any additional parameters supported by the underlying driver.

Each data source configuration must be written in a YAML file and passed as an argument using the CLI or Python API.

General Guidelines

  • Each configuration must include type, name, and a connection block.

  • Use the exact structure required by the underlying Python driver.

  • Test the connection before using the configuration in a contract:

soda data-source test -ds ds.yml
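
The first guideline can also be checked locally before running the CLI test. The following is a minimal sketch and not part of Soda: `missing_top_level_keys` is a hypothetical helper, and it assumes the YAML file has already been loaded into a Python dict by a YAML parser of your choice.

```python
# Top-level keys every data source configuration must include (see the guidelines above).
REQUIRED_KEYS = ("type", "name", "connection")


def missing_top_level_keys(config):
    """Return the required top-level keys that are absent or empty in config."""
    missing = []
    for key in REQUIRED_KEYS:
        value = config.get(key)
        if value is None or value == "" or value == {}:
            missing.append(key)
    return missing


# Hypothetical configuration that mirrors the PostgreSQL layout on this page.
example = {
    "type": "postgres",
    "name": "postgres",
    "connection": {"host": "localhost", "port": 5432},
}

print(missing_top_level_keys(example))               # []
print(missing_top_level_keys({"type": "postgres"}))  # ['name', 'connection']
```

This only checks that the three top-level keys are present. It does not check the fields inside the connection block, which depend on the underlying driver.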

Connect to a Data Source Already Onboarded in Soda Cloud (via Soda Agent)

You can run verifications using Soda Core (local execution) or a Soda Agent (remote execution). To ensure consistency and compatibility, you must use the same data source name in both your local configuration for Soda Core and in Soda Cloud. See: Onboard datasets on Soda Cloud

This matching by name ensures that the data source is recognized and treated as the same across both execution modes, whether you’re running locally in Soda Core or remotely via a Soda Agent.


Onboard a Data Source in Soda Cloud After Using Soda Core

You can also onboard a data source to Soda Cloud and a Soda Agent after you have first set it up with Soda Core.

To learn how, see: Onboard datasets on Soda Cloud

Using Environment Variables

You can reference environment variables in your data source configuration. This is useful for securely managing sensitive values (like credentials) or dynamically setting parameters based on your environment (e.g., dev, staging, prod).

Example:

type: postgres
name: postgres
connection:
  host:
  port:
  database:
  user: ${env.POSTGRES_USERNAME}
  password: ${env.POSTGRES_PASSWORD}

Environment variables must be available in the runtime environment where Soda is executed (e.g., your terminal, CI/CD runner, or Docker container).
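
Because the variables must exist wherever Soda runs, it can help to check for unset references before a run. The following is a minimal sketch and not part of Soda: `unset_env_references` is a hypothetical helper, the variable names are illustrative, and it assumes references use the `${env.NAME}` form shown in the example.

```python
import os
import re

# Matches the ${env.NAME} placeholder form used in the example above.
ENV_REF = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)\}")


def unset_env_references(config_text, environ=None):
    """Return the names referenced as ${env.NAME} that are not set in environ."""
    environ = os.environ if environ is None else environ
    names = ENV_REF.findall(config_text)
    # Preserve first-seen order while dropping duplicates.
    unique = list(dict.fromkeys(names))
    return [name for name in unique if name not in environ]


# Hypothetical configuration text; the variable names are illustrative only.
config_text = """
type: postgres
name: postgres
connection:
  user: ${env.POSTGRES_USERNAME}
  password: ${env.POSTGRES_PASSWORD}
"""

# Pass an explicit mapping here so the result does not depend on the real environment.
print(unset_env_references(config_text, environ={"POSTGRES_USERNAME": "soda"}))
# ['POSTGRES_PASSWORD']
```

If you call the helper without `environ`, it checks the current process environment. That makes it usable as a pre-flight step in a terminal session, CI/CD runner, or container entrypoint.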


PostgreSQL

Install the following package:

pip install -i https://pypi.dev.sodadata.io/simple -U soda-postgres

Data source YAML

type: postgres
name: postgres
connection:
  host:
  port:
  user:
  password:
  database:

Snowflake

Install the following package:

pip install -i https://pypi.dev.sodadata.io/simple -U soda-snowflake

Data source YAML

type: snowflake
name: snowflake
connection:
  host:
  account:
  user:
  password:
  database:

Databricks

Install the following package:

pip install -i https://pypi.dev.sodadata.io/simple -U soda-databricks

Data source YAML

type: databricks
name: databricks
connection:
  host:
  http_path:
  catalog: "unity_catalog"
  access_token:

Last updated 21 hours ago
