Deploy a Soda Agent in a Kubernetes cluster


Last updated 7 days ago


Prerequisites

  • You have created, or have access to an existing Kubernetes cluster into which you can deploy a Soda Agent.

  • You have installed v1.22 or v1.23 of kubectl. This is the command-line tool you use to run commands against Kubernetes clusters. If you have installed Docker Desktop, kubectl is included out-of-the-box. With Docker running, use the command kubectl version --output=yaml to check the version of an existing install.

  • You have installed Helm. This is the package manager for Kubernetes which you will use to deploy the Soda Agent Helm chart. Run helm version to check the version of an existing install.
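
As a quick way to confirm both prerequisites before you continue, the following sketch reports whether kubectl and helm are available on your PATH. The function name have_tool is an illustrative helper, not part of Soda, Helm, or Kubernetes tooling, and the check does not verify versions.

```shell
# Report whether a command-line tool is available on PATH.
# have_tool is a hypothetical helper name used only for this example.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Check the two tools named in the prerequisites.
for tool in kubectl helm; do
  if have_tool "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```

If either tool is reported missing, install it, then use kubectl version --output=yaml and helm version to confirm the installed versions.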

System requirements

Kubernetes cluster size and capacity: 2 CPU and 2GB of RAM. In general, this is sufficient to run up to six scans in parallel.

Scan performance may vary according to the workload, or the number of scans running in parallel. To improve performance for larger workloads, consider fine-tuning the cluster size using the soda.agent.resources parameter for the agent-orchestrator and the soda.scanlauncher.resources parameter for the scan-launcher. Adding more resources to the scan-launcher can improve scan times by as much as 30%. Be aware, however, that allocating too many resources may be costly relative to the small benefit of improved scan times.

To specify resources, add the following parameters to your values.yml file during deployment. Refer to the Kubernetes documentation, Resource Management for Pods and Containers, for information on values to supply for x.

soda:
  agent:
    resources:
      limits:
        cpu: x
        memory: x
      requests:
        cpu: x
        memory: x
  scanlauncher:
    resources:
      limits:
        cpu: x
        memory: x
      requests:
        cpu: x
        memory: x

For reference, a Soda-hosted agent specifies resources as follows:

soda:
  agent:
    resources:
      limits:
        cpu: 250m
        memory: 375Mi
      requests:
        cpu: 250m
        memory: 375Mi
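
As a sketch of how the two resource blocks might be combined into one override file, the following writes a values file to a temporary location. The agent values are copied from the Soda-hosted reference above. The scanlauncher values are assumptions chosen only for illustration and are not recommendations from Soda, so tune them for your own workload.

```shell
# Write a resources override file to a temporary path.
values_file="$(mktemp)"

cat > "$values_file" <<'EOF'
soda:
  agent:
    resources:
      limits:
        cpu: 250m
        memory: 375Mi
      requests:
        cpu: 250m
        memory: 375Mi
  scanlauncher:
    resources:
      limits:
        cpu: 500m      # assumed value, tune for your workload
        memory: 1Gi    # assumed value, tune for your workload
      requests:
        cpu: 250m      # assumed value
        memory: 512Mi  # assumed value
EOF

echo "wrote $values_file"
```

You can pass a file like this to helm install with the --values option, as shown in Deploy using a values YAML file.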

Deploy an agent

The following table outlines the two ways you can install the Helm chart to deploy a Soda Agent in your cluster.

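Method | Description | When to use
--- | --- | ---
Deploy using CLI only | Install the Helm chart via CLI by providing values directly in the install command. | Use this as a straightforward way of deploying an agent on a cluster in a secure or local environment.
Deploy using a values YAML file | Install the Helm chart via CLI by providing values in a values YAML file. | Use this as a way of deploying an agent on a cluster while keeping sensitive values secure: provide sensitive API key values in this local file; store data source login credentials as environment variables in this local file or in an external secrets manager. Soda needs access to the credentials to be able to connect to your data source to run scans of your data. See Soda Agent extras.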

Deploy using CLI only

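  1. Add the Soda Agent Helm chart repository.

    helm repo add soda-agent https://helm.soda.io/soda-agent/

    Then use the helm install command below to install the Helm chart and deploy a Soda Agent in your cluster. Learn more in About the helm install command.

    • Replace the values of soda.apikey.id and soda.apikey.secret with the values you copied from the New Soda Agent dialog box in your Soda Cloud account.

    • Replace the value of soda.agent.name with a custom name for your agent, if you wish.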

    • Specify the value for soda.cloud.endpoint according to your local region: https://cloud.us.soda.io for the United States, or https://cloud.soda.io for all else.

    • (Optional) Specify the format for log output: raw for plain text, or json for JSON format.

    • (Optional) Specify the level of log information you wish to see when deploying the agent: ERROR, WARN, INFO, DEBUG, or TRACE.

      # Use https://cloud.us.soda.io for US region; use https://cloud.soda.io for EU region
      helm install soda-agent soda-agent/soda-agent \
       --set soda.agent.name=myuniqueagent \
       --set soda.cloud.endpoint=https://cloud.soda.io \
       --set soda.apikey.id=*** \
       --set soda.apikey.secret=**** \
       --set soda.agent.logFormat=raw \
       --set soda.agent.loglevel=ERROR \
       --namespace soda-agent

      The command-line produces output like the following message:

      NAME: soda-agent
      LAST DEPLOYED: Thu Jun 16 15:03:10 2022
      NAMESPACE: soda-agent
      STATUS: deployed
      REVISION: 1
  2. (Optional) Validate the Soda Agent deployment by running the following command:

    kubectl describe pods -n soda-agent
  3. In your Soda Cloud account, navigate to your avatar > Agents. Refresh the page to verify that the agent you just created appears in the list of Agents. Be aware that the agent may take several minutes to appear. Use the describe pods command in step 2 to check the status of the deployment. When the output shows State: Running and Ready: True, refresh the page to see the agent in Soda Cloud.

    ...
    Containers:
      soda-agent-orchestrator:
         Container ID:   docker://081*33a7
         Image:          sodadata/agent-orchestrator:latest
         Image ID:       docker-pullable://sodadata/agent-orchestrator@sha256:394e7c1**b5f
         Port:           <none>
         Host Port:      <none>
         State:          Running
           Started:      Thu, 16 Jun 2022 15:50:28 -0700
         Ready:          True
         ...

If you do not see the agent listed in Soda Cloud, use the following command to review status and investigate the logs.

kubectl logs -l agent.soda.io/component=orchestrator -n soda-agent -f
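
The readiness check described in the steps above can be sketched as a small filter over the describe pods output. The function name agent_is_ready is a hypothetical helper, and the sample text is taken from the example output above rather than from a live cluster.

```shell
# Return success when describe-pods text on stdin shows both
# "State: Running" and "Ready: True".
# agent_is_ready is a hypothetical helper name for this example.
agent_is_ready() {
  text="$(cat)"
  printf '%s\n' "$text" | grep -Eq 'State:[[:space:]]+Running' &&
    printf '%s\n' "$text" | grep -Eq 'Ready:[[:space:]]+True'
}

# Sample lines copied from the example describe output.
sample='State:          Running
  Started:      Thu, 16 Jun 2022 15:50:28 -0700
Ready:          True'

if printf '%s\n' "$sample" | agent_is_ready; then
  echo "agent pod looks ready"
else
  echo "agent pod not ready yet"
fi
```

Against a real cluster, you could pipe the output of kubectl describe pods -n soda-agent into the function instead of the sample text.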

Deploy using a values YAML file

  1. Create or navigate to an existing Kubernetes cluster in your environment in which you can deploy the Soda Agent helm chart.

  2. Using a code editor, create a new YAML file called values.yml.

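  3. In that file, copy+paste the content below, replacing the following values:

    • Replace the values of id and secret with the values you copied from the New Soda Agent dialog box in your Soda Cloud account.

    • Replace the value of name with a custom name for your agent, if you wish.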

    • Specify the value for endpoint according to your local region: https://cloud.us.soda.io for the United States, or https://cloud.soda.io for all else.

    • (Optional) Specify the format for log output: raw for plain text, or json for JSON format.

    • (Optional) Specify the level of log information you wish to see when deploying the agent: ERROR, WARN, INFO, DEBUG, or TRACE.

      soda:
         apikey:
           id: "***"
           secret: "***"
         agent:
           name: "myuniqueagent"
           logformat: "raw"
           loglevel: "ERROR"
         cloud:
           # Use https://cloud.us.soda.io for US region
           # Use https://cloud.soda.io for EU region
           endpoint: "https://cloud.soda.io"
  4. Save the file. Then, in the same directory in which the values.yml file exists, use the following command to install the Soda Agent helm chart.

    helm install soda-agent soda-agent/soda-agent \
      --values values.yml \
      --namespace soda-agent
  5. (Optional) Validate the Soda Agent deployment by running the following command:

    kubectl describe pods -n soda-agent
  6. In your Soda Cloud account, navigate to your avatar > Agents. Refresh the page to verify that the agent you just created appears in the list of Agents. Be aware that the agent may take several minutes to appear. Use the describe pods command in step 5 to check the status of the deployment. When the output shows State: Running and Ready: True, refresh the page to see the agent in Soda Cloud.

    ...
    Containers:
      soda-agent-orchestrator:
         Container ID:   docker://081*33a7
         Image:          sodadata/agent-orchestrator:latest
         Image ID:       docker-pullable://sodadata/agent-orchestrator@sha256:394e7c1**b5f
         Port:           <none>
         Host Port:      <none>
         State:          Running
           Started:      Thu, 16 Jun 2022 15:50:28 -0700
         Ready:          True
         ...

If you do not see the agent listed in Soda Cloud, use the following command to review status and investigate the logs.

kubectl logs -l agent.soda.io/component=orchestrator -n soda-agent -f
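
To avoid typing API key values into a file by hand, a values file could be generated from environment variables, as in the following sketch. The variable names SODA_API_KEY_ID and SODA_API_KEY_SECRET are assumptions made for this example and are not names that Soda defines. The placeholder defaults exist only so the example runs as written. The log settings are omitted for brevity.

```shell
# Hypothetical environment variable names; export real values before running.
: "${SODA_API_KEY_ID:=example-id}"
: "${SODA_API_KEY_SECRET:=example-secret}"

out_dir="$(mktemp -d)"
values_file="$out_dir/values.yml"

# Use umask 077 in a subshell so the new file is readable only by the current user.
(
  umask 077
  cat > "$values_file" <<EOF
soda:
  apikey:
    id: "${SODA_API_KEY_ID}"
    secret: "${SODA_API_KEY_SECRET}"
  agent:
    name: "myuniqueagent"
  cloud:
    endpoint: "https://cloud.soda.io"
EOF
)

echo "wrote $values_file"
```

You would then pass the generated file to helm install with --values, in the same way as the values.yml file in step 4.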

About the helm install command

helm install soda-agent soda-agent/soda-agent \
  --set soda.agent.name=myuniqueagent \
  --set soda.apikey.id=*** \
  --set soda.apikey.secret=**** \
  --namespace soda-agent
Command part | Description
--- | ---
helm install | the action helm is to take
soda-agent (the first one) | a release named soda-agent on your cluster
soda-agent (the second one) | the name of the helm repo you installed
soda-agent (the third one) | the name of the helm chart that is the Soda Agent

The --set options either override or set some of the values defined in and used by the Helm chart. You can override these values with --set flags, as this command does, or you can specify the override values using a values.yml file.

Parameter key | Parameter value, description
--- | ---
--set soda.agent.name | A unique name for your Soda Agent. Choose any name you wish, as long as it is unique in your Soda Cloud account.
--set soda.apikey.id | With the apikey.secret, this connects the Soda Agent to your Soda Cloud account. Use the value you copied from the dialog box in Soda Cloud when adding a new agent. You can use a values.yml file to pass this value to the cluster instead of exposing it here.
--set soda.apikey.secret | With the apikey.id, this connects the Soda Agent to your Soda Cloud account. Use the value you copied from the dialog box in Soda Cloud when adding a new agent. You can use a values.yml file to pass this value to the cluster instead of exposing it here.
--set soda.agent.logFormat | (Optional) Specify the format for log output: raw for plain text, or json for JSON format.
--set soda.agent.loglevel | (Optional) Specify the level of log information you wish to see when deploying the agent: ERROR, WARN, INFO, DEBUG, or TRACE.
--namespace soda-agent | Use the namespace value to identify the namespace in which to deploy the agent.
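
Because the release, repository, and chart all share the name soda-agent, it can help to see the parts separated. The following sketch splits a repo/chart reference using shell parameter expansion. The function name explain_install is a hypothetical helper and does not call Helm.

```shell
# Print the release, repo, and chart named by a "helm install" invocation.
# explain_install is a hypothetical helper name; it does not run helm.
explain_install() {
  release="$1"
  chart_ref="$2"
  repo="${chart_ref%%/*}"   # text before the first slash
  chart="${chart_ref#*/}"   # text after the first slash
  echo "release=$release repo=$repo chart=$chart"
}

explain_install soda-agent soda-agent/soda-agent
```

Calling the function with a different first argument, such as a release name of your own choosing, shows that only the release part changes while the repo and chart stay the same.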

Decommission the Soda Agent and cluster

  1. Uninstall the Soda Agent in the cluster.

    helm uninstall soda-agent -n soda-agent
  2. Delete the cluster. For example, if you created the cluster with minikube, run the following command:

    minikube delete
    💀  Removed all traces of the "minikube" cluster.

Troubleshoot deployment

Problem: After setting up a cluster and deploying the agent, you are unable to see the agent running in Soda Cloud.

Solution: The value you specify for soda.cloud.endpoint must correspond to the region you selected when you signed up for a Soda Cloud account:

  • Use https://cloud.us.soda.io for the United States

  • Use https://cloud.soda.io for all else
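
The region rule above can be sketched as a small lookup. The function name soda_endpoint and the region code us are assumptions made for this example. Only the two URLs come from this page.

```shell
# Map a region code to the Soda Cloud endpoint listed above.
# soda_endpoint and the "us" region code are hypothetical names for this example.
soda_endpoint() {
  case "$1" in
    us|US) echo "https://cloud.us.soda.io" ;;
    *)     echo "https://cloud.soda.io" ;;
  esac
}

# Print the flag you would pass to helm install for a US account.
echo "--set soda.cloud.endpoint=$(soda_endpoint us)"
```

Any value other than us falls through to https://cloud.soda.io, which matches the "all else" rule.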

Problem: You need to define the outgoing port and IP address with which a self-hosted Soda Agent can communicate with Soda Cloud. The Soda Agent does not require any inbound rules because it only polls Soda Cloud for instructions, which requires only outbound communication. When Soda Cloud must deliver instructions, the Soda Agent opens a bidirectional channel.

Solution: Use port 443 and passlist the fully-qualified domain names for Soda Cloud:

  • cloud.us.soda.io for a Soda Cloud account created in the US region, OR

  • cloud.soda.io for a Soda Cloud account created in the EU region, AND

  • collect.soda.io
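
To make the OR/AND combination above concrete, the following sketch prints the host and port pairs to passlist for a given region. The function name soda_passlist and the region code us are assumptions made for this example. The domain names and port 443 come from the list above.

```shell
# Print the host:port pairs to passlist for outbound traffic.
# soda_passlist and the "us" region code are hypothetical names for this example.
soda_passlist() {
  case "$1" in
    us|US) echo "cloud.us.soda.io:443" ;;
    *)     echo "cloud.soda.io:443" ;;
  esac
  # collect.soda.io is required regardless of region.
  echo "collect.soda.io:443"
}

soda_passlist eu
```

Each region yields exactly two entries: one regional Soda Cloud domain plus collect.soda.io.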

By default, Soda uses Kubernetes Secrets as part of the Soda Agent deployment. The agent automatically converts any sensitive values you add to a values YAML file, or directly via the CLI, into Kubernetes Secrets. To store data source login credentials as environment variables or in an external secrets manager, or if you use private key authentication with a Soda Agent, refer to Soda Agent Extra.