Metaphor and Soda Partner to Unify the Modern Data Stack with Trusted Data

We’re excited to announce our partnership with Soda, which brings powerful data quality metrics and insights to Metaphor.

Co-Founder & CEO
December 14, 2021

We’re excited to announce our partnership with Soda, which brings powerful data quality metrics and insights to the Metaphor platform. Soda’s data reliability tools and observability platform enable data teams to discover, prioritize, and collaboratively resolve data quality issues.

Uniting To Solve Common Problems

Metaphor + Soda helps data teams address some of the most common problems and challenges that they encounter every day:

  • Are datasets healthy and up-to-date?
  • Is the dataset unique?
  • Is this data fit for the purpose?
  • Are we delivering trusted data to the business?

Integrating Soda into the Metaphor platform enables data producers and consumers to work confidently together on the quality of the data that matters most to them.

Trust in the Data Products

Data-driven organizations need access to reliable, trustworthy data to make business-critical decisions. However, many organizations lack the ability to detect problems and catch data quality issues, leaving systems vulnerable to severe downstream problems.

Quality and reliability are critically important technical indicators that signal data trustworthiness to data teams. Soda’s data reliability tools facilitate data quality work across the entire data product lifecycle. Soda supports every on-premise and cloud data workload, including data infrastructure, science, analysis, and streaming workloads.

Data teams need a scalable, automated way to detect changes, anomalies, or problems within datasets. Soda’s intelligent features, such as time series anomaly detection and schema evolution monitoring, allow teams to automatically monitor the quality of the data along dimensions such as completeness and consistency.
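
To make the idea of automated monitoring concrete, here is a minimal Python sketch of two such checks: a completeness metric for a column and a simple z-score test on a time series of daily row counts. This is an illustration only and not Soda’s implementation or API; the function names, the threshold of 3 standard deviations, and the sample numbers are all assumptions made for this example.

```python
from statistics import mean, stdev

# Hypothetical helpers for illustration; these are NOT part of Soda or Metaphor.

def completeness(values):
    """Return the fraction of non-missing (non-None) values in a column."""
    if not values:
        return 1.0  # an empty column has nothing missing
    return sum(v is not None for v in values) / len(values)

def is_anomalous(history, latest, threshold=3.0):
    """Flag `latest` if it is more than `threshold` sample standard
    deviations away from the mean of `history` (a basic z-score test)."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return latest != mu
    return abs(latest - mu) / sigma > threshold

# Assumed sample data: row counts observed on previous days.
daily_row_counts = [1000, 1020, 990, 1010, 1005]

print(completeness([1, None, 3, 4]))          # share of populated values
print(is_anomalous(daily_row_counts, 1008))   # a typical day
print(is_anomalous(daily_row_counts, 200))    # a sudden drop in volume
```

A production monitor would typically account for seasonality and trend rather than rely on a fixed z-score, but the principle is the same: compare each new measurement against what the history suggests is normal.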

These automated, out-of-the-box monitoring features require no configuration. However, with an easy-to-use Domain Specific Language, data engineers can define their own critical tests in Soda Cloud Monitors to ensure no bad data is flowing through their data pipelines.

Testing for data quality is often an exercise in relativity. With each extraction, transformation, or load, data can shift, change unintentionally, or become unhealthy, which can negatively impact data products and erode trust in data across the organization.

Soda helps data teams get to the root cause faster by surfacing data issues at the right time and to the right people, with the goal of providing the highest possible data quality. Data and analytics engineers can keep pipelines reliable by testing data every time it is transformed. Using Soda data quality test results, data teams can easily stop production pipelines and quarantine bad data.
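
The pattern of testing after each transformation, quarantining bad rows, and stopping the pipeline can be sketched in a few lines of Python. This is a hedged illustration of the general idea, not Soda’s API: `split_by_checks`, `gate`, `DataQualityError`, the check names, and the `max_bad_ratio` threshold are hypothetical names chosen for this example.

```python
# Hypothetical sketch for illustration; these names are NOT part of Soda or Metaphor.

class DataQualityError(Exception):
    """Raised to stop a pipeline when too many rows fail their checks."""

def split_by_checks(rows, checks):
    """Separate rows that pass every check from rows that fail at least one.
    Failed rows are kept, together with the names of the checks they failed."""
    good, quarantined = [], []
    for row in rows:
        failed = [name for name, check in checks.items() if not check(row)]
        if failed:
            quarantined.append({"row": row, "failed_checks": failed})
        else:
            good.append(row)
    return good, quarantined

def gate(rows, checks, max_bad_ratio=0.1):
    """Run the checks and raise if the share of bad rows exceeds the limit."""
    good, quarantined = split_by_checks(rows, checks)
    if rows and len(quarantined) / len(rows) > max_bad_ratio:
        raise DataQualityError(
            f"{len(quarantined)} of {len(rows)} rows failed data quality checks"
        )
    return good, quarantined

# Assumed example checks and data for an orders table.
checks = {
    "order_id_present": lambda r: r.get("order_id") is not None,
    "amount_non_negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
}
orders = [
    {"order_id": 1, "amount": 25.0},
    {"order_id": 2, "amount": -4.0},
    {"order_id": 3, "amount": 12.5},
]

good, quarantined = gate(orders, checks, max_bad_ratio=0.5)
print(len(good), "rows passed;", len(quarantined), "quarantined")
```

Keeping the failed check names alongside each quarantined row is a deliberate choice in this sketch: it preserves the context someone would need later to investigate why the row was held back.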

Data product managers can validate data and align with data consumers on what’s important, what’s expected, and what to measure so that the data remains fit for purpose.


Data Reliability Delivered with Metaphor + Soda

Integrating Soda with Metaphor puts Soda Monitors in the data catalog to provide data quality insights that help Metaphor users understand and trust their data.

Users can examine the schema of a particular dataset to see the results of the tests that Soda has executed to determine if that data is reliably accurate and fit for its purpose.

A schema table view of 'Public Orders' in Metaphor, showing which Soda data quality tests have passed

Users can access Soda Cloud from the Metaphor catalog to examine the details of each data quality rule as defined for each dataset. They can quickly review the monitor’s test results over time and use additional diagnostic information to analyze the root cause of a data issue.

From Metaphor to Soda in one click so users can drill down to analyze a failed data quality monitor

See the Full Data Picture

This partnership combines Soda’s mission to bring everyone closer to the data with Metaphor’s approach to helping data teams navigate their data landscape.

Metaphor + Soda gives data teams a common language to help users of every skill level easily identify and deliver timely, accurate, and complete data across the data product lifecycle.

We are excited to partner with the Soda Team to further simplify and scale the journey to data value.

A Data Catalog That Works For Your Data: In Action

This video demonstrates Metaphor + Soda in action. In this brief overview, Milan Lukac, a Data Engineer at Soda, introduces the integration through the eyes of a data analyst who has been tasked with creating a report for sales orders.

Using Metaphor, the analyst can quickly search the data catalog and discover and examine the metadata related to the ‘Orders’ dataset. Using Metaphor’s Knowledge Cards, the analyst knows:

  • who ingested the data into the table
  • if there have been any data quality incidents
  • where this dataset is used across the business
  • how business-critical the data is to the organization
  • the business context of the data

Using the ‘Schema’ tab, the analyst can examine the metadata and determine if he can use it for the sales orders report he is preparing. In order to check the quality of the data, based on the completeness and consistency of the columns, the analyst uses the ‘Soda.io’ tab to access the configured data quality tests.

The analyst sees an overview of all of the scheduled Monitors, including the type of data quality check, the column it relates to, the owner of the Monitor, and whether the data quality test has passed or failed.

The Metaphor + Soda integration helps accelerate root cause analysis. The analyst and his team can identify data issues, understand how data has been evolving over time, and quickly and efficiently resolve issues and prevent future occurrences.  

Data quality is a team sport, and everyone who has a stake in the data needs to be able to understand it, trust it, and maintain it. The Metaphor + Soda integration gives data teams one central platform that facilitates data transparency and provides the end-to-end observability they need to create trusted data products.

What’s Coming Next?

The Metaphor and Soda teams are already working on the next phase of our integration that will help increase trust and deliver more value from your data. Our next step is to improve how users identify data issues by adding traffic light indicators on lineage graphs.


Let’s Operationalize Your Data Catalog

To deliver data management capabilities for the modern data stack, organizations need two things: visibility into high-quality, active, operational technical metadata, and the ability to tap into social indicators of how people talk about data in the language of the business. An active, trusted data catalog ensures that teams can quickly take action and easily resolve issues.

With Metaphor + Soda, data teams can explore, discover, and optimize trusted data.


Get Started

If you’d like to see Metaphor + Soda in action and learn more about how we can help your organization better manage the data product lifecycle, get in touch with a member of our team.

About Metaphor

The Metaphor Metadata Platform represents the next evolution of the Data Catalog: it combines best-in-class Technical Metadata (learnt from building DataHub at LinkedIn) with Behavioral and Social Metadata. It supercharges an organization’s ability to democratize data with state-of-the-art capabilities for Data Governance, Data Literacy, and Data Enablement, and provides an extremely intuitive user interface that turns even the most non-technical user into a fan of the catalog. See Metaphor in action today!