Standardize how you define your data products in code. A single specification that works across all your technologies.
One definition works across all technologies. Migrate between platforms without rewriting your data products.
Automatically generate code and docs for your pipelines and workflows
Validate data products against their schemas before deployment
opendpi: "1.0.0"
info:
  title: "Customer Analytics"
  version: "2.1.0"
  description: "Aggregated customer behavior metrics"

connections:
  analytics_db:
    protocol: postgresql
    host: analytics.db.example.com
    variables:
      database: analytics
      schema: public

ports:
  daily_metrics:
    description: "Daily aggregated customer metrics"
    connections:
      - connection:
          $ref: "#/connections/analytics_db"
        location: customer_daily_metrics
    schema:
      type: object
      properties:
        customer_id:
          type: string
        date:
          type: string
          format: date
        total_orders:
          type: integer
        revenue:
          type: number
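Port schemas use standard JSON Schema keywords, so records can be checked against them with any JSON Schema validator. Here is a minimal sketch, assuming the document above is saved as opendpi.yaml and using PyYAML and the jsonschema package; the file name, sample record, and validator choice are illustrative, not part of OpenDPI itself:

import yaml
from jsonschema import validate

# Load the OpenDPI document (file name is illustrative)
with open("opendpi.yaml") as f:
    spec = yaml.safe_load(f)

# Pull the JSON Schema attached to the daily_metrics port
port_schema = spec["ports"]["daily_metrics"]["schema"]

# Check a sample record before deployment; raises ValidationError on mismatch
record = {
    "customer_id": "c-1042",
    "date": "2024-05-01",
    "total_orders": 3,
    "revenue": 129.90,
}
validate(instance=record, schema=port_schema)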
Get started in seconds with our powerful command-line tool
Transform your OpenDPI specification into production-ready code for any framework
ports:
  daily_metrics:
    schema:
      type: object
      properties:
        customer_id:
          type: string
        date:
          type: string
          format: date
        total_orders:
          type: integer
        revenue:
          type: number
from pyspark.sql.types import (
    StructType, StructField, StringType, DateType, IntegerType, DoubleType
)

daily_metrics_schema = StructType([
    StructField("customer_id", StringType()),
    StructField("date", DateType()),
    StructField("total_orders", IntegerType()),
    StructField("revenue", DoubleType()),
])
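The generated schema can then be applied directly when loading the data instead of relying on schema inference. A minimal sketch; the SparkSession setup and storage path below are illustrative and not part of the generated output:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("customer-analytics").getOrCreate()

# Read the table with the generated schema rather than inferring types;
# the path is only an example.
daily_metrics = (
    spark.read
    .schema(daily_metrics_schema)
    .parquet("s3://example-bucket/customer_daily_metrics/")
)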