Standardize how you define your data products in code. A single specification that works across all your technologies.
One definition works across all technologies. Migrate between platforms without rewriting your data products.
Automatically generate code and docs for your pipelines and workflows
Validate data products against their schemas before deployment
opendpi: "1.0.0"
info:
  title: "Customer Analytics"
  version: "2.1.0"
  description: "Aggregated customer behavior metrics"

connections:
  analytics_db:
    protocol: postgresql
    host: analytics.db.example.com
    variables:
      database: analytics
      schema: public

ports:
  daily_metrics:
    description: "Daily aggregated customer metrics"
    connections:
      - connection:
          $ref: "#/connections/analytics_db"
        location: customer_daily_metrics
    schema:
      type: object
      properties:
        customer_id:
          type: string
        date:
          type: string
          format: date
        total_orders:
          type: integer
        revenue:
          type: number
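Port schemas use standard JSON Schema keywords, so records can be checked against them with any JSON Schema validator. Here is a minimal sketch, assuming the document above is saved as opendpi.yaml and using PyYAML and the jsonschema package; the file name, sample record, and validator choice are illustrative, not part of OpenDPI itself:

import yaml
from jsonschema import validate

# Load the OpenDPI document (file name is illustrative)
with open("opendpi.yaml") as f:
    spec = yaml.safe_load(f)

# Pull the JSON Schema attached to the daily_metrics port
port_schema = spec["ports"]["daily_metrics"]["schema"]

# Check a sample record before deployment; raises ValidationError on mismatch
record = {
    "customer_id": "c-1042",
    "date": "2024-05-01",
    "total_orders": 3,
    "revenue": 129.90,
}
validate(instance=record, schema=port_schema)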
Get started in seconds with our powerful command-line tool
Transform your OpenDPI specification into production-ready code for any framework
ports:
  daily_metrics:
    schema:
      type: object
      properties:
        customer_id:
          type: string
        date:
          type: string
          format: date
        total_orders:
          type: integer
        revenue:
          type: number
from pyspark.sql.types import (
    StructType, StructField, StringType, DateType, IntegerType, DoubleType
)

daily_metrics_schema = StructType([
    StructField("customer_id", StringType()),
    StructField("date", DateType()),
    StructField("total_orders", IntegerType()),
    StructField("revenue", DoubleType()),
])
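The generated schema can then be applied directly when loading the data instead of relying on schema inference. A minimal sketch; the SparkSession setup and storage path below are illustrative and not part of the generated output:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("customer-analytics").getOrCreate()

# Read the table with the generated schema rather than inferring types;
# the path is only an example.
daily_metrics = (
    spark.read
    .schema(daily_metrics_schema)
    .parquet("s3://example-bucket/customer_daily_metrics/")
)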