Translators
DQX YAML
Translates JSON Schema constraints to DQX YAML quality checks (.yaml). The translator requires databricks-labs-dqx==0.8.0.
Example
Input (JSON Schema):
{
  "type": "object",
  "required": ["id", "status"],
  "properties": {
    "id": { "type": "string", "format": "uuid" },
    "status": { "type": "string", "enum": ["active", "inactive"] },
    "age": { "type": "integer", "minimum": 0, "maximum": 150 }
  }
}
Output (DQX YAML):
- criticality: error
  check:
    function: is_not_null
    arguments:
      column: id
- criticality: error
  check:
    function: regex_match
    arguments:
      column: id
      regex: ^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$
- criticality: error
  check:
    function: is_not_null
    arguments:
      column: status
- criticality: error
  check:
    function: is_in_list
    arguments:
      allowed:
      - active
      - inactive
      column: status
- criticality: error
  check:
    function: sql_expression
    arguments:
      expression: "`age` IS NULL OR (`age` >= 0 AND `age` <= 150)"
      msg: age must be between 0 and 150
Supported JSON Schema Features
Type Keywords
- type
- enum
- const
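As an illustration of how an enum constraint relates to the is_in_list check in the example output, here is a minimal sketch. The helper name enum_check and its signature are hypothetical and not part of the translator; only the shape of the resulting check is taken from the example.

```python
def enum_check(column, schema, criticality="error"):
    """Hypothetical helper: build an is_in_list check for a property with `enum`.

    Returns None when the property schema has no `enum` keyword.
    """
    if "enum" not in schema:
        return None
    return {
        "criticality": criticality,
        "check": {
            "function": "is_in_list",
            "arguments": {
                "allowed": list(schema["enum"]),
                "column": column,
            },
        },
    }


# The `status` property from the example input.
status_check = enum_check("status", {"type": "string", "enum": ["active", "inactive"]})
# A property without `enum` produces no check in this sketch.
no_check = enum_check("id", {"type": "string"})
```

The dictionary mirrors one list entry of the DQX YAML output, so serializing a list of such dictionaries with a YAML library would give a file of the same shape.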
Type Values
- string
- integer
- number
- boolean
- array
- object
- null
Schema Composition
- allOf
- anyOf
- oneOf
- not
Object Keywords
- properties
- required
- additionalProperties
- patternProperties
- propertyNames
- minProperties / maxProperties
- unevaluatedProperties
- dependentRequired
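In the example output, each name listed under required becomes an is_not_null check. A minimal sketch of that mapping, assuming a flat object schema, could look like the following. The function required_checks is a hypothetical name used for illustration, not the translator's API.

```python
def required_checks(schema, criticality="error"):
    """Hypothetical helper: one is_not_null check per entry in `required`."""
    return [
        {
            "criticality": criticality,
            "check": {
                "function": "is_not_null",
                "arguments": {"column": name},
            },
        }
        for name in schema.get("required", [])
    ]


# Top-level keywords of the example input; `properties` is omitted for brevity.
checks = required_checks({"type": "object", "required": ["id", "status"]})
columns = [c["check"]["arguments"]["column"] for c in checks]
```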
Array Keywords
- items
- prefixItems
- contains
- minItems / maxItems
- uniqueItems
- unevaluatedItems
- maxContains / minContains
Numeric Validation
- minimum / maximum
- exclusiveMinimum / exclusiveMaximum
- multipleOf
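The example output expresses minimum and maximum as a single sql_expression check that lets NULL values pass, so that nullability is left to required. The sketch below reproduces that expression for inclusive bounds only. The helper range_check and the wording of its one-sided messages are assumptions; exclusive bounds and multipleOf are not covered.

```python
def range_check(column, schema, criticality="error"):
    """Hypothetical helper: build a NULL-tolerant range check from minimum/maximum."""
    lo, hi = schema.get("minimum"), schema.get("maximum")
    bounds = []
    if lo is not None:
        bounds.append(f"`{column}` >= {lo}")
    if hi is not None:
        bounds.append(f"`{column}` <= {hi}")
    if not bounds:
        return None

    if lo is not None and hi is not None:
        msg = f"{column} must be between {lo} and {hi}"
    elif lo is not None:
        msg = f"{column} must be at least {lo}"  # assumed wording
    else:
        msg = f"{column} must be at most {hi}"  # assumed wording

    return {
        "criticality": criticality,
        "check": {
            "function": "sql_expression",
            "arguments": {
                # NULL passes here; a separate is_not_null check enforces presence.
                "expression": f"`{column}` IS NULL OR ({' AND '.join(bounds)})",
                "msg": msg,
            },
        },
    }


# The `age` property from the example input.
age_check = range_check("age", {"type": "integer", "minimum": 0, "maximum": 150})
```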
String Validation
- minLength / maxLength
- pattern
References & Definitions
- $ref
- $defs
- $id
- $anchor
- $dynamicRef / $dynamicAnchor
String Formats
- date
- date-time
- time
- duration
- uuid
- uri / uri-reference / uri-template
- iri / iri-reference
- email / idn-email
- hostname / idn-hostname
- ipv4 / ipv6
- json-pointer / relative-json-pointer
- regex
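For the uuid format, the example output uses a regex_match check. The sketch below shows one way a format-to-regex lookup might be organized. The table FORMAT_REGEXES and the helper format_check are hypothetical, and only the uuid pattern is taken from the example output; patterns for the other formats are not given in this document and are left out.

```python
import re
import uuid

# Pattern copied from the example output above.
UUID_REGEX = (
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)

# Hypothetical lookup table; only `uuid` is filled in here.
FORMAT_REGEXES = {"uuid": UUID_REGEX}


def format_check(column, schema, criticality="error"):
    """Hypothetical helper: build a regex_match check for a known string format."""
    regex = FORMAT_REGEXES.get(schema.get("format"))
    if regex is None:
        return None
    return {
        "criticality": criticality,
        "check": {
            "function": "regex_match",
            "arguments": {"column": column, "regex": regex},
        },
    }


# The `id` property from the example input.
id_check = format_check("id", {"type": "string", "format": "uuid"})

# Sanity check of the pattern itself using Python's regex engine.
matches_valid = re.match(UUID_REGEX, str(uuid.uuid4())) is not None
matches_invalid = re.match(UUID_REGEX, "not-a-uuid") is not None
```

The pattern check here runs in Python's re module, whereas the generated check would be evaluated by Spark, so this only confirms the pattern's intent, not engine-specific behavior.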
Annotations
- description
- title
- default
- deprecated
- readOnly / writeOnly
- examples
Conditional
- if / then / else
- dependentSchemas
Content
- contentEncoding
- contentMediaType
- contentSchema