DQX YAML

Translates JSON Schema constraints into DQX quality checks, emitted as a YAML file (.yaml). The translator requires databricks-labs-dqx==0.8.0.

Example

Input (JSON Schema):

{
  "type": "object",
  "required": ["id", "status"],
  "properties": {
    "id": { "type": "string", "format": "uuid" },
    "status": { "type": "string", "enum": ["active", "inactive"] },
    "age": { "type": "integer", "minimum": 0, "maximum": 150 }
  }
}

Output (DQX YAML):

- criticality: error
  check:
    function: is_not_null
    arguments:
      column: id
- criticality: error
  check:
    function: regex_match
    arguments:
      column: id
      regex: ^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$
- criticality: error
  check:
    function: is_not_null
    arguments:
      column: status
- criticality: error
  check:
    function: is_in_list
    arguments:
      allowed:
        - active
        - inactive
      column: status
- criticality: error
  check:
    function: sql_expression
    arguments:
      expression: "`age` IS NULL OR (`age` >= 0 AND `age` <= 150)"
      msg: age must be between 0 and 150
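
The mapping in this example can be sketched in a few lines of Python. This is an illustration only, not the translator's actual code: the names translate, _check, and UUID_REGEX are hypothetical, and the sketch handles only the four constraints shown here (required, format: uuid, enum, and a minimum/maximum pair) on a flat object schema.

```python
# Regex for the "uuid" format, copied from the example output above.
UUID_REGEX = (
    "^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    "[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def _check(function, **arguments):
    """Build one check entry in the shape shown in the example output."""
    return {
        "criticality": "error",
        "check": {"function": function, "arguments": arguments},
    }


def translate(schema):
    """Hypothetical helper: map a flat object schema to a list of check dicts."""
    required = set(schema.get("required", []))
    checks = []
    for column, prop in schema.get("properties", {}).items():
        if column in required:
            checks.append(_check("is_not_null", column=column))
        if prop.get("format") == "uuid":
            checks.append(_check("regex_match", column=column, regex=UUID_REGEX))
        if "enum" in prop:
            checks.append(
                _check("is_in_list", allowed=list(prop["enum"]), column=column)
            )
        if "minimum" in prop and "maximum" in prop:
            low, high = prop["minimum"], prop["maximum"]
            checks.append(
                _check(
                    "sql_expression",
                    # NULL passes, matching the expression in the example output.
                    expression=(
                        f"`{column}` IS NULL OR "
                        f"(`{column}` >= {low} AND `{column}` <= {high})"
                    ),
                    msg=f"{column} must be between {low} and {high}",
                )
            )
    return checks


# The input schema from the example, written as a Python dict.
schema = {
    "type": "object",
    "required": ["id", "status"],
    "properties": {
        "id": {"type": "string", "format": "uuid"},
        "status": {"type": "string", "enum": ["active", "inactive"]},
        "age": {"type": "integer", "minimum": 0, "maximum": 150},
    },
}

checks = translate(schema)
```

Here checks holds five entries in the same order as the YAML output above. Serializing them to YAML is left out so that the sketch needs only the standard library.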

Supported JSON Schema Features

Type Keywords

  • type
  • enum
  • const

Type Values

  • string
  • integer
  • number
  • boolean
  • array
  • object
  • null

Schema Composition

  • allOf
  • anyOf
  • oneOf
  • not

Object Keywords

  • properties
  • required
  • additionalProperties
  • patternProperties
  • propertyNames
  • minProperties / maxProperties
  • unevaluatedProperties
  • dependentRequired

Array Keywords

  • items
  • prefixItems
  • contains
  • minItems / maxItems
  • uniqueItems
  • unevaluatedItems
  • minContains / maxContains

Numeric Validation

  • minimum / maximum
  • exclusiveMinimum / exclusiveMaximum
  • multipleOf
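
In the example at the top of this page, the age column is not listed in required, and the generated sql_expression accepts NULL before applying the bounds. The Python sketch below mirrors that row-level logic for a single value; passes_range is a hypothetical name used only for illustration, and None stands in for SQL NULL.

```python
def passes_range(value, minimum, maximum):
    """Mirror `col IS NULL OR (col >= min AND col <= max)` for one value."""
    if value is None:
        # An optional column with no value is not a range violation.
        return True
    return minimum <= value <= maximum


# Evaluate the age bounds from the example against a few sample values.
samples = [None, 0, 150, -1, 151]
results = [passes_range(v, 0, 150) for v in samples]
```

Both bounds are inclusive, which matches minimum / maximum; exclusiveMinimum / exclusiveMaximum would need strict comparisons instead.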

String Validation

  • minLength / maxLength
  • pattern

References & Definitions

  • $ref
  • $defs
  • $id
  • $anchor
  • $dynamicRef / $dynamicAnchor

String Formats

  • date
  • date-time
  • time
  • duration
  • uuid
  • uri / uri-reference / uri-template
  • iri / iri-reference
  • email / idn-email
  • hostname / idn-hostname
  • ipv4 / ipv6
  • json-pointer / relative-json-pointer
  • regex
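
For the uuid format, the example at the top of this page emits a regex_match check. A quick way to see what that pattern accepts is to try it with Python's re module. This is a sketch with a hypothetical helper name (looks_like_uuid); DQX runs its checks on Spark, so regex edge cases may behave differently there.

```python
import re

# Pattern copied from the regex_match check in the example output.
UUID_PATTERN = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def looks_like_uuid(value):
    """Return True if value has the hyphenated 8-4-4-4-12 hex layout."""
    return UUID_PATTERN.match(value) is not None


hyphenated = looks_like_uuid("123e4567-e89b-12d3-a456-426614174000")
no_hyphens = looks_like_uuid("123e4567e89b12d3a456426614174000")
free_text = looks_like_uuid("not-a-uuid")
```

The pattern checks only the layout of hex digits and hyphens. It does not validate the UUID version or variant bits.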

Annotations

  • description
  • title
  • default
  • deprecated
  • readOnly / writeOnly
  • examples

Conditional

  • if / then / else
  • dependentSchemas

Content

  • contentEncoding
  • contentMediaType
  • contentSchema
