Taming the Beast: Comparing Jsonnet, Dhall, Cue

I'm currently retreating, having wrestled with a huge beast on the other side of my screen. I keep peeking at it, hoping it goes away.

$ wc -l workflow.json
9118

The beast also has multiple cousins residing in their corner of the repository. Each day, I dread picking up tasks that would have me approach them. However, I've decided to confront this beast head-on and attempt to tame it.

Throughout the forest, there are whispers of configuration languages designed to tame these beasts. Today, we'll explore three such languages: Jsonnet, Dhall, and Cue.

The Workflow in detail

Dealing with big balls of JSON (while sometimes not fun) shouldn't be intimidating, so why do I feel shaken? The file describes an e-commerce workflow for a SaaS tool, covering the dance needed to take an order from creation to fulfillment, including cancellations, returns, and other side-steps. And to keep this dance from tumbling over when more complexities arise, the JSON structure encodes business logic and event-based control flow for editing in a visual interface.

Control flow in JSON? Imagine your favorite continuous integration workflow files (like GitHub Actions or GitLab CI YAML) but with the flexibility to GOTO any step and trigger steps externally via an API. In fairness, I'm making it sound worse than it is, but it's still an experience to work with.

Before diving into the specifics of each language, let's take a glance at the workflow definition. I've simplified it, but it remains a realistic representation.

A workflow describes the process for a core entity, like an ORDER. It contains Rulesets that execute Rules, tied to (possibly) different entities. Each entity has a status (e.g., CREATED, BOOKED, SHIPPED), and most Rulesets aim to progress the order through these statuses.

Rules are self-contained actions. They change the entity's state, trigger events, or update data conditionally. Think of them as named function calls with properties.

Here's a brief example of a workflow for an ORDER.

{
  "id": null,
  "name": "ORDER::HD",
  "description": "A tiny Workflow",
  // Root entity of the workflow, however rulesets can reference other entities
  "entityType": "ORDER",
  "entitySubType": "HD",
  "version": "1.10001",
  "createdOn": "2023-05-19T14:34:17.231+00:00",
  // A workflow defines the valid statuses for each entity
  "statuses": [
    { "name": "CREATED", "entityType": "ORDER", "category": "BOOKING" },
    { "name": "PACKING", "entityType": "ORDER", "category": "DELIVERY" },
    { "name": "SHIPPED", "entityType": "ORDER", "category": "DELIVERY" },
    { "name": "COMPLETE", "entityType": "ORDER", "category": "DONE" }
  ],
  // A Workflow consists of Rulesets, which can be triggered by events
  "rulesets": [
    {
      "name": "CREATE",
      "description": "CREATE & Validate order's delivery address",
      "type": "ORDER",
      "eventType": "NORMAL",
      "rules": [
        {
          "name": "wtf.pv.common.ChangeState",
          "props": { "status": "PENDING" }
        },
        {
          "name": "wtf.pv.common.SendEvent",
          "props": { "eventName": "PackOrder" }
        }
      ],
      "triggers": [
        { "status": "CREATED" }
      ],
      "userActions": []
    }
  ]
}

Now this might not look too bad, even trivial to handle. But imagine scrolling through one with 84 different statuses in 253 rulesets, each containing 2-10 different rules that trigger events in completely different parts of the file.

Since the SaaS tool defines the structure, we cannot make the data model simpler. Can we make it more maintainable in other ways?

Certain fields have constraints: statuses must exist, and rules triggering events must reference valid Rulesets. We can validate this and prevent errors. The workflow can also be divided into distinct stages like "creating and validating the order", "shipping the order", and "handling returns", so we can split the file into smaller chunks representing those stages.
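
Both constraints can be checked mechanically on the plain JSON before reaching for a new language. Here is a minimal Python sketch of that idea. It assumes the rule names end in ChangeState and SendEvent as in the example above, and find_problems is a name I made up for illustration.

```python
import json

# Trimmed copy of the example workflow, inlined so the sketch is self-contained.
WORKFLOW = json.loads("""
{
  "statuses": [
    {"name": "CREATED", "entityType": "ORDER", "category": "BOOKING"},
    {"name": "PACKING", "entityType": "ORDER", "category": "DELIVERY"}
  ],
  "rulesets": [
    {
      "name": "CREATE",
      "rules": [
        {"name": "wtf.pv.common.ChangeState", "props": {"status": "PENDING"}},
        {"name": "wtf.pv.common.SendEvent", "props": {"eventName": "PackOrder"}}
      ],
      "triggers": [{"status": "CREATED"}]
    }
  ]
}
""")


def find_problems(workflow):
    """Return a list of human-readable problems (hypothetical helper)."""
    known_statuses = {s["name"] for s in workflow["statuses"]}
    known_rulesets = {r["name"] for r in workflow["rulesets"]}
    problems = []
    for ruleset in workflow["rulesets"]:
        where = ruleset["name"]
        for trigger in ruleset.get("triggers", []):
            if trigger["status"] not in known_statuses:
                problems.append(f"{where}: unknown trigger status {trigger['status']}")
        for rule in ruleset.get("rules", []):
            props = rule.get("props", {})
            # Assumption: rule names end in .ChangeState / .SendEvent as in the example.
            if rule["name"].endswith(".ChangeState") and props.get("status") not in known_statuses:
                problems.append(f"{where}: unknown status {props.get('status')}")
            if rule["name"].endswith(".SendEvent") and props.get("eventName") not in known_rulesets:
                problems.append(f"{where}: unknown event {props.get('eventName')}")
    return problems


for problem in find_problems(WORKFLOW):
    print(problem)
```

Run against the trimmed example, this flags both PENDING (not among the declared statuses) and PackOrder (no Ruleset by that name), the kind of slip that is easy to miss in 9,000 lines.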

But I'm the kind of person that gets easily excited about clever solutions. And the solution needs to be one that colleagues, both present and future, will want to keep.

Configuration Languages

Our goal is two-fold: to tame the complexity of this workflow file and explore different options for conquering the configuration challenge in general.

Configuration is a vast realm; squint hard enough and you'll find many inhabitants there: environment variables set by any number of sources, dependencies stored in package.json/go.mod/Cargo.toml/mix.exs and their lock files, feature flags and their cousins A/B tests, and authorization policies defined by rows in a database or Rego code. Of course, we cannot forget the heaps of YAML fueling Kubernetes, Infrastructure as Code, and the dance of CI/CD workflows.

They all sit somewhere on a spectrum, from serialized data structures at one end to descriptions of application behavior controlled from outside the application at the other.

Jsonnet, Dhall, and Cue all want to make your life a bit easier by bringing more structure to the YAML and JSON files of that equation: reducing duplication, making it possible to extract different parts, and making it harder for mistakes to occur. They each bring something different though, and arrive at their solutions to these problems in very different ways.

One way they set themselves apart from general-purpose programming languages is by being side-effect free. This means that they restrict themselves in different ways to avoid hanging, crashing, leaking secrets, or otherwise compromising the host system. There might be a way to import a remote module, but there's not going to be an http.get(url) function.

Most importantly, they allow those ephemeral whispers that guide and enlighten, the ones we call comments. Yes, comments!

Jsonnet

Jsonnet (pronounced jay-sonnet) was released to the world in 2014 after having been born from the creative minds inside Google as a 20% project. It is closely based on an earlier internal language named GCL (Google Configuration Language).

From my understanding, the main difference is that Jsonnet shaves off some of the sharp edges. For example, in GCL you can refer to the attributes of a parent object with an easy (and easy to get wrong) this.up.up.up. Jsonnet also simplifies the scoping rules.

If you are curious about GCL (or BCL, for their internal Kubernetes predecessor Borg), I can't really help, but it is described in a paper from TUM called a study in improving the understanding of GCL programs.

Jsonnet extends JSON syntax, adding variables, conditionals, and imports. The syntax for these additions should be familiar to developers versed in mainstream languages of recent vintage.

But enough backstory! Let's delve right into it and craft ourselves some rules.

local namespace = 'wtf.pv';
local Rule(name, props={}) = {
  name: namespace + '.' + name,
  props: props,
};

// Last object in the file will be evaluated
{
  ChangeState: Rule('changeState', {
    status: "PENDING"
  }),
}

$ jsonnet rule.jsonnet
{
  "ChangeState": {
    "name": "wtf.pv.changeState",
    "props": {
      "status": "PENDING"
    }
  }
}

To build strings we can use the + operator, or std.format(str, vals), which uses the same % formatting rules as found in other languages. I would have liked to see "Hello ${var}"-style string interpolation as well, but it's not a big loss. It's a dynamic language, and both variables and functions are declared with the local keyword.

If we wanted to create a more specific version of the Rule function to make it easier to add a ChangeState rule, we could do it by merging the output with an object of our own. Jsonnet has multiple ways of inheriting or merging objects and lists that we can use.

// Import the Rule definition we wrote before
local Rule = import './rule.libsonnet';

// The '+' operator merges objects and lists as well as strings
local ChangeState(status) = Rule('ChangeState') + {
  // field separators +: and +:: allow us to merge or inherit nested objects
  props+: { status: status },
};

In this example, we also import the previously defined Rule from another file. This is what's going to allow us to split the workflow into smaller parts later.
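
The difference between a plain field and +: in that merge is easy to miss, so here is a rough Python analogy using dictionaries. It only models one level of nesting, and the retries prop is invented purely to have something to keep or lose.

```python
rule = {"name": "wtf.pv.ChangeState", "props": {"retries": "3"}}
extra = {"props": {"status": "PENDING"}}

# Like `rule + { props: {...} }`: the right-hand props replaces the left one.
replaced = {**rule, **extra}

# Like `rule + { props+: {...} }`: props is merged one level down instead.
merged = {**rule, "props": {**rule["props"], **extra["props"]}}

print(replaced["props"])
print(merged["props"])
```

In the first case the invented retries prop is lost; in the second it survives next to status, which is why ChangeState uses props+: rather than props:.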

For iteration, we can use list comprehensions. To show this, let's define a function for Rulesets that iterates through its Rules.

local RuleSet(name, description, type='ORDER', rules=[], statuses=[], userActions=[]) = {
  name: name,
  description: description,
  type: type,
  rules: rules,
  // Use a list comprehension to format our statuses into nested objects
  triggers: [{ status: status } for status in statuses],
  userActions: userActions,
};

List comprehensions can also be used for filtering by adding an if, similar to how it works in Python. While they should be familiar to a broad audience today, I hold a fondness for the functional trinity of map()/filter()/reduce() instead. Curiously, all three languages we are looking at today prefer list comprehensions. I'm wondering if this has something to do with their goals of constraining computation to be side-effect free. Probably not, as you'd find std.flatMap in the Jsonnet standard library, though it's less ergonomic to use.
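
Since the comparison with Python is already on the table, here is the trigger-building step written both ways in Python. The status list and the SHIPPED filter are made up for illustration.

```python
statuses = ["CREATED", "PACKING", "SHIPPED"]

# Comprehension with a filter, the shape Jsonnet also uses:
# [{ status: s } for s in statuses if s != 'SHIPPED']
triggers = [{"status": s} for s in statuses if s != "SHIPPED"]

# The same result with filter() and map()
triggers_functional = list(
    map(lambda s: {"status": s}, filter(lambda s: s != "SHIPPED", statuses))
)

print(triggers)
print(triggers == triggers_functional)
```

Both produce the same list; the comprehension simply keeps the filter and the transformation in one expression.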

Now, let us employ these tricks to revisit the main workflow file.

local rules = import './lib.libsonnet';
local statuses = import './statuses.libsonnet';
local moreRules = import './rules.libsonnet';

{
  "id": null,
  // Reduce duplication by using fields from the object
  name: self.entityType + '::' + self.entitySubtype,
  description: 'Tiny Workflow for testing',
  entityType: 'ORDER',
  entitySubtype: 'HD',
  statuses: [
    // The standard library includes most of what you'd expect.
    // Two for-clauses keep the result a flat list rather than a list of lists.
    status
    for entities in std.objectValues(statuses)
    for status in std.objectValues(entities)
  ],
  rulesets: [
    rules.RuleSet(
      name='CREATE',
      description="Create an Order",
      rules=[
        rules.ChangeState(statuses.order.pending.name),
        rules.Rule(name='SendEvent', props={ eventName: 'ShipOrder' }),
      ],
      statuses=[statuses.order.created.name]
    ),
  // Concatenate multiple arrays
  ] + [moreRules]
}

// lib.libsonnet
local namespace = 'wtf.pv';
local Rule(name, props={}) = {
  name: namespace + '.' + name,
  props: props,
};

// Calling the Rule function, but merging objects instead of passing a props argument
local ChangeState(status) = Rule('ChangeState') + {
  props+: { status: status },
};

local RuleSet(name, description, type='ORDER', rules=[], statuses=[], userActions=[]) = {
  name: name,
  description: description,
  type: type,
  rules: rules,
  triggers: [{ status: status } for status in statuses],
  userActions: userActions,
};

// Whatever is last in the file will be evaluated, or in this case exported.
{
  RuleSet: RuleSet,
  Rule: Rule,
  ChangeState: ChangeState,
}

Certain fields, like statuses, can be generated from other data, reducing the likelihood of a typo. Due to the dynamic nature of the language we can't constrain the values further though. While code review would likely uncover any typos trying to sneak in, there's nothing preventing us from calling rules.ChangeState("I'm not a status"). At points, this made my heart yearn for an Enum or two.
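
To make that yearning concrete, this is roughly what an enum buys you, sketched in Python rather than Jsonnet (which has no such construct). The Status members and the change_state helper are illustrative names, not part of the workflow tool.

```python
from enum import Enum


class Status(Enum):
    CREATED = "CREATED"
    PENDING = "PENDING"
    SHIPPED = "SHIPPED"


def change_state(status):
    """Build a ChangeState rule, rejecting anything that is not a Status."""
    if not isinstance(status, Status):
        raise TypeError(f"not a valid status: {status!r}")
    return {"name": "wtf.pv.ChangeState", "props": {"status": status.value}}


print(change_state(Status.PENDING))

try:
    change_state("I'm not a status")
except TypeError as err:
    print("rejected:", err)
```

The bad call fails at the point where the rule is built, instead of surfacing later as a mysterious status in the generated JSON.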

Similarly, it would be nice to validate the SendEvent call to ShipOrder. This wouldn't be impossible if the Rulesets were an object like we've done with statuses. But we also want to split Rulesets across files, and combining both would become a bit tricky.

Adding a type system to handle that, even a gradual one, is not the kind of thing you can bolt on easily though, and it would also increase the complexity of the language a fair amount. I would be very surprised if that was added at any point in the future. My biggest want would be some better way to validate the data. Functions get you a lot of mileage, but because of the lack of types, it is not that hard to call functions with the wrong arguments: of the wrong type, in the wrong order, or just containing a typo.

It's a dilemma you might solve by running JSON Schema validation on the output (or OPA/Rego!), but then we have one more tool to learn.

Looking at the result, we've been able to condense the main ruleset section by a fair amount. The most substantial gains arise from modularizing and importing fragments from different files. Using functions and their default arguments comes in as a runner-up. When it comes to referencing global attributes like namespace, Jsonnet represents a step above templating languages like Handlebars and similar.

For those wanting to learn more, I found the official website pretty good. Start by going through the tutorial and then read about some of the design choices. Originally Jsonnet was implemented in C++, but there are now a couple of implementations in other languages like Go (go-jsonnet) and Rust (jrsonnet) that are both more performant and easier to embed into an application.

The ecosystem around it is also growing. Another big configuration pain point for a lot of people today is their Kubernetes manifests and piles of YAML for Helm, Kustomize, or similar. As an alternative, here's how you could use Grafana Tanka to define a k8s deployment in Jsonnet.

local k = import "k.libsonnet";

{
    grafana: k.apps.v1.deployment.new(
        name="grafana",
        replicas=1,
        containers=[k.core.v1.container.new(
            name="grafana",
            image="grafana/grafana",
        )]
    )
}

That looks rather nice compared to the YAML that's usually required. Most of the developer tooling you expect from a modern language is also there for Jsonnet. We get syntax highlighting, Language Server support, decent documentation (which covers quite a bit more than I described above), and, when something goes wrong, decent errors as well. Here's an example of me missing an import:

variable is not defined: anotherRule
    jsonnet/main.jsonnet:50:8-20: variable <anotherRule>
                                : field <rulesets> manifestification

Being a superset of JSON is a big advantage for the language, as it makes defining objects and operating on them quite natural and easy. If you want to output a format other than JSON, that's also possible as long as the data can be represented as JSON, and the standard library has a few functions to help with that (std.manifestIni, std.manifestYamlDoc).

I would be happy to recommend Jsonnet to a curious developer looking to tame their configuration beast.

Yet, my mind drifts back to the realm of types, oh, the possibilities! What would the world of configuration look like if every variable element was adorned with its own type?

Dhall

Dhall (rhymes with tail-call and hall) originates from the Haskell ecosystem, which has a clear influence on its syntax (don't run away yet!). From what I can find it was announced in 2016, making it a little bit newer than Jsonnet.

Let's dive into an example right away, and define types for order statuses:

-- Define a union type for the different statuses, use uppercase for serialization
let Status = < CREATED | PACKING | SHIPPING | COMPLETED >

-- Define a mapping for serializing them to text
let StatusMapping =
      { CREATED = "CREATED"
      , PACKING = "PACKING"
      , SHIPPING = "SHIPPING"
      , COMPLETED = "COMPLETED"
      }

-- Define a function that takes a Status and returns the string version of it
let Status/Show = \(status : Status) -> merge StatusMapping status

-- Return a Record with the type itself and the serialization function
in  { Status, Status/Show }

Here, we see the let X in Y syntax style familiar from Haskell. If you are annoyed with the leading commas, this is how the formatter likes it and that's as far as I care. I'm not the biggest fan of the anonymous function syntax \(arg : Type) -> either, but it's all fine. Interestingly, you can also write it with a Unicode lambda instead of the backslash, λ(arg : Type) ->, but this is an opportunity I'm going to pass on until I get a full Unicode keyboard. The union type syntax should be familiar to TypeScripters or Rustlers, even if we have some extra angle brackets lying around.

There's no built-in way to easily stringify a value à la show/display/toString, from what I saw traversing the documentation. This becomes a bit annoying later, but for now, that's why we defined the function Status/Show above.

We are not done typing yet! We also need the rest of the entities covered. Let us take a look at how we'd create a type for the Rules first.

-- A relative import
let enum = ./enum.dhall
let namespace = "wtf.pv"

-- Props has dynamic keys/values, type as a list of key-value pairs
let Props : Type = List { mapKey : Text, mapValue : Text }

-- We can set default values using a record with a 'default' key,
-- note that we need to type the empty list as well
let Rule = { Type = { name : Text, props : Props }
           , default.props = [] : Props }

-- Util function to create a Rule record, serializing the Status to text
let ChangeState =
      \(status : enum.Status) ->
        Rule::{
        , name = "${namespace}.ChangeState"
        , props = toMap { status = enum.Status/Show status }
        }

in { Rule, ChangeState }

Now that we have a prim and proper type system we can't have dynamic objects with arbitrary keys/values anymore, which affects how we represent the props objects. There's no built-in dict/map type, but if we specify it as a list of key-value pairs [{mapKey = "a", mapValue = "1"}, {mapKey = "b", mapValue = "2"}] Dhall will serialize it to the kind of JSON value we expect {"a": "1", "b": "2"}.
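
The conversion itself is simple to picture. Here is a rough Python model of what happens to such a list on the way to JSON. The function name is made up, and the real tooling does this for you.

```python
def assoc_list_to_object(pairs):
    """Turn [{mapKey, mapValue}, ...] into a plain JSON-style object."""
    return {pair["mapKey"]: pair["mapValue"] for pair in pairs}


props = [
    {"mapKey": "a", "mapValue": "1"},
    {"mapKey": "b", "mapValue": "2"},
]
print(assoc_list_to_object(props))
```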

We'd also like to have a default value for the props argument, so we define Rule as a record with Type as one key and default as the other. Dhall recognizes this shape by convention when we later build a value with the :: operator. Since default.props is just another record field, I understand why we need to provide a type for it again, but I would love to have that be inferred instead.

If you remember, we added Status/Show above to stringify statuses, which comes in handy when using a status as a value in the props record. We could type mapValue : < Text | Status > but then we have to deal with each alternative explicitly (mapValue.Status) or use the merge function, which becomes annoying. Since we have the magic default key in records, couldn't we also have a toText key for the conversion? That would possibly not be generic enough, and the "proper" solution would involve generics of some kind (traits, typeclasses, interfaces, or the like).

The ChangeState function only requires a single argument to return a Rule. Much like some functional languages, Dhall does not support functions with multiple arguments; that has to be solved using currying or by passing a record instead. Given the lineage of the language, this is not very surprising. However, I would expect it to be surprising for developers used to more mainstream languages.

-- Multiple arguments via currying
let curriedExample = \(x : Bool) -> \(y : Bool) -> [ x, y ]
-- Multiple arguments via records
let recordExample = \(args : { x : Bool, y : Bool }) -> [ args.x, args.y ]

Importing files is easy in Dhall, whether from the filesystem or over the internet. After having been downloaded once, the dependencies are cached locally and the network is no longer required. This is most commonly used for the Prelude (think of it as an expanded standard library). It makes it easy to share code and is a pretty good solution for a small language that doesn't want to maintain infrastructure for a package manager.

But with the language having a focus on being side-effect free and isolated as much as possible, you might think running arbitrary code from the internet departs from that. Well, running dhall freeze will annotate each import with a SHA-256 hash so that you can be sure you're running the code you expect.

-- File import
let enum = ./enum.dhall
-- Url import
let Prelude = https://prelude.dhall-lang.org/v15.0.0/package.dhall
-- Url import with frozen sha
let Prelude =
      https://prelude.dhall-lang.org/v15.0.0/package.dhall
      sha256:6b90326dc39ab738d7ed87b970ba675c496bed0194071b332840a87261649dcd
-- Url import of one function
let Natural/sum : List Natural -> Natural =
    https://prelude.dhall-lang.org/Natural/sum

Now we've walked through most of what we need to define the main workflow again.

let namespace = "wtf.pv"

let schema = ./schema.dhall
let enum = ./enum.dhall
let statuses = ./statuses.dhall
let moreRules = ./moreRules.dhall

let entityType = enum.EntityType.ORDER
let entitySubtype = "HD"

      -- Since id is nullable, it is an Optional Text, here set to None
in    { id = None Text
      -- String interpolation, yay!
      , name = "${enum.EntityType/Show entityType}::${entitySubtype}"
      , entityType
      , entitySubtype
      , description = "Tiny Dhall Workflow"
      , statuses
      , rulesets =
        [ schema.Ruleset::{
          , name = "CREATE"
          , description = "CREATE & Validate order's delivery address"
          , triggers = [ { status = enum.Status.CREATED } ]
          , rules =
            [ schema.ChangeState enum.Status.SHIPPING
            , schema.Rule::{
              , name = "${namespace}.SendEvent"
              , props = toMap { eventName = "ShipOrder" }
              }
            ]
          }
        ] # moreRules
      }
    : schema.Workflow

-- schema.dhall
-- A relative import
let enum = ./enum.dhall
let namespace = "wtf.pv"

-- Props has dynamic keys/values, type as a list of key-value pairs
let Props : Type = List { mapKey : Text, mapValue : Text }

-- We can set default values using a record with a 'default' key,
-- note that we need to type the empty list as well
let Rule = { Type = { name : Text, props : Props }
           , default.props = [] : Props }

-- Util function to create a Rule record, serializing the Status to text
let ChangeState =
      \(status : enum.Status) ->
        Rule::{
        , name = "${namespace}.ChangeState"
        , props = toMap { status = enum.Status/Show status }
        }

let Ruleset =
      { Type =
          { name : Text
          , description : Text
          , type : enum.EntityType
          , rules : List Rule.Type
          , triggers : List { status : enum.Status }
          }
      , default =
        { type = enum.EntityType.ORDER
        , rules = [] : List Rule.Type
        , triggers = [] : List { status : enum.Status }
        }
      }

let Workflow
    : Type
    -- id is nullable, so wrap it in an optional
    = { id : Optional Text
      , name : Text
      , description : Text
      , entityType : enum.EntityType
      , entitySubtype : Text
      , rulesets : List Ruleset.Type
      , statuses :
          List
            { name : enum.Status
            , entityType : enum.EntityType
            , category : Text
            }
      }

in  { Workflow, Ruleset, Rule, ChangeState }

Pretty succinct, and of similar length to our Jsonnet version as well. For someone who learned a bit of Haskell in university, and has seen a line here or there in the years since, the learning curve was okay but clearly higher than Jsonnet (and I'm pretty sure I've made some errors). We do get the added confidence of types, and even though defining them does take some time it's not much more than the Jsonnet functions. The safety edge is clearly on the Dhall side.

Dhall also supports multiple output formats, each by calling a different executable (dhall-to-json, dhall-to-nix) which I prefer over having to define the output via function call in Jsonnet.

When it comes to the ecosystem and general development experience, again decent but not exciting. After some searching for syntax highlighting for Sublime Text I found it along with a language server. The LSP formatter kept stripping comments by default which was a bit annoying.

I also found the information on the main website much easier to follow than the documentation website. Errors are readable, and the output is, in general, nice as well. Here's an example of me forgetting to include the toMap function on props:

Error: Expression doesn't match annotation

{ props : - List …
          + { … : … } (a record type)
, …
}

27│               schema.Rule::{
28│               , name = "${account}.commonv2.ChangeStateGQL"
29│               , props = { status = "PENDING" }
30│               }

dhall/main.dhall:27:15

Finally, we come to the type system. It feels like a big gain, but at the same time I kept finding edges to be annoyed by: the key-value lists, and very minor things like the < > brackets around the union type definitions, the lack of an easier way to print union types to text, and the backslash in a function definition.

I spent some time searching for a way to derive the types from the data, for example, to check that the event name we pass to SendEvent is part of the Ruleset names. In the end, through no real fault of Dhall's, I kept wishing for a type system more like TypeScript where the line between value and type is a bit blurrier. Yes, I should probably reframe my problem in a way that fits Dhall better, but oh it would be so nice to have a list of all the statuses and create a sum type out of them. If this is possible and I just missed it please let me know!

Overall, I found Dhall quite interesting and if you are coming from a type-heavy language I think you're going to have a good time, but it didn't spark joy for me in my admittedly (very) short time with it.

But, types as values? What would that give us?

Cue

Cue, similar to Jsonnet, was created by an ex-Googler as a reaction to GCL. However, while Jsonnet focuses on improving GCL, Cue takes an entirely different approach based on logic programming and constraints.

In Cue, both the data and its schema, or types, are treated as data. The best guide to the language I found was "CUE is an exciting configuration language", and if my explanations make any kind of sense I give all the praise to it.

Let's define some statuses again:

// Define the possible values for a Status
#Status: "CREATED" | "PACKING" | "SHIPPING" | "COMPLETE"

// define that status must conform to the #Status definition
status: #Status
// define status to be "CREATED"
status: "CREATED"

Running Cue would output { "status": "CREATED" }. And here we see the first thing that makes Cue tick: we can keep adding definitions that constrain the values in different ways. For example, we can validate the value of numbers and strings by adding more constraints to them:

// Port must be an integer in the range between 1024 and 32000
#Port: int & >=1024 & <32000
// Phone is a string with at least one digit
#Phone: string & =~ "[0-9]+"
// numbers is a list of objects with a phone and optional name
numbers: [...{phone: #Phone, name?: string}]

This feature becomes powerful when validating the structure of objects. Like Jsonnet, Cue is also a JSON superset, allowing incremental adoption without specifying everything up front (which would be the case for Dhall). Cue also provides cue import to convert existing JSON, YAML, Protobuf, and a few other formats to Cue files, which can then be extended with definitions.
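
For readers who have not seen constraints as values before, the #Port and #Phone definitions amount to roughly the following checks, written out by hand in Python. The function names are made up, and the regular expression is searched rather than anchored, which is how I read Cue's =~ operator.

```python
import re


def is_port(value) -> bool:
    # int & >=1024 & <32000 (bool is excluded because it is an int subclass in Python)
    return isinstance(value, int) and not isinstance(value, bool) and 1024 <= value < 32000


def is_phone(value) -> bool:
    # string & =~"[0-9]+" : a string containing at least one digit
    return isinstance(value, str) and re.search(r"[0-9]+", value) is not None


print(is_port(8080), is_port(80), is_port("8080"))
print(is_phone("+46 70 123"), is_phone("no digits"))
```

What Cue adds on top is that these checks live next to the data and compose with &, rather than sitting in a separate validation script.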

Now we can look at the Workflow in Cue:

package workflow

_statuses: [
  {
    entityType: "ORDER"
    category:   "BOOKING"
    name:       "CREATED"
  },
]

#Workflow & {
  id:          null
  description: "Tiny Cue workflow"
  statuses:    _statuses
  rulesets: [
    {
      name:        "CREATE"
      description: "CREATE & Validate order's delivery address"
      type:        "ORDER"
      triggers: [{status: "CREATED"}]
      rules: [
        // It's a bit annoying to specify the deeply nested values for the props
        // compared to a function call
        #Rules.order.changeState & {props: status: "CREATED"},
        {
          name: "wtf.pv.SendEvent"
          props: eventName: "ValidateOrder"
        },
      ] + moreRules
    },
  ]
}

// schema.cue
package workflow

_namespace: "wtf.pv."

#Status:     "CREATED" | "PACKING" | "SHIPPING" | "COMPLETE"
#EntityType: "ORDER" | "FULFILMENT"

#Rule: {
  // The beginning of the name must start with the namespace
  // by using string interpolation with \() inside of a regex
  name: =~"^\(_namespace).+"
  // Props has string keys, pointing to string values
  props: {[string]: string}
}

#Rules: {
  order: {
    // Combine definitions on a single line with &
    changeState: #Rule & {
      name: "wtf.pv.ChangeState"
      // Braces can be omitted for nested objects with only one key
      props: status: #Status
    }
  }
}

#RuleSet: {
  name:        string
  description: string
  // Adding a default value with *
  type: #EntityType | *"ORDER"
  rules: [...#Rule]
  triggers: [...{status: #Status}]
}

_entityType:    #EntityType | *"ORDER"
_entitySubtype: string | *"HD"

#Workflow: {
  id:            string | null
  name:          "\(_entityType)::\(_entitySubtype)"
  description:   string
  entityType:    _entityType
  entitySubtype: _entitySubtype
  rulesets: [...#RuleSet]
  statuses: [...{
    name:       #Status
    entityType: #EntityType
    category:   string
  }]
}

Overall, it looks similar to the previous examples. We see more nesting of the values, and instead of using let or local to declare variables, we use a prefix to make them hidden, as in _statuses.

We can also split our code over multiple files in Cue. But because the output needs to be independent of the order in which we add definitions, files are grouped into packages. Cue's package and module management is modeled after Go's, and currently even piggybacks off it, as the core library is hosted as a Go library.
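
Order independence falls out of how values combine. As a loose analogy only (Cue's lattice covers far more than finite sets), think of each constraint as a set of allowed values and unification as intersection. The three constraints below are invented for the sketch.

```python
from functools import reduce
from itertools import permutations

status_type = frozenset({"CREATED", "PACKING", "SHIPPING", "COMPLETE"})  # like #Status
not_done = frozenset({"CREATED", "PACKING", "SHIPPING"})  # a narrower, made-up constraint
concrete = frozenset({"CREATED"})  # like status: "CREATED"


def unify(a, b):
    """Intersection stands in for Cue's & operator in this toy model."""
    return a & b


# Apply the three constraints in every possible order and collect the distinct results.
results = {
    reduce(unify, order) for order in permutations([status_type, not_done, concrete])
}
print(results)
```

Every ordering collapses to the same single result, which is the property that lets Cue merge files in a package without caring which one it reads first.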

Documentation for the use cases is great and showcases the language well. For a deep understanding, the Logic of Cue is comprehensive, but it is not the ideal place to learn the language. The tutorials, on the other hand, have room for improvement, as they focus too much on demonstrating individual language features rather than building on top of each other to give learners a solid foundation. I would make each page longer and add more examples, so that I can more easily refer back to what I've learned.

One drawback of Cue to me, compared to the other two languages, is the lack of user-defined functions. In the other two, I used one for ChangeState, whereas in Cue specifying the status requires specifying the full path to the nested prop. There is a workaround, but the syntax is not ideal (although a shorthand notation is being considered).

changeState: {
  #status: #Status,
  // _namespace already ends with a dot, so no extra separator is needed
  {name: "\(_namespace)ChangeState", props: { status: #status }}
}

rules: [
  // No more need to specify deeply nested values,
  // but it's not pretty (to me)
  changeState&{_,#status: "CREATED"}
]

But there are functions in the language, for example in order to trim whitespace from a string I could write:

import "strings"

// Removes leading and trailing whitespace
helloWorld: strings.TrimSpace(" Hello World ")

These functions are implemented in Go, and they provide a pretty decent standard library. However, if you'd like to add your own, you will need to first write them in Go, and then write a new program that imports the Cue core and passes them in. It also took me a while to understand that they existed, since the documentation is not on the main website but on the Go package site.

There are a lot of things to like about Cue, and it's the most interesting approach of the three. The unification of types and values is clever, even if the times I mentioned "value lattice" to someone their eyes started to glaze over. There are a couple of other features I didn't mention that are also pretty interesting: the ability to use it as a command runner, and embedding it in Protobuf and Go.

It also took me the longest to understand. Overall I'd rate the onboarding experience as below Jsonnet but a bit better than Dhall. For example, I mentioned some of the specifics around the docs before. Using Sublime Text I had a hard time finding syntax highlighting, but I did find a Language Server, and the built-in tools for formatting and validation in the CLI work well. The errors are the most terse out of the three; here's me changing the status in the ChangeState rule to one that doesn't exist:

$ cue export main.cue
rulesets.0.rules.0.props.status: 4 errors in empty disjunction:
rulesets.0.rules.0.props.status: conflicting values "COMPLETE" and "ERROR":
    ./cue/main.cue:7:1
    ./cue/main.cue:21:5
    ./cue/main.cue:21:51
    ./cue/schema.cue:4:59
    ./cue/schema.cue:5:35
    ./cue/schema.cue:52:11
    ./cue/schema.cue:58:19
... more output

We've now taken a look at all the tools we have found to tame the JSON beast I'm dealing with: Jsonnet with its functions and the ability for objects to refer to themselves, Dhall with types as our guardrails, and finally Cue unifying values and types. It's time to wrap this up.

Summary

Having spent a couple of evenings with all three, I haven't used them in anger yet and don't count myself an expert, but I've gathered some impressions of how it feels to work with them. It's still quite surface level, but it was a useful exercise, to me at least.

It's very interesting to see how the different languages are influenced by their environment. Jsonnet is very pragmatic: just get these templates together, no need to worry about types, and you've probably used JavaScript and Python so the primitives look like that. Dhall comes from Haskell, which is immediately visible in the formatting, the let-in expressions, and the way of working with types. My initial impression of Cue was that it felt more academic, talking about logic and value lattices, but its Go implementation sneaks in and adds a lot of pragmatism around packages, modules, and embedding.

I think Jsonnet would be the easiest to introduce into an engineering organization: there's not a lot of clever stuff happening, and with the growing ecosystem there's a lot of help available. People fond of types or coming from a functional language would enjoy Dhall, but I would have a hard time recommending it in the organizations I've been part of; the friction goes up a bit too much for the added value.

Cue is really interesting, but the learning curve is quite high to get productive. I don't think this is necessarily a fault of the language itself though. With some polish on the CLI output, and some more love poured into the docs I would be much happier to try it in a team.

Recently I was using OPA Rego, which has similar goals, but discussing a fourth language would make this long article even longer, so I'm skipping it. While also aimed at validation, Rego worked quite well for querying data using its x contains y if {...} constructs, and I'm wondering what a language that mixed both of those features could look like.

In the end, to tame the JSON beast we are going to give Jsonnet a try. Our decision is driven by prioritizing ecosystem support and a gentle learning curve over the most technically impressive language. With Jsonnet, we hope to stand a fighting chance against the daunting JSON beast. Wish us luck!