Diving into DataWeave
Feb. 4, 2018
I have been writing application programming interfaces (APIs) for over 20 years, and I have always found integrations enjoyable because they are like logic puzzles, which I like to do on a regular basis. Integrating systems is a bit like building with Lego blocks: you are free to choose from a variety of technologies, payload types, and tools. I am a big fan of Django and its REST framework, which makes it very easy to quickly build and deploy complex REST interfaces, but when it comes to writing complex transformations it is hard to beat MuleSoft's DataWeave (DW) tool.
DataWeave is an elegant and lightweight expression language. The objective of this post is to highlight the power of DataWeave using an example JSON payload that we will transform into an XML document. This post is not intended to be a DataWeave primer, but I am going to cover some basics to ensure readers have the minimum background required to get value out of it. If you want a more in-depth introduction to DataWeave, I recommend Nial Darbey's outstanding four-part series, which begins with Getting started with DataWeave: Part 1. You should also review MuleSoft's DataWeave reference documentation, which provides a number of excellent code snippets with examples of useful functions such as pluck, zip, flatten, and reduce.
Mule DataWeave Data Types
One fundamental concept every Mule developer needs to be aware of is that DataWeave only works with three data types: simple types, arrays, and objects. This is true for all DW transformations regardless of the inbound message and the outbound message(s). Each of these is explained in a little more detail below.
-
Simple Types: Simple types are analogous to primitive types in many programming languages: strings, characters, integers, floats, etc. Examples of DW simple types include 'DataWeave is powerful', 2018, 208.50, and 'c'. Although strings are simple types, they can also be treated as arrays, since a string is just a sequence of characters.
-
Arrays: An array is a collection of elements. Examples include number sequences such as [1, 3, 5, 7, 9, 11] and arrays of strings like ["DataWeave", "is", "powerful"]. It is important to keep MIME types in mind when writing DataWeave transformations. Arrays will not be rendered in XML, so you will need to use repeating keys. Conversely, repeating keys will not be rendered in JSON, so you should generate arrays instead.
-
Objects: Objects are complex data types made up of key:value pairs. Anyone who has programmed with languages such as C++, Java, or Python will recognize the idea from class instances. Below is an example of data that would be treated as a DW object type.
{
  "alertId": "591f32e74cc1969c3fbf9323",
  "type": "geofence",
  "subType": "inside",
  "level": "info",
  "eventType": "start",
  "eventTime": "2017-05-19T18:01:10.976Z",
  "eventSource": "position",
  "position": {
    "_id": "591f37b34cc1969c3fbf9513",
    "type": "Feature",
    "geometry": {
      "type": "Point",
      "coordinates": [
        -121.030426,
        37.615482
      ]
    }
  }
}
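As a minimal sketch of how the three types look inside a DataWeave 1.0 script, the body below builds one object whose values are simple types and a nested object. The key names (sample, title, year, price, words, word) are made up for illustration and do not come from the example payload. The words key shows the repeating-key pattern for XML output: the array is mapped to a list of single-key objects, and the surrounding parentheses place those keys in the parent object.

```dataweave
%dw 1.0
%output application/xml
---
// Illustrative only: every key name in this script is hypothetical
sample: {
  title: "DataWeave is powerful",   // simple type: string
  year: 2018,                       // simple type: number
  price: 208.50,                    // simple type: number
  words: {
    // XML has no array syntax, so emit one "word" key per array element
    (["DataWeave", "is", "powerful"] map {
      word: $
    })
  }
}
```

If the output directive were application/json instead, the array could be written directly as words: ["DataWeave", "is", "powerful"] with no repeating keys.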
Our DataWeave Example
We have a dealer named Luxury Cars who, despite not being creative when it comes to names, is integration savvy because they use MuleSoft tools for integration. Luxury Cars accesses a publicly available REST API that returns an array of vehicles with a number of attributes: manufacturer, style, model, msrp, images, description, and drive type. The last two are optional (i.e. not all vehicles have them). The Luxury Cars IT team subscribes to the REST endpoint, but they need to transform the response into an XML payload and add some additional tags for use by one of their internal systems. Further, the XML payload needs to be restricted so that only cars with an MSRP exceeding $100,000 are passed on... we are talking Luxury Cars after all! Below is the example we will walk through in detail.
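What follows is a sketch of a DataWeave 1.0 transformation that matches the walkthrough in the next section; treat it as a reconstruction under stated assumptions rather than the exact listing. The inbound field names (Manufacturer, Style, Model, MSRP, Description, Drive, Images) follow the capitalization used in the walkthrough. The currency formatting mask, the lowercase output key names other than car, msrp, drive, date, and mod#info, and the idea that each image is a file name such as "front.jpg" are all assumptions. Only the mod namespace is declared, because the URI for mes is not given in this post.

```dataweave
%dw 1.0
%output application/xml
%namespace mod http://stormy.com/mod/1.1
// Assumed mask: the actual formatting mask is not shown in the post
%type currency = :number { format: "###,###.00" }
%var date = now
%function words(name) name splitBy "."
---
mod#info: {
  date: date,
  dealer: "Luxury Cars",
  // Parentheses place each mapped "car" key inside mod:info
  (payload map {
    car: {
      manufacturer: $.Manufacturer,
      style: $.Style,
      model: $.Model,
      msrp: $.MSRP as :currency,
      // Always emitted; falls back to "None" when Description is absent
      description: $.Description default "None",
      // Emitted only when the inbound car has a Drive value
      (drive: $.Drive) when $.Drive != null,
      images: {
        // Assumes each image is a file name such as "front.jpg"
        ($.Images map using (parts = words($)) {
          image: {
            name: parts[0],
            extension: parts[1]
          }
        })
      }
    }
  } filter $.car.msrp > 100000)
}
```

The filter sits after the closing brace of the mapped object, so it sees the transformed car objects rather than the inbound ones.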
DataWeave Transformation Explained
-
DataWeave Design Environment: The Anypoint Studio DataWeave editor has three panes. The left pane defines the structure of the incoming message along with its type. DataWeave can operate on multiple data types, including Java objects, JSON, XML, and custom classes, so it is important to define the incoming data type. If an incoming MIME type is not supplied, DataWeave will default to Java String. The middle pane shows the DataWeave expression that performs the actual transformation. The right pane shows the outgoing structure in its proper data type (in our case XML). A nice feature of the DW editor is that the outgoing payload refreshes continuously as changes are made to the transformation. Other useful features include the ability to perform multiple transformations of the same inbound message by assigning different output targets, and view toggles that hide the incoming payload or outgoing message if you prefer to focus on the transformation expression. However, there are a couple of "watch outs" when using DataWeave. The first is that your expression either works or it doesn't: there is no way in the DW editor to "step" through the transformation and see where an error is being thrown if there is a problem in the payload. The other issue I have encountered is the DW editor failing to show an outgoing payload even when the inbound payload and DW expression are both valid. Fortunately, this happens very infrequently, but it is something to be aware of.
-
Payload: The concept of a payload in MuleSoft is an important one. A Mule payload is the core of the message; it holds a Java object that can be of any data type. A message's payload will mutate as Mule processors operate on it. It is important to set the MIME type prior to a DW transformation unless the Java String default is acceptable. In our example the input payload is in JavaScript Object Notation (JSON) format.
-
Expression Pane: The expression pane is separated into a header, which contains information such as the outgoing MIME type, variables, and functions, and the transformation itself. The two areas are separated by three dashes (---). The only requirement for the header is to include %dw 1.0 and the outgoing data type.
-
Header: Our header is relatively simple, but it illustrates several important concepts. We specify our output type to be XML and state the namespaces we will use in the actual transformation. We also declare a new data type, currency, that formats the msrp attribute using the provided formatting mask. The "date" variable uses the current timestamp to show when the transformation was performed. Finally, we define a function called "words" which accepts a string called "name" and returns an array of strings split on the period character.
-
Context: Understanding how Mule manages the payload's context during a DW transformation is critical. We know map will loop through an iterable; DW keeps track of each iteration and sets the current context accordingly. You can reference the current index of the iterable using $$ (starts at 0) and the current context using $. We want to create a new XML element named cars, and inside that element we reference input payload attributes (Manufacturer, Model, MSRP, etc.) using the $ notation. Pay close attention to the images key. Note how we map once again using $.Images. We can do this because images are stored as an array and are thus iterable. DataWeave maintains a reference to the current context, which allows us to loop through images. When the final iteration is complete, the resulting context is the result of the DataWeave transformation to that point. Since we want to filter out cars with an msrp of $100,000 or less, we add filter $.car.msrp > 100000 after the enclosing {...} of the car key. Keep in mind that the DW object referred to by $ is now the newly transformed payload, so we need to filter cars using $.car.msrp because this is what was created by the transformation.
-
Literal Expressions: A DataWeave object consists of key:value pairs where the key is a string without quotes and the value is one of the three DW types discussed above (simple, array, or object). In this example all keys are expressed literally, but it is possible to generate keys using an expression that returns a string. The inbound payload contains information about cars, but for the XML payload we want to add additional information, including the date the XML payload was created, the name of our dealer (the very boring "Luxury Cars"), and a new car tag that will serve as the parent tag for the information returned in the JSON payload from the REST endpoint invoked by our Mule integration (i.e. the inbound payload). Note that date appears as both a key and a value; DW treats the key as a literal string, while the value is a reference to the date variable defined in the header of our transformation.
-
XML Namespaces: We have defined two XML namespaces, mod and mes, that we want to use in the XML payload generated by the DW expression. The namespaces are declared in the header and subsequently referenced in the DW expression by appending a # to the namespace we want to use and then prepending that to the key in the DW transformation responsible for generating the XML tags. In our example the mod#info expression in the transformation produces <mod:info xmlns:mod="http://stormy.com/mod/1.1">.
-
Map: The map function in DW is similar to the map function in other languages such as Python and JavaScript; map takes an iterable and invokes a function for each element in the iterable. In this example the iterable is the payload and the function is the expression wrapped in the { ... } braces, with the last one immediately followed by filter $.car.msrp > 100000. This is a very simple example, so we can invoke map directly on the payload. However, when the elements you want to iterate over are repeated keys reached through a value selector (not covered here), you need to add a * to the selector. For example, assume payload.kbb.foreign contained repeated cars keys and we wanted to iterate through foreign cars. We would have to write payload.kbb.foreign.*cars map... . Also, note the code snippet using (parts = words($)) that is part of the map function we use to iterate through images. Each time map moves to the next element in the inbound message's array of images, it invokes the words function, passing in $ as the argument; in this case $ refers to the current element, which is the image name contained in the Images array. The parts variable is part of the context when words returns, and it remains in scope until the iteration is complete.
-
Parentheses (...): Just like words can have more than one meaning, parentheses may be used in more than one way within a DataWeave transformation. One of the attributes we want to include in our XML outbound payload is "drive," but not all cars include drive. That means we have to let DataWeave know that the drive XML element may or may not be included in the XML document. We accomplish this by wrapping the "drive" key and the "Drive" attribute in parentheses: (drive : $.Drive) when $.Drive != null. This expression tells DataWeave to include the XML drive tag only when Drive is present in the current object (i.e. not null). The "Description" attribute is also not present in every object, but that expression is not enclosed in parentheses. The reason is that our DW expression always includes a "description" XML tag; when the current car object does not have a description, we default the description to "None". There is another important use of parentheses. DataWeave requires map expressions to be wrapped in (...) when outputting XML payloads. You can see how each of our map expressions is enclosed in parentheses.
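The point about context in the list above can be made concrete with a small side-by-side. This is a sketch rather than part of the Luxury Cars transformation: the output is JSON for readability, the key names afterMap and beforeMap are made up, and the extra parentheses are only there to make the grouping explicit. Both expressions keep cars with an MSRP above 100000; the only difference is what $ refers to when filter runs.

```dataweave
%dw 1.0
%output application/json
---
{
  // filter runs on the mapped objects, so $ is { car: { model, msrp } }
  afterMap: (payload map {
    car: { model: $.Model, msrp: $.MSRP }
  }) filter ($.car.msrp > 100000),

  // filter runs on the inbound objects, so $ still exposes the MSRP key
  beforeMap: (payload filter ($.MSRP > 100000)) map {
    car: { model: $.Model, msrp: $.MSRP }
  }
}
```

Filtering before mapping avoids transforming cars that will be discarded anyway, while filtering after mapping lets the condition use values that only exist in the transformed object.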
The example use case is a simple one, yet it covers a wide array of powerful features and capabilities offered by DataWeave. You are well on your way to becoming proficient with DataWeave if you understand the concepts discussed in this post and the DataWeave expression discussed above. It is not unusual in enterprise-level application integration to have inbound messages that contain hundreds of lines of complex payloads of different types such as JSON, POJOs, and XML, and it is in these cases that DataWeave differentiates itself from other tools.