Transformation Engine
The Transformation Engine is an orchestration tool for ingesting, normalizing, and transforming data in [R]DP, and for sinking it to external systems. If you want to leverage [R]DP’s data ecosystem, integrate with WDM, or enable agentic workflows with [R]AIMS, the Transformation Engine provides the lowest barrier to entry.
Overview
[R]DP’s Transformation Engine was built to answer these questions:
- How can users create powerful ETL pipelines leveraging [R]DP’s storage layer and data catalog?
- How can developers have a clear picture of what data transformations are happening at any given moment?
- How can [R]DP reduce the operational burden of building, deploying, and maintaining complex data pipelines?
- How can users interleave [R]DP and [R]AIMS to create data-centric AI pipelines?
To that end, the Transformation Engine exposes two primary interfaces:
- The drag-and-drop Canvas UI, where users with access to [R]DP can build and view pipelines in an intuitive way.
- A set of REST APIs to enable developers and external systems to access [R]DP’s powerful tooling.
This guide will focus on the second, developer-centric interface.
Concepts
The Transformation Engine exposes five main concepts that encapsulate the developer experience:
- Source: A process that brings data inside [R]DP.
- Transformer: A process that operates on data and produces at least one output.
- Sink: A process that sinks data to a final destination, either within [R]DP or external to it.
- Dataset: An intermediate or final data product created as the output of a Source or Transformer.
A directed graph of these four concepts is called a Pipeline. The Pipeline is the fifth concept and the largest unit the Transformation Engine tracks.
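The concepts above can be modeled as a small typed graph. The following Python sketch is illustrative only and is not the [R]DP SDK: the `Kind`, `Pipeline`, and `ALLOWED_EDGES` names are hypothetical, and the wiring rules are an assumption inferred from the definitions (processes write to Datasets, and Datasets feed downstream processes).

```python
from dataclasses import dataclass, field
from enum import Enum


class Kind(Enum):
    """The four node types that can appear in a Pipeline."""
    SOURCE = "source"
    TRANSFORMER = "transformer"
    SINK = "sink"
    DATASET = "dataset"


# Assumed wiring rules (not confirmed by the documentation):
# Sources and Transformers write to Datasets; Datasets feed
# Transformers and Sinks.
ALLOWED_EDGES = {
    (Kind.SOURCE, Kind.DATASET),
    (Kind.TRANSFORMER, Kind.DATASET),
    (Kind.DATASET, Kind.TRANSFORMER),
    (Kind.DATASET, Kind.SINK),
}


@dataclass
class Pipeline:
    """A directed graph of named, typed nodes."""
    nodes: dict[str, Kind] = field(default_factory=dict)
    edges: list[tuple[str, str]] = field(default_factory=list)

    def add(self, name: str, kind: Kind) -> None:
        self.nodes[name] = kind

    def connect(self, upstream: str, downstream: str) -> None:
        # Reject edges that violate the assumed wiring rules.
        pair = (self.nodes[upstream], self.nodes[downstream])
        if pair not in ALLOWED_EDGES:
            raise ValueError(
                f"cannot connect {pair[0].value} to {pair[1].value}"
            )
        self.edges.append((upstream, downstream))
```

Validating edges when they are added keeps a malformed graph, such as a Sink feeding a Source, from ever being stored. Whether the real engine enforces the same rules is not stated here.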
As an example, the screenshot above shows a visualization of all five components:
-
- TAK Consumer is a Source that ingests data from an upstream TAK server.
- Ingested TAK Messages and CoT WDM Messages are Datasets that store intermediate results for downstream processing.
- CoT XML to WDM Protobuf is a Transformer that takes CoT XML messages and transforms them into WDM messages.
- Publish to WDM is a Sink that pushes data into the [R]DP WDM storage layer.
- Publish to UDL is a Sink that pushes data into the Unified Data Library.
- Publish to MSS is a Sink that pushes data into Maven Smart Systems.
- The connected graph of these components together forms a Pipeline.
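To make the graph structure concrete, the example components can be written as a dependency map and ordered with Python's standard-library `graphlib`. This is a sketch under stated assumptions, not output from the Transformation Engine: the text does not say which Dataset each Sink reads from, so the Sink wiring below is assumed, and `UPSTREAM` and `execution_order` are hypothetical names.

```python
from graphlib import TopologicalSorter

# Each key maps to the nodes it reads from (its upstream dependencies).
# Assumption: all three Sinks read from "CoT WDM Messages"; the actual
# wiring is only visible in the screenshot.
UPSTREAM = {
    "TAK Consumer": [],
    "Ingested TAK Messages": ["TAK Consumer"],
    "CoT XML to WDM Protobuf": ["Ingested TAK Messages"],
    "CoT WDM Messages": ["CoT XML to WDM Protobuf"],
    "Publish to WDM": ["CoT WDM Messages"],
    "Publish to UDL": ["CoT WDM Messages"],
    "Publish to MSS": ["CoT WDM Messages"],
}


def execution_order(upstream):
    """Return the nodes so that every node follows its dependencies."""
    # TopologicalSorter raises graphlib.CycleError if the graph has a cycle.
    return list(TopologicalSorter(upstream).static_order())


order = execution_order(UPSTREAM)

if __name__ == "__main__":
    for step in order:
        print(step)
```

Under this wiring, the Source, both Datasets, and the Transformer form a single chain, so they always come first in that order. The three Sinks depend on the same Dataset and may appear in any order after it.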
Getting Started
Go to the Source, Transformer, Sink, or Dataset pages to learn more about the individual components of a Pipeline. Alternatively, use the Quick Start Guide to start using the [R]DP SDK to build and manage Pipelines.