target Attribute

The target attribute of the generate and execute elements in DATAMIMIC specifies where the generated data is sent and in which format. Output can go to built-in exporters such as CSV or to custom environments such as databases.

Built-in Supported Targets

DATAMIMIC natively supports the following output targets:

Preview: Default value. A limited subset of the generated data is always shown in the DATAMIMIC UI Preview.
LogExporter: Prints the data in the Log View. This exporter is mostly used for debugging, so it has a default maximum of 100 items. For larger datasets, we recommend one of the other exporters, such as CSV or TXT.
ConsoleExporter: Prints the generated data directly to the console, useful for quick debugging and verification.
mem: Outputs the data to an in-memory store, allowing for quick access and manipulation within the same environment.
CSV: Exports the generated data to a CSV file, which is ideal for tabular data and can be easily imported into spreadsheet applications. The file can be downloaded from the DATAMIMIC UI Artifact view.
JSON: Outputs the data in JSON format, suitable for web applications and data interchange between systems. The file can be downloaded from the DATAMIMIC UI Artifact view.
JSONSingle: Outputs the data in JSON format, with each record in its own file. The files can be downloaded from the DATAMIMIC UI Artifact view.
OpenSearchBulk: Exports the data in OpenSearch bulk format, which is useful for bulk indexing data into OpenSearch. The file can be downloaded from the DATAMIMIC UI Artifact view.
TXT: Saves the data as a plain text file, suitable for simple text-based records or logs. The file can be downloaded from the DATAMIMIC UI Artifact view.
XML: Exports the data to an XML file, useful for structured data storage and data exchange between different systems. The file can be downloaded from the DATAMIMIC UI Artifact view.

Figure: Artifacts overview, with downloads available in the Artifact view.

Notes

The target attribute supports multiple values, allowing you to combine different output formats or environments. This can be particularly useful when you want to direct the same generated data to multiple locations simultaneously.

Example Usage with Built-in Targets

Here is an example of the <generate/> element with several built-in targets:

<!-- Generates 10 CUSTOMER records with an id, saves them to CSV and TXT files, and prints them to the log. -->
<setup>
  <generate name="CUSTOMER" count="10" target="CSV,LogExporter,TXT">
    <key name="id" generator="IncrementGenerator"/>
  </generate>
</setup>

Custom Environment Targets

DATAMIMIC also supports custom environments, allowing users to direct output to specific databases, message brokers, or other external systems. For example:

Databases: Write generated data directly into a database.
Kafka: Send generated data to a Kafka topic.
MongoDB: Insert generated data into a MongoDB collection.

To use a custom environment as a target, we must first create the environment in the DATAMIMIC UI and then reference it in the target attribute:

<!-- Defines a custom database environment. -->
<database id="testDB" system="test"/>

<!-- This generates 10 CUSTOMER records and writes them to the 'testDB' database. -->
<generate name="CUSTOMER" count="10" target="testDB"/>

See our Database Tutorial for a more detailed usage example.

Example Usage and Configuration of Supported Targets

CSV

<!-- This generates 10 CUSTOMER records with id and saves them to a CSV file. -->
<generate name="CUSTOMER" count="10" target="CSV">
  <key name="id" generator="IncrementGenerator"/>
</generate>

We can also specify CSV file options such as delimiter, quotechar, quoting, encoding, line_terminator, and chunk_size:

<!-- This generates 10 CUSTOMER records with id and saves them to a CSV file with a custom delimiter and quote character. -->
<generate name="CUSTOMER" count="10" target="CSV(chunk_size=1000, delimiter=';', quotechar='|')">
  <key name="id" generator="IncrementGenerator"/>
</generate>
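
To preview what a custom delimiter and quote character do to the file, here is a minimal Python sketch. It assumes that delimiter and quotechar behave like the same-named options of Python's csv module, which this page does not confirm. The render_csv helper and the sample rows are hypothetical and are not part of DATAMIMIC.

```python
import csv
import io


def render_csv(rows, delimiter=";", quotechar="|"):
    """Render a list of dicts as CSV text (hypothetical helper, not a DATAMIMIC API)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(rows[0].keys()),
        delimiter=delimiter,
        quotechar=quotechar,
        quoting=csv.QUOTE_MINIMAL,  # quote only fields that need it
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Sample data: the second note contains the delimiter, so it must be quoted.
rows = [{"id": 1, "note": "plain"}, {"id": 2, "note": "a;b"}]
csv_text = render_csv(rows)
print(csv_text)
```

Only the field that contains the delimiter is wrapped in the quote character; all other fields are written as they are.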

JSON / JSONSingle

We can generate JSON files with multiple records per file or with one record per file by setting the target to JSON or JSONSingle. JSONSingle is a special case of JSON in which each record is saved to its own file: the records are not wrapped in an array but written as individual JSON files. The JSON(chunk_size=1) target behaves the same way.

<!-- This generates 10 CUSTOMER records with id and saves them as an array in a JSON file. -->
<generate name="CUSTOMER" count="10" target="JSON">
  <key name="id" generator="IncrementGenerator"/>
</generate>

We can also specify the JSON file configuration such as use_ndjson, encoding and chunk_size:

<!-- This generates 10 CUSTOMER records with id and saves them as NDJSON in chunks of 1000 records with a custom encoding. -->
<generate name="CUSTOMER" count="10" target="JSON(chunk_size=1000, encoding='utf-16', use_ndjson=True)">
  <key name="id" generator="IncrementGenerator"/>
</generate>
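
The difference between array output and NDJSON output, and the effect of chunk_size, can be sketched in Python. This is an illustration of the concepts only: the chunked and serialize helpers are hypothetical, and how DATAMIMIC names or lays out the resulting files is not covered here.

```python
import json


def chunked(records, chunk_size=None):
    """Split records into lists of at most chunk_size items (None means one chunk)."""
    if chunk_size is None:
        yield list(records)
        return
    for start in range(0, len(records), chunk_size):
        yield records[start:start + chunk_size]


def serialize(records, use_ndjson=False):
    """Return the text of one output chunk, as NDJSON or as a JSON array."""
    if use_ndjson:
        # NDJSON: one JSON document per line, no surrounding array.
        return "".join(json.dumps(record) + "\n" for record in records)
    return json.dumps(records, indent=2)


records = [{"id": i} for i in range(1, 6)]

chunks = list(chunked(records, chunk_size=2))      # chunk lengths: 2, 2, 1
ndjson_first = serialize(chunks[0], use_ndjson=True)
array_first = serialize(chunks[0])
single = list(chunked(records, chunk_size=1))      # one record per chunk

print(ndjson_first)
print(array_first)
```

With chunk_size=1 every chunk holds exactly one record, which corresponds to the one-record-per-file layout described for JSONSingle.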

OpenSearchBulk

The OpenSearchBulk target generates data in the OpenSearch bulk format, which is used to bulk index data into OpenSearch. The generated data is saved in a file with the extension .json or .ndjson and can be imported into OpenSearch using the bulk API.

The default configuration for the OpenSearchBulk target is chunk_size=None, encoding='utf-8', use_ndjson=True, which means no chunking, UTF-8 encoding, and NDJSON output:

{ "index" : { "_index" : "test", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "test", "_id" : "2" } }
{ "field1" : "value2" }

When we set use_ndjson=False, the generated data is saved as a JSON array:

[
  { "index" : { "_index" : "test", "_id" : "1" } },
  { "field1" : "value1" },
  { "index" : { "_index" : "test", "_id" : "2" } },
  { "field1" : "value2" }
]
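
The OpenSearch bulk API itself expects a newline-delimited body that ends with a newline, so an array-style file has to be converted before it is sent. A minimal Python sketch of that conversion follows; the array_to_ndjson helper is hypothetical and not part of DATAMIMIC.

```python
import json


def array_to_ndjson(array_text):
    """Convert a JSON array of bulk lines into a newline-delimited bulk body."""
    items = json.loads(array_text)
    # One JSON document per line; the body must end with a newline.
    return "".join(json.dumps(item) + "\n" for item in items)


array_text = """
[
  { "index" : { "_index" : "test", "_id" : "1" } },
  { "field1" : "value1" }
]
"""
bulk_body = array_to_ndjson(array_text)
print(bulk_body)
```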

To use the OpenSearchBulk target, we set the target to OpenSearchBulk and define the following keys in the model for the action, index, id, and routing:

$$_action$$: The action to perform on the document, such as index, create, or delete.
$$_index$$: The index to which the document belongs.
$$_id$$: The unique identifier for the document.
$$routing$$: The routing value for the document.

Here is an example of how to use the OpenSearchBulk target:

    <generate name="special" source="script/template_xyz.json"
              sourceScripted="True"
              count="100000"
              cyclic="True"
              target="OpenSearchBulk(chunk_size=20000, use_ndjson=True)"
              pageSize="10000"
    >
        <variable name="randomNumberVar" generator="IncrementGenerator"/>
        <key name="$$_action$$" constant="index"/>
        <key name="$$_index$$" constant="movies"/>
        <key name="$$_id$$" generator="IncrementGenerator"/>
        <key name="$$routing$$" constant="12341243"/>
        <key name="title" constant="Prisoners"/>
        <key name="year" constant="2013"/>
    </generate>

Here is the content of the script/template_xyz.json file:

[
  {
    "title": "Prisoners",
    "year": "2013"
  }
]

The output is saved in a file with the extension .json and can be downloaded from the DATAMIMIC UI Artifact view.

Preview of the generated data in the OpenSearchBulk format:

[
{ "index" : { "_index" : "movies", "_id" : "1", "routing" : "12341243" } },
{ "title" : "Prisoners", "year" : "2013" },
{ "index" : { "_index" : "movies", "_id" : "2", "routing" : "12341243" } },
{ "title" : "Prisoners", "year" : "2013" },
{ "index" : { "_index" : "movies", "_id" : "3", "routing" : "12341243" } },
{ "title" : "Prisoners", "year" : "2013" }
]
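
The way the special $$...$$ keys end up in the action line, while the remaining keys form the document line, can be sketched in Python. This mapping is inferred from the preview above and is an assumption about the exporter's behaviour, not its actual implementation; the to_bulk_lines helper is hypothetical.

```python
import json

# Special model keys and the metadata fields they appear as in the preview.
META_KEYS = {"$$_index$$": "_index", "$$_id$$": "_id", "$$routing$$": "routing"}


def to_bulk_lines(record):
    """Split one generated record into bulk lines (illustrative sketch only)."""
    action = record["$$_action$$"]
    meta = {field: record[key] for key, field in META_KEYS.items() if key in record}
    action_line = json.dumps({action: meta})
    if action == "delete":
        # In the bulk API, a delete action is not followed by a document line.
        return [action_line]
    # Every key that is not wrapped in $$...$$ belongs to the document itself.
    source = {
        key: value
        for key, value in record.items()
        if not (key.startswith("$$") and key.endswith("$$"))
    }
    return [action_line, json.dumps(source)]


record = {
    "$$_action$$": "index",
    "$$_index$$": "movies",
    "$$_id$$": "1",
    "$$routing$$": "12341243",
    "title": "Prisoners",
    "year": "2013",
}
bulk_lines = to_bulk_lines(record)
print("\n".join(bulk_lines))
```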