
<operate>

The <operate> element applies batch operations to templates to transform data. It supports both XML templates (addressed with XPath) and JSON templates (addressed with JSONPath).

Overview

The operate element reads operation definitions from a CSV file and applies them to templates, enabling:

  • Batch data transformations
  • Template modifications before data generation
  • Dynamic value injection using generators
  • Complex structural operations
  • Function execution

Attributes

Attribute                     Required  Description                                            Default
source                        Yes       CSV file path containing operations and models         -
operation_prefix              No        Prefix for operation column names in CSV               op
template_not_found_action     No        Action when template is missing: warn, error, ignore   warn
operation_not_matched_action  No        Action when operation is invalid: warn, error, ignore  warn
template-dir                  No        Base directory for template files                      -

CSV File Format

The CSV file must contain:

  • id: Unique identifier for each artifact
  • template: Path to the template file (relative to template-dir if specified)
  • Operation columns: Named with the operation_prefix (e.g., op1, op2, op3)
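The layout above can be read with Python's standard csv module. This is a minimal sketch for inspecting such a file outside DATAMIMIC, not the tool's own loader: the function name read_operations, the numeric ordering of operation columns, and the choice to skip empty cells are all assumptions made for illustration.

```python
import csv
import io

# Sample content in the pipe-delimited layout used on this page.
SAMPLE = """id|template|op1|op2
user001|user.xml|set(//user/name,John Doe)|set(//user/age,30)
user002|user.xml|delete(//user/email)|
"""

def read_operations(text, prefix="op", delimiter="|"):
    """Return one record per row with its non-empty operation cells in column order."""
    # QUOTE_NONE keeps double quotes inside cells (e.g. attribute values) untouched.
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter, quoting=csv.QUOTE_NONE)
    # Assumption: operation columns are named <prefix><number> and apply in numeric order.
    op_columns = sorted(
        (name for name in reader.fieldnames
         if name.startswith(prefix) and name[len(prefix):].isdigit()),
        key=lambda name: int(name[len(prefix):]),
    )
    records = []
    for row in reader:
        operations = [row[column] for column in op_columns if row.get(column)]
        records.append({"id": row["id"], "template": row["template"], "operations": operations})
    return records

records = read_operations(SAMPLE)
```

Each record keeps the id, the template path, and the list of operation cells, which is enough to review a file before handing it to <operate>.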

Operation Types

Basic Operations

set

Updates or creates a value at the specified path.

XML Example:

id|template|op1|op2
user001|user.xml|set(//user/name,John Doe)|set(//user/age,30)

JSON Example:

id|template|op1|op2
user001|user.json|set($.user.name,John Doe)|set($.user.age,30)
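The operation cells in these examples share the shape name(path,value). As a rough illustration of that shape, and not DATAMIMIC's actual parser, the sketch below splits a cell into its parts. The helper name parse_operation is hypothetical, and the rule "split at the first comma" is an assumption that would not hold for paths which themselves contain commas.

```python
import re

# Assumed cell shape: name(arguments), where the first comma separates path from value.
OPERATION_PATTERN = re.compile(r"^\s*(\w+)\((.*)\)\s*$", re.DOTALL)

def parse_operation(cell):
    """Split an operation cell into (name, path, value); value is None when absent."""
    match = OPERATION_PATTERN.match(cell)
    if match is None:
        raise ValueError(f"not an operation: {cell!r}")
    name, arguments = match.groups()
    # Split only once so that commas inside the value (e.g. a JSON object) survive.
    path, separator, value = arguments.partition(",")
    return name, path.strip(), (value.strip() if separator else None)

parsed_set = parse_operation("set(//user/name,John Doe)")
parsed_merge = parse_operation('mergeObject($.config,{"timeout": 30, "ssl": true})')
parsed_delete = parse_operation("delete(//user/email)")
```

Splitting at the first comma keeps JSON values with several commas intact while still handling single-argument operations such as delete.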

delete

Removes a node/property at the specified path.

XML Example:

id|template|op1
user001|user.xml|delete(//user/email)

JSON Example:

id|template|op1
user001|user.json|delete($.user.email)
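To make the effect of set and delete concrete, here is a small sketch on a plain Python dictionary. It only understands simple paths of the form $.a.b (no wildcards or filters), creates missing intermediate objects on set, and stores values as text because CSV cells are text. All three choices, and the helper names, are assumptions for illustration rather than documented DATAMIMIC behavior.

```python
def _keys(path):
    """Split a simple path such as $.user.name into its keys."""
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path}")
    return path[2:].split(".")

def set_value(document, path, value):
    """Update the value at path, creating it (and missing parents) if needed."""
    *parents, last = _keys(path)
    node = document
    for key in parents:
        node = node.setdefault(key, {})  # assumption: missing parents are created
    node[last] = value

def delete_value(document, path):
    """Remove the property at path; return False when there was nothing to remove."""
    *parents, last = _keys(path)
    node = document
    for key in parents:
        node = node.get(key)
        if not isinstance(node, dict):
            return False
    if last in node:
        del node[last]
        return True
    return False

user = {"user": {"name": "Old Name", "email": "old@example.com"}}
set_value(user, "$.user.name", "John Doe")
set_value(user, "$.user.age", "30")
removed = delete_value(user, "$.user.email")
```

After these calls the document holds the new name and age, and the email property is gone.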

Advanced Structural Operations

XML-Specific Operations

  • replaceChildren: Replace all children of a node

    op1
    replaceChildren(//root/data,<item>New Item</item>)
    

  • appendChild: Add a child to a node

    op1
    appendChild(//root/metadata,<updated>2025-07-17</updated>)
    

  • upsertNode: Update or insert a node

    op1
    upsertNode(//root/status,active)
    

  • mergeAttributes: Merge attributes into a node

    op1
    mergeAttributes(//root,id="test-123" enhanced="true")
    

  • clearChildren: Remove all children from a node

    op1
    clearChildren(//root/temp)
    

  • upsertAttribute: Update or insert an attribute

    op1
    upsertAttribute(//root,version="2.0")
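
    The intent of some of these operations can be approximated with Python's xml.etree.ElementTree as a stand-in. The function names below are hypothetical and only mirror the operation names; DATAMIMIC's own implementation and XPath coverage may differ. ElementTree accepts relative paths only, so the sketch wraps the root element and prefixes the path with a dot.

```python
import xml.etree.ElementTree as ET

def select(root, xpath):
    """Resolve a //-style path against the whole document, root element included."""
    wrapper = ET.Element("wrapper")  # synthetic parent so the root itself can match
    wrapper.append(root)
    return wrapper.findall("." + xpath)

def append_child(root, xpath, fragment):
    """Parse an XML fragment and add it as the last child of every matching node."""
    for node in select(root, xpath):
        node.append(ET.fromstring(fragment))

def upsert_attribute(root, xpath, name, value):
    """Overwrite the attribute if present, otherwise add it."""
    for node in select(root, xpath):
        node.set(name, value)

def clear_children(root, xpath):
    """Remove every child element of the matching nodes."""
    for node in select(root, xpath):
        for child in list(node):
            node.remove(child)

root = ET.fromstring("<root version='1.0'><metadata/><temp><a/><b/></temp></root>")
append_child(root, "//root/metadata", "<updated>2025-07-17</updated>")
upsert_attribute(root, "//root", "version", "2.0")
clear_children(root, "//root/temp")
```

    After the three calls, metadata contains the new updated element, the version attribute reads 2.0, and temp has no children.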
    

JSON-Specific Operations

  • mergeObject: Merge objects

    op1
    mergeObject($.config,{"timeout": 30, "ssl": true})
    

  • upsertProperty: Update or insert a property

    op1
    upsertProperty($.metadata.version,2.0)
    

  • replaceArray: Replace an entire array

    op1
    replaceArray($.items,[{"id": 3, "name": "New Item"}])
    

  • appendItem: Add item to array

    op1
    appendItem($.tags,new_tag)
    

  • clearArray: Empty an array

    op1
    clearArray($.temp)
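
    For readers who think in Python data structures, mergeObject, appendItem and clearArray correspond roughly to the dictionary and list calls below. This is an analogy under assumptions: in particular, whether mergeObject merges shallowly (as dict.update does here) or recursively is not stated on this page.

```python
import json

document = {"config": {"timeout": 10}, "tags": ["old_tag"], "temp": [1, 2, 3]}

def merge_object(target, raw_json):
    """Shallow-merge the keys of a JSON object literal into target (assumed semantics)."""
    target.update(json.loads(raw_json))

merge_object(document["config"], '{"timeout": 30, "ssl": true}')  # mergeObject($.config,...)
document["tags"].append("new_tag")                                # appendItem($.tags,new_tag)
document["temp"].clear()                                          # clearArray($.temp)
```

    The existing timeout is overwritten, ssl is added, the tag is appended, and temp ends up as an empty array.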
    

Function Operations

Execute functions without modifying content:

id|template|op1|op2|op3
func_001|dummy.txt|function(addNode)|function(validateSchema)|function(generateId)

Dynamic Value Injection

Use generators and variables within operations:

id|template|op1|op2|op3
gen001|user.xml|set(//user/name,{GivenNameGenerator})|set(//user/id,{IncrementGenerator})|set(//user/email,{EmailAddressGenerator})
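
The curly-brace placeholders can be pictured as a substitution step that runs before the value is written. The sketch below illustrates that idea with a regular expression and a dictionary of stand-in callables. The providers shown are hypothetical substitutes, not the DATAMIMIC generators, and leaving unknown names untouched is an assumption.

```python
import itertools
import re

PLACEHOLDER = re.compile(r"\{([^{}]+)\}")

def resolve(value, providers):
    """Replace each {Name} with the result of providers[Name](); keep unknown names as-is."""
    def substitute(match):
        provider = providers.get(match.group(1))
        return str(provider()) if provider is not None else match.group(0)
    return PLACEHOLDER.sub(substitute, value)

counter = itertools.count(1)
providers = {
    "IncrementGenerator": lambda: next(counter),  # stand-in, not the real generator
    "GivenNameGenerator": lambda: "Alice",        # stand-in with a fixed value
}

first = resolve("PERMIT-{IncrementGenerator}", providers)
second = resolve("PERMIT-{IncrementGenerator}", providers)
untouched = resolve("{UnknownGenerator}", providers)
```

Because the provider is called once per placeholder occurrence, two artifacts that use the same expression receive different values.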

Complex Examples

Example 1: Basic XML Operations

Setup (datamimic.xml):

<setup defaultSeparator="|">
    <operate source="data/operations.opctl.csv" operation_prefix="op"/>
</setup>

CSV (operations.opctl.csv):

id|template|op1|op2|op3
user001|templates/user.xml|set(//user/name,John Doe)|set(//user/age,30)|delete(//user/temp)
company001|templates/company.xml|set(//company/name,TechCorp)|set(//company/employees,150)|upsertAttribute(//company,status="active")

Example 2: JSON with Complex Paths

id|template|op1|op2|op3
json001|config.json|set($.users[0].name,Alice)|set($.users[*].active,true)|delete($.deprecated)
json002|config.json|set($..id,GLOBAL-ID)|set($.users[?(@.active)].verified,true)|mergeObject($.settings,{"theme":"dark"})
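
The JSONPath expressions in the json002 row select more than one node. As a plain-Python reading of what such selections mean under standard JSONPath semantics, the sketch below mimics set($..id,GLOBAL-ID) with a recursive walk and set($.users[?(@.active)].verified,true) with a filtered loop. It is an interpretation for illustration, with a hypothetical helper name, and does not show how DATAMIMIC evaluates the paths.

```python
def set_everywhere(node, key, value):
    """Rough equivalent of set($..key,value): overwrite every existing key at any depth."""
    if isinstance(node, dict):
        for name, child in node.items():
            if name == key:
                node[name] = value  # replacing an existing value is safe while iterating
            else:
                set_everywhere(child, key, value)
    elif isinstance(node, list):
        for child in node:
            set_everywhere(child, key, value)

config = {
    "id": "a",
    "users": [{"id": "b", "active": True}, {"id": "c", "active": False}],
}

set_everywhere(config, "id", "GLOBAL-ID")

# Rough equivalent of set($.users[?(@.active)].verified,true)
for user in config["users"]:
    if user.get("active"):
        user["verified"] = True
```

Every id at any depth is replaced, while verified is added only to users whose active flag is truthy.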

Example 3: Healthcare Business Domain

Setup:

<setup>
    <execute uri="scripts/healthcare_functions.scr.py"/>
    <variable name="patient" entity="Person(dataset='US', min_age=0, max_age=95)"/>
    <variable name="patientId" script="generate_patient_id()"/>

    <operate source="data/patient_records.opctl.csv" 
            template-dir="scripts/" 
            operation_prefix="op"/>
</setup>

CSV:

id|template|op1|op2|op3|op4
patient_001|patient_record.xml|set(//patient/id,{patientId})|set(//patient/name,{patient.given_name} {patient.family_name})|set(//patient/bloodType,{bloodType})|set(//diagnosis/code,{diagnosis.code})

Example 4: Error Handling

<setup>
    <operate source="data/operations.csv" 
             operation_prefix="op"
             template_not_found_action="error"
             operation_not_matched_action="warn"/>
</setup>
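
The three action values (warn, error, ignore) describe a common policy pattern. A minimal sketch of such a policy in Python, assuming that error aborts processing while warn and ignore let it continue, could look like this. The names OperateError and report are hypothetical, and the real behavior is defined by DATAMIMIC, not by this code.

```python
import logging

logger = logging.getLogger("operate_sketch")

class OperateError(Exception):
    """Raised when the configured action is 'error'."""

def report(action, message):
    """Apply a warn/error/ignore policy to one problem; return True if processing continues."""
    if action == "error":
        raise OperateError(message)
    if action == "warn":
        logger.warning(message)
    elif action != "ignore":
        raise ValueError(f"unknown action: {action}")
    return True

# With template_not_found_action="error", a missing template stops the run.
try:
    report("error", "template not found: templates/user.xml")
    stopped = False
except OperateError:
    stopped = True

# With operation_not_matched_action="warn", the problem is logged and the run continues.
continued = report("warn", "operation did not match: set(//user/unknown,1)")
```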

Best Practices

  1. Operation Order: Plan operations carefully, as they are applied sequentially

       • Avoid modifying nodes after deletion
       • Create parent nodes before children

  2. Path Validation: Ensure paths exist before operations

       • Use upsert operations for uncertain paths
       • Handle missing paths gracefully with error actions

  3. Performance: For large datasets

       • Group related operations together
       • Use batch operations when possible
       • Consider template complexity

  4. Error Handling:

       • Set appropriate template_not_found_action based on requirements
       • Use warn for development, error for production
       • Monitor logs for operation warnings

  5. Maintainability:

       • Use descriptive IDs in CSV files
       • Document complex operations
       • Keep templates organized in logical directories

Logical Validation

The operate system detects and reports invalid operation sequences:

  • Operations on deleted nodes
  • Conflicting modifications
  • Invalid paths after structural changes
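
One of these checks, operations on deleted nodes, can be illustrated with a simple pass over the operation sequence. The sketch below is an assumption about how such a check might work for XPath-style paths, not a description of DATAMIMIC's validator. The function name find_conflicts is hypothetical, and the check compares path strings only.

```python
def find_conflicts(operations):
    """Return (index, path) for operations that target a path deleted earlier in the sequence."""
    deleted = []
    conflicts = []
    for index, (name, path) in enumerate(operations):
        # A conflict is the deleted path itself or anything below it.
        if any(path == gone or path.startswith(gone + "/") for gone in deleted):
            conflicts.append((index, path))
        if name == "delete":
            deleted.append(path)
    return conflicts

sequence = [
    ("set", "//user/name"),
    ("delete", "//user/address"),
    ("set", "//user/address/city"),  # targets a child of the deleted node
    ("set", "//user/addressbook"),   # different node that only shares a string prefix
]
conflicts = find_conflicts(sequence)
```

Only the third operation is reported, because the comparison requires a path separator after the deleted path rather than a bare string prefix.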

Integration with Generators

Operate seamlessly integrates with DATAMIMIC generators:

  • Use generator expressions in set operations
  • Combine with variables for dynamic content
  • Support for all DATAMIMIC generator types

Examples by Use Case

E-commerce Product Catalog

id|template|op1|op2|op3
prod001|product.xml|set(//product/name,{ProductNameGenerator})|set(//product/price,{RandomDoubleGenerator(10,1000)})|set(//product/category,Electronics)

Financial Services

id|template|op1|op2|op3
loan001|loan_app.xml|set(//loan/applicant,{PersonGenerator})|set(//loan/amount,{RandomIntegerGenerator(1000,100000)})|set(//loan/status,pending)

Government Permits

id|template|op1|op2
permit001|permit.xml|set(//permit/id,PERMIT-{IncrementGenerator})|set(//permit/issueDate,{DateTimeGenerator})