Scripting API

The Scripting API provides read-only access to DATAMIMIC context data from custom Python scripts. Use it to create reusable functions that can be called from script= expressions.

Overview

Feature | Description
Purpose | Access context data from custom Python functions
Access  | Read-only (returns copies, not live references)
Scope   | Only works during expression evaluation

Quick Start

1. Create a Python Script

# my_functions.py
from datamimic_ee.scripting import load_memstore, get_variable

def get_customer_name(customer_id: str) -> str:
    """Look up customer name from memstore."""
    customers = load_memstore("customers")
    for customer in customers:
        if customer.get("id") == customer_id:
            return customer.get("name", "Unknown")
    return "Not Found"

def format_amount(multiplier: float = 1.0) -> str:
    """Format the current record's amount with currency."""
    amount = get_variable("amount", 0)
    currency = get_variable("currency", "USD")
    return f"{currency} {amount * multiplier:,.2f}"

2. Use in XML Model

<setup>
    <!-- Load your script -->
    <execute uri="my_functions.py"/>

    <!-- Populate memstore with customer data -->
    <memstore id="mem"/>
    <generate name="customers" source="data/customers.ent.csv" target="mem"/>

    <!-- Use your functions in expressions -->
    <generate name="invoices" source="data/invoices.ent.csv" target="JSON">
        <!-- Source columns are already in context; use <key> only to override/add fields -->
        <key name="customer_name" script="get_customer_name(customer_id)"/>
        <key name="formatted_total" script="format_amount(1.1)"/>
    </generate>
</setup>

API Reference

Import

from datamimic_ee.scripting import (
    load_memstore,
    load_memstore_page,
    get_variable,
    get_property,
    get_current_product,
    get_record_position,
    get_context_info,
)

Functions

load_memstore(product_name) -> list[dict]

Load all records from a memstore.

def lookup_product(product_id: str) -> dict:
    products = load_memstore("products")
    # Use .get so records without an "id" key are skipped instead of raising KeyError
    return next((p for p in products if p.get("id") == product_id), {})

load_memstore_page(product_name, skip, limit) -> list[dict]

Load a page of records from memstore (for large datasets).

def get_first_10_customers() -> list[dict]:
    return load_memstore_page("customers", skip=0, limit=10)
  • skip: Number of records to skip (>= 0)
  • limit: Maximum records to return (> 0)
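
To walk a large memstore without loading it all at once, the page function can be wrapped in a generator. The following is a minimal sketch, not part of the Scripting API: iter_pages and its fetch_page parameter are hypothetical names, and the loop assumes that an empty page or a page shorter than the limit marks the end of the data, which is not stated above.

```python
from typing import Callable, Iterator


def iter_pages(
    fetch_page: Callable[[int, int], list[dict]],
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield records page by page (hypothetical helper)."""
    if page_size <= 0:
        raise ValueError("page_size must be > 0")
    skip = 0
    while True:
        page = fetch_page(skip, page_size)
        yield from page
        # Assumption: a short or empty page means there are no more records.
        if len(page) < page_size:
            return
        skip += page_size
```

Inside a function called from an expression, it could be wired up as iter_pages(lambda skip, limit: load_memstore_page("customers", skip, limit)).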

get_variable(name, default=None) -> Any

Get a variable value from the current record context.

def double_quantity() -> int:
    qty = get_variable("quantity", 0)
    return qty * 2

get_property(name, default=None) -> Any

Get a property from the root context (e.g., from .properties files).

def get_environment() -> str:
    return get_property("env", "development")

get_current_product() -> dict | None

Get a copy of the current product (all keys generated so far).

def summarize_record() -> str:
    # get_current_product() may return None, so fall back to an empty dict
    product = get_current_product() or {}
    return f"ID: {product.get('id')}, Name: {product.get('name')}"

get_record_position() -> int | None

Get the 1-based position of the current record.

def is_first_record() -> bool:
    return get_record_position() == 1
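
Because the signature allows None, position-based logic is easier to test when the position is passed in as an argument. A minimal sketch follows; batch_number is a hypothetical helper and the batch size of 100 is an arbitrary example value.

```python
from typing import Optional


def batch_number(position: Optional[int], batch_size: int = 100) -> Optional[int]:
    """Map a 1-based record position to a 1-based batch number."""
    if position is None:
        # No position available, so no batch can be assigned.
        return None
    return (position - 1) // batch_size + 1
```

In a script it might be called as batch_number(get_record_position()).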

get_context_info() -> dict

Get a read-only snapshot of context metadata.

def log_context() -> str:
    info = get_context_info()
    return f"Task: {info['task_id']}, Record: {info['record_position']}"

Returns:

  • task_id: Current task identifier
  • descriptor_name: Name of the XML descriptor
  • record_position: Current record number (1-based)
  • current_name: Name of the current generate block

Important: Evaluation-Only

Call Inside Functions Only

The Scripting API only works during expression evaluation. Do not call these functions at module import time.

Correct:

def my_function():
    # Called during expression evaluation
    return load_memstore("data")  # Works

Incorrect:

# Called at import time - will error!
data = load_memstore("data")  # Error: I858

Common Patterns

Lookup Tables

from datamimic_ee.scripting import load_memstore

_cache = {}

def lookup_country_name(code: str) -> str:
    if "countries" not in _cache:
        _cache["countries"] = {
            c["code"]: c["name"]
            for c in load_memstore("countries")
        }
    return _cache["countries"].get(code, code)

Cache Timing

The cache is populated on the first function call, so the memstore must be fully populated before any lookup function runs. In your XML model, place the <generate target="mem"> statement before any <generate> that uses the lookup.

Conditional Logic

from datamimic_ee.scripting import get_variable

def calculate_discount() -> float:
    amount = get_variable("amount", 0)
    customer_type = get_variable("customer_type", "regular")

    if customer_type == "premium":
        return amount * 0.20
    elif customer_type == "business":
        return amount * 0.15
    return amount * 0.05
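
When the number of customer types grows, the same rule can be expressed as a lookup table. The sketch below is an alternative formulation using the rates from the example above; DISCOUNT_RATES, DEFAULT_RATE, and discount_for are hypothetical names, and the values are passed in as arguments so the function can be tested without a DATAMIMIC context.

```python
# Rates taken from the example above; unknown types fall back to DEFAULT_RATE.
DISCOUNT_RATES = {"premium": 0.20, "business": 0.15}
DEFAULT_RATE = 0.05


def discount_for(amount: float, customer_type: str) -> float:
    """Return the discount for an amount based on the customer type."""
    return amount * DISCOUNT_RATES.get(customer_type, DEFAULT_RATE)
```

A thin wrapper could then call discount_for(get_variable("amount", 0), get_variable("customer_type", "regular")).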

Cross-Record Aggregation

from datamimic_ee.scripting import load_memstore

def total_order_value(customer_id: str) -> float:
    orders = load_memstore("orders")
    return sum(
        o.get("amount", 0)
        for o in orders
        if o.get("customer_id") == customer_id
    )
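
The function above scans every order on each call, which can become slow when it runs once per generated record. If totals are needed for many customers, grouping them in a single pass may be cheaper. A minimal sketch, with totals_by_customer as a hypothetical helper that takes the order list as an argument:

```python
from collections import defaultdict


def totals_by_customer(orders: list[dict]) -> dict[str, float]:
    """Sum order amounts per customer_id in one pass over the orders."""
    totals: defaultdict[str, float] = defaultdict(float)
    for order in orders:
        customer_id = order.get("customer_id")
        if customer_id is not None:
            # Orders without an amount contribute 0, as in the example above.
            totals[customer_id] += order.get("amount", 0)
    return dict(totals)
```

The result of totals_by_customer(load_memstore("orders")) could be kept in a module-level cache, following the Lookup Tables pattern.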

Dynamic Formatting

from datamimic_ee.scripting import get_variable, get_property

def format_currency() -> str:
    amount = get_variable("amount", 0)
    locale = get_property("locale", "en_US")

    if locale.startswith("de"):
        return f"{amount:,.2f} EUR".replace(",", "X").replace(".", ",").replace("X", ".")
    return f"${amount:,.2f}"
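
The chained replace calls above swap the thousands and decimal separators through a temporary placeholder. The same swap can be done in one step with str.translate. This is a sketch of an equivalent formulation; format_money and _SWAP_SEPARATORS are hypothetical names, and the amount and locale are passed in as arguments for testability.

```python
# Translation table that swaps "," and "." in a single pass.
_SWAP_SEPARATORS = str.maketrans({",": ".", ".": ","})


def format_money(amount: float, locale: str = "en_US") -> str:
    """Format an amount in German or US style, mirroring the example above."""
    formatted = f"{amount:,.2f}"
    if locale.startswith("de"):
        return f"{formatted.translate(_SWAP_SEPARATORS)} EUR"
    return f"${formatted}"
```

For example, format_money(1234.56, "de_DE") gives "1.234,56 EUR".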

Best Practices

  1. Define functions; don't call the API at import time. It only works during expression evaluation.
  2. Use default values so that missing variables and properties are handled gracefully.
  3. Cache expensive lookups in module-level caches when the same data is read repeatedly.
  4. Keep functions pure: don't try to modify the context; return computed values instead.
  5. Document your functions so other team members understand how to use them.

Error Codes

Code | Description
I858 | Scripting API called outside expression evaluation
I859 | Invalid product_name (must be a non-empty string)
I860 | Invalid pagination (skip must be >= 0, limit must be > 0)

See Error Codes Reference for more details.
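
The constraints behind I859 and I860 can also be checked in your own code before calling the API. The sketch below is illustrative only: check_page_args is a hypothetical helper, and it raises ValueError because the exception type that DATAMIMIC itself raises is not stated here.

```python
def check_page_args(product_name: object, skip: int, limit: int) -> None:
    """Validate pagination arguments against the documented constraints."""
    if not isinstance(product_name, str) or not product_name:
        raise ValueError("product_name must be a non-empty string (cf. I859)")
    if skip < 0 or limit <= 0:
        raise ValueError("skip must be >= 0 and limit must be > 0 (cf. I860)")
```

Calling check_page_args("customers", 0, 10) returns silently, while an empty name or a negative skip raises ValueError.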