Scripting API

The Scripting API provides read-only access to DATAMIMIC context data from custom Python scripts. Use it to create reusable functions that can be called from script= expressions.

Overview

Feature	Description
Purpose	Access context data from custom Python functions
Access	Read-only (returns copies, not live references)
Scope	Only works during expression evaluation

Quick Start

1. Create a Python Script

# my_functions.py
from datamimic_ee.scripting import load_memstore, get_variable

def get_customer_name(customer_id: str) -> str:
    """Look up customer name from memstore."""
    customers = load_memstore("customers")
    for customer in customers:
        if customer.get("id") == customer_id:
            return customer.get("name", "Unknown")
    return "Not Found"

def format_amount(multiplier: float = 1.0) -> str:
    """Format the current record's amount with currency."""
    amount = get_variable("amount", 0)
    currency = get_variable("currency", "USD")
    return f"{currency} {amount * multiplier:,.2f}"

2. Use in XML Model

<setup>
    <!-- Load your script -->
    <execute uri="my_functions.py"/>

    <!-- Populate memstore with customer data -->
    <memstore id="mem"/>
    <generate name="customers" source="data/customers.ent.csv" target="mem"/>

    <!-- Use your functions in expressions -->
    <generate name="invoices" source="data/invoices.ent.csv" target="JSON">
        <!-- Source columns are already in context; use <key> only to override/add fields -->
        <key name="customer_name" script="get_customer_name(customer_id)"/>
        <key name="formatted_total" script="format_amount(1.1)"/>
    </generate>
</setup>
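
To try the Quick Start functions outside a DATAMIMIC run, for example in a unit test, one option is to register stand-in modules under the `datamimic_ee.scripting` name before importing. The sketch below is a test harness built on assumptions, not part of DATAMIMIC: the `_FAKE_*` dictionaries and `_fake_*` functions are hypothetical stubs that only mimic the two signatures documented on this page.

```python
import sys
import types

# Hypothetical in-memory stand-ins for the DATAMIMIC context (test data only).
_FAKE_MEMSTORE = {"customers": [{"id": "C1", "name": "Ada"}]}
_FAKE_VARIABLES = {"amount": 100, "currency": "EUR"}


def _fake_load_memstore(product_name):
    # Return copies, mirroring the documented read-only behaviour.
    return [dict(record) for record in _FAKE_MEMSTORE.get(product_name, [])]


def _fake_get_variable(name, default=None):
    return _FAKE_VARIABLES.get(name, default)


# Register stub modules so the import below resolves without DATAMIMIC installed.
_pkg = types.ModuleType("datamimic_ee")
_scripting = types.ModuleType("datamimic_ee.scripting")
_scripting.load_memstore = _fake_load_memstore
_scripting.get_variable = _fake_get_variable
_pkg.scripting = _scripting
sys.modules["datamimic_ee"] = _pkg
sys.modules["datamimic_ee.scripting"] = _scripting

from datamimic_ee.scripting import load_memstore, get_variable  # resolves to the stubs


def get_customer_name(customer_id: str) -> str:
    for customer in load_memstore("customers"):
        if customer.get("id") == customer_id:
            return customer.get("name", "Unknown")
    return "Not Found"


def format_amount(multiplier: float = 1.0) -> str:
    amount = get_variable("amount", 0)
    currency = get_variable("currency", "USD")
    return f"{currency} {amount * multiplier:,.2f}"


print(get_customer_name("C1"))  # Ada
print(format_amount(1.1))       # EUR 110.00
```

In a real project the stubs would more likely live in a test fixture, with the functions imported from `my_functions.py` rather than redefined inline.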

API Reference

Import

from datamimic_ee.scripting import (
    load_memstore,
    load_memstore_page,
    get_variable,
    get_property,
    get_current_product,
    get_record_position,
    get_context_info,
)

Functions

`load_memstore(product_name) -> list[dict]`

Load all records from a memstore.

def lookup_product(product_id: str) -> dict:
    products = load_memstore("products")
    # Use .get() so records without an "id" key are skipped instead of raising KeyError
    return next((p for p in products if p.get("id") == product_id), {})

`load_memstore_page(product_name, skip, limit) -> list[dict]`

Load a page of records from memstore (for large datasets).

def get_first_10_customers() -> list[dict]:
    return load_memstore_page("customers", skip=0, limit=10)

skip: Number of records to skip (>= 0)
limit: Maximum records to return (> 0)
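
The `skip` and `limit` parameters can be combined into a loop that walks a large memstore one page at a time. The following is a minimal sketch under two assumptions this page does not confirm: that a page shorter than `limit` marks the end of the data, and that a request past the end returns an empty list. `iter_records` and the `active` field are hypothetical names used only for illustration.

```python
from typing import Callable, Iterator


def iter_records(
    fetch_page: Callable[[int, int], list[dict]],
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield records page by page until a short or empty page is returned."""
    if page_size <= 0:
        raise ValueError("page_size must be > 0")
    skip = 0
    while True:
        page = fetch_page(skip, page_size)
        yield from page
        if len(page) < page_size:
            break  # assumed end-of-data signal
        skip += page_size


def count_active_customers() -> int:
    # Imported here only so this sketch loads without DATAMIMIC installed.
    from datamimic_ee.scripting import load_memstore_page

    def fetch(skip: int, limit: int) -> list[dict]:
        return load_memstore_page("customers", skip=skip, limit=limit)

    # "active" is a hypothetical field on the customer records.
    return sum(1 for c in iter_records(fetch, page_size=500) if c.get("active"))
```

Passing the page fetcher in as a callable keeps the loop testable with a plain list slice in place of the memstore.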

`get_variable(name, default=None) -> Any`

Get a variable value from the current record context.

def double_quantity() -> int:
    qty = get_variable("quantity", 0)
    return qty * 2

`get_property(name, default=None) -> Any`

Get a property from the root context (e.g., from .properties files).

def get_environment() -> str:
    return get_property("env", "development")

`get_current_product() -> dict | None`

Get a copy of the current product (all keys generated so far).

def summarize_record() -> str:
    # get_current_product() may return None, so fall back to an empty dict
    product = get_current_product() or {}
    return f"ID: {product.get('id')}, Name: {product.get('name')}"

`get_record_position() -> int | None`

Get the 1-based position of the current record.

def is_first_record() -> bool:
    return get_record_position() == 1
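
Because the position is 1-based and may be `None`, helpers that index into a list need a small conversion. As an illustrative sketch, the hypothetical `pick_by_position` below cycles through a fixed set of options by record position; it takes the position as an argument so it can be exercised without a running context.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def pick_by_position(position: Optional[int], options: Sequence[T]) -> Optional[T]:
    """Cycle through options using a 1-based record position."""
    if position is None or not options:
        return None
    # Convert the 1-based position to a 0-based index before wrapping around.
    return options[(position - 1) % len(options)]
```

Inside a script function this could be called as `pick_by_position(get_record_position(), ["EU", "US", "APAC"])`, where the region names are made-up sample values.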

`get_context_info() -> dict`

Get a read-only snapshot of context metadata.

def log_context() -> str:
    info = get_context_info()
    return f"Task: {info['task_id']}, Record: {info['record_position']}"

Returns:

task_id: Current task identifier
descriptor_name: Name of the XML descriptor
record_position: Current record number (1-based)
current_name: Name of the current generate block
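
The four keys listed above can be turned into a compact label, for instance to tag generated values for debugging. This sketch assumes a key might be absent or `None` in some contexts, which this page does not state, and substitutes `?` in that case. `describe_context` is a hypothetical helper that accepts the snapshot as a plain dict.

```python
def _field(info: dict, key: str) -> str:
    # Treat a missing key and a None value the same way.
    value = info.get(key)
    return "?" if value is None else str(value)


def describe_context(info: dict) -> str:
    """Format a context snapshot as '[task] descriptor / name #position'."""
    return (
        f"[{_field(info, 'task_id')}] "
        f"{_field(info, 'descriptor_name')} / "
        f"{_field(info, 'current_name')} "
        f"#{_field(info, 'record_position')}"
    )
```

In a script function it would be called as `describe_context(get_context_info())`.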

Important: Evaluation-Only

Call Inside Functions Only

The Scripting API only works during expression evaluation. Do not call these functions at module import time.

Correct:

def my_function():
    # Called during expression evaluation
    return load_memstore("data")  # Works

Incorrect:

# Called at import time - will error!
data = load_memstore("data")  # Error: I858
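
When module-level data is still wanted, one possible workaround is to bind a deferred loader at import time and let the first call during evaluation do the actual work. The sketch below uses `functools.lru_cache` for this; `lazy_table`, `_load_countries`, and the `countries` memstore name are hypothetical, and the wrapper returns a tuple so callers cannot append to the cached result.

```python
from functools import lru_cache
from typing import Callable


def lazy_table(loader: Callable[[], list[dict]]) -> Callable[[], tuple]:
    """Wrap loader so it runs once, on first call, instead of at import time."""

    @lru_cache(maxsize=1)
    def get() -> tuple:
        return tuple(loader())

    return get


def _load_countries() -> list[dict]:
    # Imported here only so this sketch loads without DATAMIMIC installed.
    from datamimic_ee.scripting import load_memstore

    return load_memstore("countries")


# Safe at module level: this only creates the wrapper, nothing is loaded yet.
countries = lazy_table(_load_countries)


def country_count() -> int:
    # The memstore is read on the first call and reused afterwards.
    return len(countries())
```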

Common Patterns

Lookup Tables

from datamimic_ee.scripting import load_memstore

_cache = {}

def lookup_country_name(code: str) -> str:
    if "countries" not in _cache:
        _cache["countries"] = {
            c["code"]: c["name"]
            for c in load_memstore("countries")
        }
    return _cache["countries"].get(code, code)

Cache Timing

The cache is populated on first function call. Ensure the memstore is fully populated before calling lookup functions. In your XML model, place the <generate target="mem"> statement before any generate that uses the lookup.
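
As an extra safeguard against caching too early, a lookup helper can refuse to store an empty result and retry on the next call. This is a sketch only: it assumes an unpopulated memstore yields an empty list rather than raising an error, which this page does not confirm. `lookup_table` is a hypothetical helper, and the loader is passed in so the logic can be tested without DATAMIMIC.

```python
from typing import Callable

_tables: dict[str, dict] = {}


def lookup_table(
    name: str,
    loader: Callable[[str], list[dict]],
    key_field: str,
    value_field: str,
) -> dict:
    """Build a key -> value table from a memstore, caching only non-empty results."""
    table = _tables.get(name)
    if table is None:
        table = {r[key_field]: r[value_field] for r in loader(name)}
        if table:
            # Keep the table only once it has content, so an early call
            # against an empty memstore does not freeze an empty cache.
            _tables[name] = table
    return table
```

With the real API the call would look like `lookup_table("countries", load_memstore, "code", "name")`. Correct statement ordering in the XML model remains the primary fix; the guard only limits the damage of a premature call.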

Conditional Logic

from datamimic_ee.scripting import get_variable

def calculate_discount() -> float:
    amount = get_variable("amount", 0)
    customer_type = get_variable("customer_type", "regular")

    if customer_type == "premium":
        return amount * 0.20
    elif customer_type == "business":
        return amount * 0.15
    return amount * 0.05

Cross-Record Aggregation

from datamimic_ee.scripting import load_memstore

def total_order_value(customer_id: str) -> float:
    orders = load_memstore("orders")
    return sum(
        o.get("amount", 0)
        for o in orders
        if o.get("customer_id") == customer_id
    )
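
The function above rescans every order each time it is called, so the cost grows with records times orders. If that becomes noticeable, the totals can be computed in a single pass and reused. The following is a sketch of that idea; `totals_by_customer` and `total_order_value_cached` are hypothetical names, and the module-level cache assumes the `orders` memstore is complete before the first call.

```python
from collections import defaultdict
from typing import Optional


def totals_by_customer(orders: list[dict]) -> dict:
    """Sum order amounts per customer_id in one pass over the orders."""
    totals = defaultdict(float)
    for order in orders:
        # Treat a missing or None amount as zero.
        totals[order.get("customer_id")] += order.get("amount", 0) or 0
    return dict(totals)


_totals: Optional[dict] = None


def total_order_value_cached(customer_id: str) -> float:
    global _totals
    if _totals is None:
        # Imported here only so this sketch loads without DATAMIMIC installed.
        from datamimic_ee.scripting import load_memstore

        _totals = totals_by_customer(load_memstore("orders"))
    return _totals.get(customer_id, 0.0)
```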

Dynamic Formatting

from datamimic_ee.scripting import get_variable, get_property

def format_currency() -> str:
    amount = get_variable("amount", 0)
    locale = get_property("locale", "en_US")

    if locale.startswith("de"):
        return f"{amount:,.2f} EUR".replace(",", "X").replace(".", ",").replace("X", ".")
    return f"${amount:,.2f}"

Best Practices

Define functions; don't call the API at import time - It only works during expression evaluation
Use default values - Handle missing variables gracefully
Cache expensive lookups - Use module-level caches for repeated lookups
Keep functions pure - Don't modify context; return computed values
Document your functions - Help other team members understand usage
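
One way to apply "use default values" and "keep functions pure" together is to split each script function into a pure calculation and a thin wrapper that reads the context. The sketch below restructures the discount example in that style; `discount_for`, `DISCOUNT_RATES`, and `DEFAULT_RATE` are hypothetical names introduced for illustration.

```python
DISCOUNT_RATES = {"premium": 0.20, "business": 0.15}
DEFAULT_RATE = 0.05


def discount_for(amount: float, customer_type: str) -> float:
    """Pure calculation: no context access, so it can be tested directly."""
    return amount * DISCOUNT_RATES.get(customer_type, DEFAULT_RATE)


def calculate_discount() -> float:
    """Thin wrapper that reads the context and delegates to the pure function."""
    # Imported here only so this sketch loads without DATAMIMIC installed.
    from datamimic_ee.scripting import get_variable

    return discount_for(
        get_variable("amount", 0),
        get_variable("customer_type", "regular"),
    )
```

Only `calculate_discount` needs a running evaluation; `discount_for` can be covered by ordinary unit tests.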

Error Codes

Code	Description
I858	Scripting API called outside expression evaluation
I859	Invalid product_name (must be non-empty string)
I860	Invalid pagination (skip must be >= 0, limit must be > 0)
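
The constraints behind I859 and I860 can also be checked in your own code before calling the API, which may give clearer messages when arguments are computed dynamically. This is a hypothetical pre-check, not DATAMIMIC's own validation: how the engine raises these codes is not described here, so `ValueError` is an assumption.

```python
def check_page_args(product_name: object, skip: object, limit: object) -> None:
    """Raise ValueError if the arguments would violate the documented constraints."""
    if not isinstance(product_name, str) or not product_name:
        raise ValueError("product_name must be a non-empty string (cf. I859)")
    if not isinstance(skip, int) or skip < 0:
        raise ValueError("skip must be an integer >= 0 (cf. I860)")
    if not isinstance(limit, int) or limit <= 0:
        raise ValueError("limit must be an integer > 0 (cf. I860)")
```

A script function would call `check_page_args(name, skip, limit)` immediately before `load_memstore_page(name, skip, limit)`.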

See Error Codes Reference for more details.