Scripting API
The Scripting API provides read-only access to DATAMIMIC context data from custom Python scripts. Use it to create reusable functions that can be called from script= expressions.
Overview
| Feature | Description |
|---------|-------------|
| Purpose | Access context data from custom Python functions |
| Access | Read-only (returns copies, not live references) |
| Scope | Only works during expression evaluation |
Quick Start
1. Create a Python Script
| # my_functions.py
from datamimic_ee.scripting import load_memstore, get_variable

def get_customer_name(customer_id: str) -> str:
    """Look up customer name from memstore."""
    customers = load_memstore("customers")
    for customer in customers:
        if customer.get("id") == customer_id:
            return customer.get("name", "Unknown")
    return "Not Found"

def format_amount(multiplier: float = 1.0) -> str:
    """Format the current record's amount with currency."""
    amount = get_variable("amount", 0)
    currency = get_variable("currency", "USD")
    return f"{currency} {amount * multiplier:,.2f}"
|
2. Use in XML Model
| <setup>
    <!-- Load your script -->
    <execute uri="my_functions.py"/>

    <!-- Populate memstore with customer data -->
    <memstore id="mem"/>
    <generate name="customers" source="data/customers.ent.csv" target="mem"/>

    <!-- Use your functions in expressions -->
    <generate name="invoices" source="data/invoices.ent.csv" target="JSON">
        <!-- Source columns are already in context; use <key> only to override/add fields -->
        <key name="customer_name" script="get_customer_name(customer_id)"/>
        <key name="formatted_total" script="format_amount(1.1)"/>
    </generate>
</setup>
|
API Reference
Import
| from datamimic_ee.scripting import (
    load_memstore,
    load_memstore_page,
    get_variable,
    get_property,
    get_current_product,
    get_record_position,
    get_context_info,
)
|
Functions
load_memstore(product_name) -> list[dict]
Load all records from a memstore.
| def lookup_product(product_id: str) -> dict:
    products = load_memstore("products")
    return next((p for p in products if p["id"] == product_id), {})
|
load_memstore_page(product_name, skip, limit) -> list[dict]
Load a page of records from memstore (for large datasets).
| def get_first_10_customers() -> list[dict]:
    return load_memstore_page("customers", skip=0, limit=10)
|
- skip: Number of records to skip (>= 0)
- limit: Maximum records to return (> 0)
get_variable(name, default=None) -> Any
Get a variable value from the current record context.
| def double_quantity() -> int:
    qty = get_variable("quantity", 0)
    return qty * 2
|
get_property(name, default=None) -> Any
Get a property from the root context (e.g., from .properties files).
| def get_environment() -> str:
    return get_property("env", "development")
|
get_current_product() -> dict | None
Get a copy of the current product (all keys generated so far).
| def summarize_record() -> str:
    # get_current_product() may return None, so fall back to an empty dict
    product = get_current_product() or {}
    return f"ID: {product.get('id')}, Name: {product.get('name')}"
|
get_record_position() -> int | None
Get the 1-based position of the current record.
| def is_first_record() -> bool:
    return get_record_position() == 1
|
get_context_info() -> dict
Get a read-only snapshot of context metadata.
| def log_context() -> str:
    info = get_context_info()
    return f"Task: {info['task_id']}, Record: {info['record_position']}"
|
Returns:
- task_id: Current task identifier
- descriptor_name: Name of the XML descriptor
- record_position: Current record number (1-based)
- current_name: Name of the current generate block
Important: Evaluation-Only
Call Inside Functions Only
The Scripting API only works during expression evaluation. Do not call these functions at module import time.
Correct:
| def my_function():
    # Called during expression evaluation
    return load_memstore("data")  # Works
|
Incorrect:
| # Called at import time - will error!
data = load_memstore("data") # Error: I858
|
Common Patterns
Lookup Tables
| from datamimic_ee.scripting import load_memstore
_cache = {}

def lookup_country_name(code: str) -> str:
    if "countries" not in _cache:
        _cache["countries"] = {
            c["code"]: c["name"]
            for c in load_memstore("countries")
        }
    return _cache["countries"].get(code, code)
|
Cache Timing
The cache is populated on first function call. Ensure the memstore is fully populated before calling lookup functions. In your XML model, place the <generate target="mem"> statement before any generate that uses the lookup.
Conditional Logic
| from datamimic_ee.scripting import get_variable
def calculate_discount() -> float:
    amount = get_variable("amount", 0)
    customer_type = get_variable("customer_type", "regular")
    if customer_type == "premium":
        return amount * 0.20
    elif customer_type == "business":
        return amount * 0.15
    return amount * 0.05
|
Cross-Record Aggregation
| from datamimic_ee.scripting import load_memstore
def total_order_value(customer_id: str) -> float:
    orders = load_memstore("orders")
    return sum(
        o.get("amount", 0)
        for o in orders
        if o.get("customer_id") == customer_id
    )
|
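The function above scans every order on each call, which can become slow when it runs once per generated record. A possible alternative is to compute all totals in one pass and look them up afterwards. In this sketch, `build_totals` and `sample_orders` are hypothetical names, and the field names `customer_id` and `amount` are taken from the example above.

```python
from collections import defaultdict


def build_totals(orders: list) -> dict:
    """Sum order amounts per customer_id in a single pass."""
    totals = defaultdict(float)
    for order in orders:
        customer_id = order.get("customer_id")
        if customer_id is None:  # skip orders without a customer
            continue
        totals[customer_id] += order.get("amount", 0)
    return dict(totals)


sample_orders = [
    {"customer_id": "c1", "amount": 10.0},
    {"customer_id": "c2", "amount": 5.0},
    {"customer_id": "c1", "amount": 2.5},
    {"amount": 99.0},  # no customer_id, ignored
]
totals = build_totals(sample_orders)
```

In a script, the result of `build_totals(load_memstore("orders"))` could be kept in a module-level cache as in the Lookup Tables pattern, with the same caveat about populating the memstore first.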
| from datamimic_ee.scripting import get_variable, get_property
def format_currency() -> str:
    amount = get_variable("amount", 0)
    locale = get_property("locale", "en_US")
    if locale.startswith("de"):
        return f"{amount:,.2f} EUR".replace(",", "X").replace(".", ",").replace("X", ".")
    return f"${amount:,.2f}"
|
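The chained `replace` calls above swap the thousands and decimal separators through a temporary placeholder. As a hedged alternative, the same swap can be done in one pass with `str.translate`. The helper `format_de` is hypothetical and takes the amount as a parameter so it can run outside DATAMIMIC.

```python
def format_de(amount: float) -> str:
    """Format with '.' as thousands separator and ',' as decimal mark."""
    # Map ',' -> '.' and '.' -> ',' simultaneously, so no placeholder is needed.
    swapped = f"{amount:,.2f}".translate(str.maketrans(",.", ".,"))
    return f"{swapped} EUR"


example = format_de(1234567.891)  # "1.234.567,89 EUR"
```

Both variants produce the same string; the translate version avoids depending on a placeholder character that might one day appear in the formatted text.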
Best Practices
- Define functions, don't execute at import: the API only works during expression evaluation.
- Use default values: handle missing variables gracefully.
- Cache expensive lookups: use module-level caches for repeated lookups.
- Keep functions pure: don't modify context; return computed values.
- Document your functions: help other team members understand usage.
Error Codes
| Code | Description |
|------|-------------|
| I858 | Scripting API called outside expression evaluation |
| I859 | Invalid product_name (must be non-empty string) |
| I860 | Invalid pagination (skip must be >= 0, limit must be > 0) |
See Error Codes Reference for more details.
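The argument rules behind I859 and I860 can be checked locally before calling the API, for example in unit tests of your script functions. This is a sketch only: `expected_error_code` is a hypothetical helper, not part of the Scripting API, and the order in which DATAMIMIC itself checks the arguments is an assumption.

```python
from typing import Optional


def expected_error_code(product_name: object, skip: int, limit: int) -> Optional[str]:
    """Return the error code the table suggests for these arguments, or None."""
    if not isinstance(product_name, str) or not product_name:
        return "I859"  # product_name must be a non-empty string
    if skip < 0 or limit <= 0:
        return "I860"  # skip must be >= 0, limit must be > 0
    return None


ok = expected_error_code("customers", 0, 10)
bad_name = expected_error_code("", 0, 10)
bad_page = expected_error_code("customers", -1, 10)
```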