
Domain Generators

DATAMIMIC domains are a vehicle for defining, bundling and reusing domain-specific data generation, e.g. for personal data, addresses, internet, banking and telecom. They may be localized to specific languages and grouped into hierarchical datasets, e.g. for continents, countries and regions.

DATAMIMIC includes several domains that provide simple implementations of specific data generation. If you need further domains, your feedback and contributions are highly appreciated.

The following domains are included:

  • person: Data related to a person

  • address: Data related to contacting a person by post

  • organization: Organization data

  • finance: Finance data

  • net: Internet and network related data

  • product: Product-related data

  • br and us: Country specific data

Additionally, DATAMIMIC includes an easy way to use the FAKER library, documented below, for additional datasets.

The person domain has the following components:

  • Person: Generates Person entities

  • AcademicTitleGenerator: Generates academic titles. The generator can be configured with academic_title_quota.

  • NobilityTitleGenerator: Generates nobility titles. The generator can be configured with noble_quota.

  • GivenNameGenerator: Generates given names

  • FamilyNameGenerator: Generates family names

  • BirthDateGenerator: Generates birth dates

  • GenderGenerator: Generates gender values. The generated gender is one of MALE, FEMALE or OTHER. The generator is configured with the properties female_quota and other_gender_quota; female_quota takes priority over other_gender_quota.

  • EmailAddressGenerator: Generates Email addresses
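
The generators listed above can presumably also be used on their own, without the Person entity. The following is a minimal sketch under two assumptions: that they are referenced through the generator attribute like the other generators in this document, and that configuration properties are passed in the same constructor-style syntax that the Person entity uses. The name person_parts and the quota values are made up for illustration.

```xml
<generate name="person_parts" count="5" target="CSV">
  <!-- Assumption: quotas are passed like the Person(...) properties -->
  <key name="gender" generator="GenderGenerator(female_quota=0.6, other_gender_quota=0.1)"/>
  <!-- The remaining generators are used with their default settings -->
  <key name="given_name" generator="GivenNameGenerator"/>
  <key name="family_name" generator="FamilyNameGenerator"/>
  <key name="birth_date" generator="BirthDateGenerator"/>
  <key name="email" generator="EmailAddressGenerator"/>
</generate>
```

Each key here draws its value from a separate generator, so the values are likely independent of each other. Use the Person entity when gender, salutation and given name need to match.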

Person Entity

Creates Person entities to be used for prototype-based data generation. It can be configured with the dataset and locale properties. A generated Person entity exposes the properties salutation, title, given_name, family_name (these four depend on the dataset), gender, birthdate, age and email. If the chosen dataset definition provides name weights, DATAMIMIC generates person names according to their statistical probability. The gender, salutation and given_name of a generated person are consistent with each other.

You can use the Person entity like this:

<generate name="user" count="5" target="CSV">
  <variable name="person" entity="Person(min_age=20, max_age=45, female_quota=0.5)" dataset="FR"/>
  <key name="salutation" script="person.salutation"/>
  <key name="name" script="f'{person.given_name} {person.family_name}'"/>
</generate>

to get output similar to this:

salutation|name
Mme|Claude Bernard
Mme|Jeannine Lefebvre
M.|Robert Bernard
M.|Roger Morel
Mme|Dominique Dubois

The Person entity has the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| salutation | String | Salutation (e.g. Mr/Mrs) |
| academic_title | String | Academic title (e.g. Dr) |
| name | String | Name (first name + last name) |
| given_name | String | Given name ('first name' in western countries) |
| family_name | String | Family name ('surname' in western countries) |
| gender | Gender | Gender (male, female or other) |
| birthdate | Date | Birth date |
| age | Integer | Current age |
| phone | String | Phone number as text |
| email | String | Email address |
| nobility_title | String | Nobility title (e.g. Baron/Baroness) |

Person Entity Properties

The Person Entity can be configured with several properties:

| Property | Description | Default Value |
|---|---|---|
| dataset | Either a region name or the two-letter ISO code of a country, e.g. US for the USA | The user's default country |
| min_age | The minimum age of generated persons | 15 |
| max_age | The maximum age of generated persons | 105 |
| female_quota | The quota of generated women (1 → 100%) | 0.49 |
| other_gender_quota | The quota of generated persons of other gender (1 → 100%) | 0.02 |
| noble_quota | The rate of generated nobility titles (1 → 100%) | 0.001 |
| academic_title_quota | The rate of generated academic titles (1 → 100%) | 0.5 |
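
As an illustration of the properties in the table above, the following sketch sets all of them at once. It assumes that every property can be passed in the same Person(...) syntax that the earlier example uses for min_age, max_age and female_quota. The name customer and the chosen values are hypothetical.

```xml
<generate name="customer" count="10" target="CSV">
  <!-- Assumption: all listed properties are accepted as Person(...) arguments -->
  <variable name="person"
            entity="Person(min_age=18, max_age=67, female_quota=0.5, other_gender_quota=0.05, noble_quota=0.01, academic_title_quota=0.2)"
            dataset="DE"/>
  <key name="academic_title" script="person.academic_title"/>
  <key name="nobility_title" script="person.nobility_title"/>
  <key name="name" script="person.name"/>
  <key name="gender" script="person.gender"/>
  <key name="age" script="person.age"/>
</generate>
```

With these values, roughly one in five generated persons would be expected to carry an academic title and about one in a hundred a nobility title.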

Supported countries

| country | code | remarks |
|---|---|---|
| Austria | AT | most common 120 given names with absolute weight, most common 40 family names with absolute weight |
| Australia | AU | most common 40 given names (unweighted), most common 20 family names with absolute weight |
| Belgium | BE | most common 38 given names (unweighted), most common 15 family names with absolute weight |
| Brazil | BR | most common 100 given names (unweighted), most common 29 family names (unweighted) |
| Canada | CA | most common 80 given names (unweighted), most common 20 family names (unweighted). No coupling between given name locale and family name locale |
| Switzerland | CH | most common 30 given names with absolute weight, most common 20 family names with absolute weight |
| China | CN | Chinese letters. Most common 46 given names (unweighted), most common 106 family names with absolute weight |
| Czech Republic | CZ | most common 20 given names with absolute weight, most common 20 family names with absolute weight. Female surnames are supported. |
| Germany | DE | most common 1998 given names with absolute weight, most common 3421 family names with absolute weight |
| Spain | ES | most common 40 given names (unweighted), most common 40 family names with absolute weight |
| Finland | FI | most common 785 given names (unweighted), most common 448 family names (unweighted) |
| France | FR | most common 100 given names (unweighted), most common 30 family names with relative weight |
| Ireland | IE | most common 41 given names (unweighted), most common 26 family names (unweighted) |
| Israel | IL | 264 given names (unweighted), most common 30 family names with relative weight |
| India | IN | most common 155 given names (unweighted), most common 50 family names (unweighted) |
| Italy | IT | most common 60 given names (unweighted), most common 20 family names (unweighted) |
| Japan | JP | Kanji letters. Most common 109 given names (unweighted), most common 50 family names with absolute weight |
| Republic of Korea | KR | Hangul letters. Most common 91 given names (unweighted), most common 182 family names with absolute weight |
| Netherlands | NL | 3228 given names (unweighted), most common 10 family names with absolute weight |
| Norway | NO | most common 300 given names (unweighted), most common 100 family names with absolute weight |
| New Zealand | NZ | most common 20 given names (unweighted), most common 8 family names (unweighted) |
| Poland | PL | most common 67 given names with absolute weight, most common 20,000 family names with absolute weight. Female surnames are supported. |
| Russia | RU | Cyrillic letters. Most common 33 given names with relative weight, most common 20 family names with relative weight. Female surnames are supported. |
| Sweden | SE | 779 given names (unweighted), most common 22 family names with relative weight |
| Slovenia | SI | most common 400 given names with relative weight, most common 200 family names with relative weight |
| Slovakia | SK | most common 20 given names with relative weight, most common 22 family names with relative weight |
| Turkey | TR | 1077 given names (unweighted), 37 family names (unweighted) |
| Ukraine | UA | most common 48 given names (unweighted), most common 20 family names (unweighted) |
| United Kingdom | GB | most common 20 given names (unweighted), most common 25 family names (unweighted) |
| USA | US | most common 600 given names and most common 1000 family names, both with absolute weight |

Address Generators

  • Address Entity: Generates addresses that pass simple validity checks: the city exists, the ZIP code matches and the phone number area codes are correct. The street names are random, so most addresses will not correspond to a real-world address.

  • Country Entity: Generates countries

  • City Entity: Generates Cities for a given country

  • PhoneNumberGenerator: Generates landline telephone numbers for a country

  • StreetNameGenerator: Generates street names for a given country

Address Entity

You can use the Address entity like this:

<generate name="data" count="5" target="CSV">
  <variable name="address" entity="Address" dataset="FR"/>
  <key name="street" script="address.street"/>
  <key name="home" script="f'{address.house_number} {address.street}, {address.city}'"/>
</generate>

to get output similar to this:

street|home
place de l'Eglise|7463 place de l'Eglise, Le Mans
rue du moulin|2695 rue du moulin, Champdieu
rue des écoles|1524 rue des écoles, Nibelle
rue du stade|149 rue du stade, Ruan
rue de la gare|4704 rue de la gare, Châteauroux

The generated Address entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| street | String | The regular street address |
| house_number | String | The house number associated with the street address |
| postal_code (zip_code) | String | The postal or ZIP code |
| city | String | The name of the city |
| state | String | The state or region |
| country | String | The country |
| country_code | String | Two-letter country code, equal to the dataset |
| phone | String | Home phone number |
| mobile_phone | String | Mobile phone number |
| fax | String | Fax number |
| organization | String | The associated organization |

City Entity

You can use the City entity like this:

<generate name="data" count="5" target="CSV">
  <variable name="city" entity="City" dataset="FR"/>
  <key name="name" script="city.name"/>
  <key name="state" script="f'{city.state}, {city.country}'"/>
</generate>

to get output similar to this:

name|state
Sauvigny-le-Beuréal|Bourgogne-Franche-Comté, France
Pau|Nouvelle-Aquitaine, France
Beaumont-de-Lomagne|Occitanie, France
Marat|Auvergne-Rhône-Alpes, France
Laifour|Grand Est, France

The generated City entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| name | String | The name of the city |
| name_extension | String | Additional name or descriptor for the city |
| state | String | The state or region where the city is located |
| country | String | The country where the city is located |
| area_code | String | The telephone area code for the city |
| language | String | The primary language spoken in the city |
| population | Integer | The population count of the city |
| postal_code | String | The postal code of the city |
| country_code | String | The country code of the city |

Country Entity

You can use the Country entity like this:

<generate name="data" count="5" target="CSV">
  <variable name="country" entity="Country"/>
  <key name="name" script="country.name"/>
  <key name="language" script="country.default_language_locale"/>
</generate>

to get output similar to this:

name|language
Ireland|en_IE
Russian Federation|ru_RU
Belgium|fr_BE
New Zealand|en_NZ
Venezuela|es_VE

The generated Country entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| iso_code | String | The ISO code representing the country |
| name | String | The official name of the country |
| default_language_locale | String | The default language locale used in the country |
| phone_code | String | The international phone code for the country |
| population | Integer | The population count of the country |

Supported countries

The following countries are supported for this domain:

| country | code | remarks |
|---|---|---|
| USA | US | Valid ZIP codes and area codes, no assurance that the street exists in this city. |
| United Kingdom | GB | Valid area codes, no postcodes, no assurance that the street exists in this city or the local phone number has the appropriate length. Contributions are welcome |
| Germany | DE | Valid ZIP codes and area codes, no assurance that the street exists in this city or the local phone number has the appropriate length |
| Switzerland | CH | Valid ZIP codes and area codes, no assurance that the street exists in this city or the local phone number has the appropriate length |
| Brazil | BR | Valid ZIP codes and area codes, no assurance that the street exists in this city or the local phone number has the appropriate length |

Update:

More countries are now supported: AD, AL, AT, AU, BA, BE, BG, CA, CY,
DK, EE, ES, FI, FR, GR, HR, HU, IE, IS, IT,
LI, LT, LU, LV, MC, NL, NO, NZ, PL, PT, RO,
RU, SE, SI, SK, SM, TH, TR, UA, VA, VE, VN

(Note that some of these countries are missing postcodes, and there is no assurance that the street exists in the city or that the local phone number has the appropriate length. Contributions are welcome.)

Net

The net domain provides the DomainGenerator, which generates Internet domain names:

<key name="domain" generator="DomainGenerator"/>

Organization

Provides the Company Entity along with the following generators:

  • CompanyNameGenerator, a generator for company names.

  • DepartmentNameGenerator, a generator for department names

If you use the CompanyNameGenerator and the DepartmentNameGenerator like this:

<generate name="company" count="5" target="CSV">
    <key name="name" generator="CompanyNameGenerator" />
    <key name="department" generator="DepartmentNameGenerator" />
</generate>

you get output like this:

name|department
Hotology|Legal
ClickBot|Legal
WireForge|Logistics
TradeSoft|Logistics
GigaSpace|Sales

Company names can be generated for the following countries:

| country | code | remarks |
|---|---|---|
| Germany | DE | none |
| USA | US | none |

Company Entity

The generated Company entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| city | String | The city where the company is located |
| country | String | The country where the company is located |
| country_code | String | The ISO country code |
| email | String | The company's email address |
| fax | String | The company's fax number |
| full_name | String | The full legal name of the company |
| house_number | String | The house number associated with the company's address |
| id | String | A unique identifier for the company |
| sector | String | The sector or industry in which the company operates |
| short_name | String | The short or common name of the company |
| office_phone | String | The company's office phone number |
| zip_code | String | The postal or ZIP code |
| state | String | The state or region where the company is located |
| street | String | The street address of the company |
| url | String | The company's website URL |

Example usage:

<generate name="company_list" count="10" target="CSV">
  <variable name="company" entity="Company" />
  <key name="city" script="company.city" />
  <key name="country" script="company.country" />
  <key name="country_code" script="company.country_code" />
  <key name="email" script="company.email" />
  <key name="fax" script="company.fax" />
  <key name="full_name" script="company.full_name" />
  <key name="house_number" script="company.house_number" />
  <key name="id" script="company.id" />
  <key name="sector" script="company.sector" />
  <key name="short_name" script="company.short_name" />
  <key name="office_phone" script="company.office_phone" />
  <key name="zip_code" script="company.zip_code" />
  <key name="state" script="company.state" />
  <key name="street" script="company.street" />
  <key name="url" script="company.url" />
</generate>

Finance

Generates and validates finance-related data. The following entities are provided:

  • Bank Entity: Entity for data around a bank.

  • BankAccount Entity: Entity for data around a bank account.

  • CreditCard Entity: Entity for data around a credit card.

Bank Entity

The generated Bank entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| bank_data | dict | All bank data including name, SWIFT, routing, etc. |
| name | String | Bank name (e.g., Chase Bank, Deutsche Bank) |
| swift_code | String | International SWIFT code for the bank |
| routing_number | String | Domestic routing number used in fund transfers |
| bank_code | String | Alias of the SWIFT code |
| bic | String | Bank Identifier Code |
| bin | String | Bank Identification Number |
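
A usage sketch for the Bank entity, modeled on the Company example earlier in this document. It assumes the entity is referenced by the name Bank, in line with how the other entities are referenced. The name bank_list is made up for illustration.

```xml
<generate name="bank_list" count="10" target="CSV">
  <!-- Assumption: the entity name matches the heading "Bank" -->
  <variable name="bank" entity="Bank" />
  <!-- Scalar fields from the table above -->
  <key name="name" script="bank.name" />
  <key name="swift_code" script="bank.swift_code" />
  <key name="routing_number" script="bank.routing_number" />
  <key name="bic" script="bank.bic" />
  <key name="bin" script="bank.bin" />
</generate>
```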

BankAccount Entity

The generated BankAccount entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| account_number | String | The bank account number |
| bank_code | String | The code identifying the bank |
| bank_name | String | The name of the bank |
| bic | String | The Bank Identifier Code (BIC) |
| iban | String | The International Bank Account Number (IBAN) |
| account_type | String | Type of account (INVESTMENT, SAVINGS, ...) |
| balance | float | Money balance in account |
| currency | String | Currency code (USD, EUR, GBP, JPY, ...) |
| created_date | datetime | Open account date |
| last_transaction_date | datetime | Last used date |
| bin | String | The bank identification number (BIN) |
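
A usage sketch for the BankAccount entity, again following the pattern of the other entity examples. The entity name BankAccount is assumed from the heading, and the name account_list is hypothetical.

```xml
<generate name="account_list" count="10" target="CSV">
  <!-- Assumption: the entity name matches the heading "BankAccount" -->
  <variable name="account" entity="BankAccount" />
  <key name="account_number" script="account.account_number" />
  <key name="iban" script="account.iban" />
  <key name="bic" script="account.bic" />
  <key name="bank_name" script="account.bank_name" />
  <key name="account_type" script="account.account_type" />
  <!-- balance and currency belong together when reporting amounts -->
  <key name="balance" script="account.balance" />
  <key name="currency" script="account.currency" />
</generate>
```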

CreditCard Entity

The generated CreditCard entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| card_number | String | The credit card number |
| card_holder | String | The name of the cardholder |
| cvc_number | String | The card verification code (CVC) number |
| expiration_date | String | The expiration date of the credit card |
| card_type | String | The type of the credit card (e.g., Visa, MasterCard) |
| card_provider | String | The provider of the credit card (bank's name) |
| cvv | String | CVV number of the credit card |
| is_active | bool | The status of the credit card |
| credit_limit | float | The maximum amount granted by the credit card |
| current_balance | float | The total amount owed at a specific point in time |
| issue_date | datetime | The date the card was first issued |
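
A usage sketch for the CreditCard entity. As with the other finance entities, the entity name CreditCard is assumed from the heading, and the name card_list is made up for illustration.

```xml
<generate name="card_list" count="10" target="CSV">
  <!-- Assumption: the entity name matches the heading "CreditCard" -->
  <variable name="card" entity="CreditCard" />
  <key name="card_number" script="card.card_number" />
  <key name="card_holder" script="card.card_holder" />
  <key name="card_type" script="card.card_type" />
  <key name="expiration_date" script="card.expiration_date" />
  <key name="cvv" script="card.cvv" />
  <key name="credit_limit" script="card.credit_limit" />
  <key name="is_active" script="card.is_active" />
</generate>
```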

Ecommerce

  • EANGenerator: Generates both 8-digit and 13-digit EAN codes

  • Product Entity: Generates products including product IDs, names, descriptions, prices, categories, and other product attributes

  • Order Entity: Generates orders of purchased products
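
The EANGenerator can presumably be used like the DomainGenerator in the Net section. The sketch below assumes it is referenced by name through the generator attribute. How to choose between 8-digit and 13-digit codes is not covered here, so the example relies on the default behavior. The names ean_list and ean are hypothetical.

```xml
<generate name="ean_list" count="10" target="CSV">
  <!-- Assumption: referenced by name, with default settings -->
  <key name="ean" generator="EANGenerator" />
</generate>
```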

Product Entity

You can use the Product entity like this:

<generate name="product_list" count="10" target="CSV">
  <variable name="product" entity="Product" />
  <key name="product_id" script="product.product_id" />
  <key name="category" script="product.category" />
  <key name="brand" script="product.brand" />
  <key name="name" script="product.name" />
  <key name="rating" script="product.rating" />
  <key name="tags" script="product.tags" />
</generate>

The generated Product entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| product_id | String | A unique product identifier (e.g., PROD3A1B2C4D) |
| category | String | The product's category |
| brand | String | The brand of the product |
| name | String | A generated product name combining brand, adjective, and noun |
| description | String | A detailed description with features, category, and benefits |
| price | float | The price of the product |
| sku | String | SKU in the format BRN-CAT-XXXXXX |
| condition | String | Product condition (e.g., NEW, USED) |
| availability | String | Availability status (e.g., IN_STOCK, OUT_OF_STOCK) |
| currency | String | Currency code (e.g., USD, EUR) |
| weight | float | Weight in kilograms |
| dimensions | String | Dimensions in the format L x W x H cm |
| color | String | Color of the product |
| rating | float | Rating between 1.0 and 5.0 (supports half-star ratings) |
| tags | list | Relevant tags based on category, brand, condition, and other factors |

Order Entity

You can use the Order entity like this:

<generate name="order_list" count="10" target="CSV">
  <variable name="order" entity="Order" />
  <key name="order_id" script="order.order_id" />
  <key name="user_id" script="order.user_id" />
  <key name="date" script="order.date" />
  <key name="status" script="order.status" />
  <key name="total_amount" script="order.total_amount" />
</generate>

The generated Order entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| order_id | String | Unique order ID (e.g., ORD1A2B3C4) |
| user_id | String | Unique user ID who placed the order |
| product_list | list | A list of Product instances in the order |
| date | datetime | Order date (randomly within the last 12 months) |
| status | String | Order status (e.g., PENDING, SHIPPED, DELIVERED) |
| payment_method | String | Payment method used (e.g., CREDIT_CARD, PAYPAL) |
| shipping_method | String | Shipping type selected (e.g., STANDARD, EXPRESS) |
| shipping_address | Address | Shipping address generated for the order |
| billing_address | Address | Billing address (same as shipping in 80% of cases) |
| currency | String | Currency code used in the transaction (e.g., USD) |
| tax_amount | float | Tax applied (5% to 12% of subtotal) |
| shipping_amount | float | Shipping cost based on method |
| discount_amount | float | Discount applied (0–25% chance, rate 5% to 25%) |
| coupon_code | String | A coupon code if a discount was applied |
| notes | String | Optional delivery notes (e.g., "Leave at the door") |
| total_amount | float | Final order total (subtotal + tax + shipping − discount) |
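
Because shipping_address and billing_address are of type Address, their fields can presumably be reached by attribute access inside a script. The sketch below assumes that these values expose the fields listed for the Address entity and that script expressions allow chained attribute access. The name order_details is made up for illustration.

```xml
<generate name="order_details" count="10" target="CSV">
  <variable name="order" entity="Order" />
  <key name="order_id" script="order.order_id" />
  <!-- Assumption: nested Address fields are reachable via attribute chains -->
  <key name="shipping_city" script="order.shipping_address.city" />
  <key name="billing_city" script="order.billing_address.city" />
  <key name="payment_method" script="order.payment_method" />
  <key name="shipping_method" script="order.shipping_method" />
  <key name="total_amount" script="order.total_amount" />
</generate>
```

According to the table above, shipping_city and billing_city should match in about 80% of the generated rows.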

Healthcare

  • Hospital Entity: Entity for data around a hospital.

  • MedicalProcedure Entity: Entity for data around a medical procedure.

  • MedicalDevice Entity: Entity for data around a medical device.

  • Doctor Entity: Entity for data around a doctor.

  • Patient Entity: Entity for data around a patient.

Hospital Entity

The generated Hospital entities have the following data fields:

| Property | Type | Description |
|---|---|---|
| hospital_id | String | A unique identifier for the hospital (e.g., HOSP-AB12CD34) |
| name | String | Name of the hospital, generated based on city and state |
| type | String | Type of hospital (General, Specialty, Teaching, etc.) |
| departments | list | List of departments in the hospital |
| services | list | List of services offered by the hospital |
| bed_count | int | Number of beds available, based on hospital type |
| staff_count | int | Total staff members, calculated from bed count and a staff ratio |
| founding_year | int | Year the hospital was founded (within the last 150 years) |
| accreditation | list | List of accreditations the hospital holds |
| emergency_services | bool | Indicates if the hospital provides emergency services |
| teaching_status | bool | Indicates if the hospital is a teaching hospital |
| website | String | Generated website URL based on hospital name and dataset |
| phone | String | Hospital's phone number |
| email | String | Email address derived from the website domain |

Example usage:

<generate name="hospital_data" count="10" target="CSV">
  <variable name="hospital" entity="Hospital" />
  <key name="id" script="hospital.hospital_id" />
  <key name="name" script="hospital.name" />
  <key name="type" script="hospital.type" />
  <key name="departments" script="hospital.departments" />
  <key name="website" script="hospital.website" />
  <key name="email" script="hospital.email" />
</generate>

MedicalProcedure Entity

The generated MedicalProcedure entities have the following data fields:

| Property | Type | Description |
|---|---|---|
| procedure_id | String | A unique identifier for the procedure (e.g., PROC-A1B2C3D4) |
| procedure_code | String | A generated procedure code |
| cpt_code | String | A random numeric CPT (Current Procedural Terminology) code |
| name | String | Name of the medical procedure, generated contextually |
| category | String | General category of the procedure (e.g., surgical, diagnostic) |
| description | String | A full description based on the procedure’s attributes |
| specialty | String | Medical specialty associated with the procedure (e.g., cardiology) |
| duration_minutes | int | Estimated duration of the procedure in minutes |
| cost | float | Estimated cost of the procedure in local currency |
| requires_anesthesia | bool | Whether the procedure requires anesthesia |
| is_surgical | bool | Whether the procedure is surgical |
| is_diagnostic | bool | Whether the procedure is diagnostic in nature |
| is_preventive | bool | Whether the procedure is preventive |
| recovery_time_days | int | Recovery time required post-procedure, in days |

Example usage:

<generate name="procedures" count="50" target="CSV">
  <variable name="procedure" entity="MedicalProcedure" />
  <key name="id" script="procedure.procedure_id" />
  <key name="code" script="procedure.procedure_code" />
  <key name="cpt" script="procedure.cpt_code" />
  <key name="name" script="procedure.name" />
  <key name="category" script="procedure.category" />
</generate>

MedicalDevice Entity

The generated MedicalDevice entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| device_id | String | A unique identifier for the medical device. |
| device_type | String | The type/category of the medical device. |
| manufacturer | String | The name of the manufacturer of the device. |
| model_number | String | The model number assigned to the device. |
| serial_number | String | The serial number of the device. |
| manufacture_date | String | The date when the device was manufactured. |
| expiration_date | String | The expiration date of the device. |
| last_maintenance_date | String | The date when the device was last maintained. |
| next_maintenance_date | String | The scheduled date for the next maintenance. |
| status | String | The current operational status of the device. |
| location | String | The location of the device within the facility. |
| assigned_to | String | The name of the person to whom the device is assigned. |
| specifications | dict | Key-value pairs detailing technical specifications. |
| usage_logs | list | A list of usage log entries. |
| maintenance_history | list | A list of past maintenance actions and their corresponding dates. |

Example usage:

<generate name="medical_device_list" count="10" target="CSV">
  <variable name="device" entity="MedicalDevice" />
  <key name="device_id" script="device.device_id" />
  <key name="device_type" script="device.device_type" />
  <key name="manufacturer" script="device.manufacturer" />
  <array name="usage_logs" script="device.usage_logs" />
  <array name="maintenance_history" script="device.maintenance_history" />
</generate>

Doctor Entity

The generated Doctor entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| doctor_id | String | A unique identifier for the doctor (e.g., DOC-XXXXXXXX). |
| npi_number | String | A 10-digit National Provider Identifier number. |
| license_number | String | A formatted medical license number (e.g., AB-123456). |
| given_name | String | The doctor's first name. |
| family_name | String | The doctor's last name. |
| full_name | String | The full name (first + last) of the doctor. |
| gender | String | The doctor's gender. |
| birthdate | datetime | The doctor's date of birth. |
| age | Integer | The doctor's age in years. |
| specialty | String | The medical specialty of the doctor (e.g., Cardiology, Pediatrics). |
| medical_school | String | The name of the medical school attended by the doctor. |
| graduation_year | int | The year the doctor graduated from medical school. |
| years_of_experience | int | The number of years the doctor has been practicing. |
| certifications | list | A list of the doctor’s medical certifications. |
| accepting_new_patients | bool | Indicates if the doctor is currently accepting new patients. |
| email | String | The doctor's email address. |
| phone | String | The doctor's phone number. |

Example usage:

<generate name="doctor_list" count="10" target="CSV">
  <variable name="doctor" entity="Doctor" />
  <key name="doctor_id" script="doctor.doctor_id" />
  <key name="full_name" script="doctor.full_name" />
  <key name="specialty" script="doctor.specialty" />
  <key name="phone" script="doctor.phone" />
</generate>

Patient Entity

The generated Patient entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| patient_id | String | A unique identifier for the patient (e.g., PAT-XXXXXXXX). |
| medical_record_number | String | A unique medical record number (e.g., MRN-XXXXXXXX). |
| ssn | String | Social Security Number in the format XXX-XX-XXXX. |
| given_name | String | The patient’s first name. |
| family_name | String | The patient’s last name. |
| full_name | String | The full name (first + last) of the patient. |
| gender | String | The patient's gender. |
| birthdate | datetime | The patient’s date of birth. |
| age | int | The patient’s age in years. |
| blood_type | String | The patient’s blood type (e.g., A+, O-). |
| height_cm | float | The patient’s height in centimeters. |
| weight_kg | float | The patient’s weight in kilograms. |
| bmi | float | The patient’s Body Mass Index. |
| allergies | list | A list of the patient’s known allergies. |
| medications | list | A list of medications the patient is taking. |
| conditions | list | A list of medical conditions diagnosed in the patient. |
| emergency_contact | dict | A dictionary of emergency contact info (name, phone, relationship). |
| insurance_provider | String | The patient’s insurance provider name. |
| insurance_policy_number | String | The patient’s insurance policy number (e.g., ABC-12345678). |

Example usage:

<generate name="patient_list" count="10" target="CSV">
  <variable name="patient" entity="Patient" />
  <key name="patient_id" script="patient.patient_id" />
  <key name="full_name" script="patient.full_name" />
  <array name="allergies" script="patient.allergies" />
  <nestedKey name="emergency_contact" script="patient.emergency_contact"/>
</generate>

Insurance

  • InsuranceCompany Entity: Entity for data around an insurance company.

  • InsuranceProduct Entity: Entity for data around an insurance product.

  • InsuranceCoverage Entity: Entity for data around an insurance coverage.

  • InsurancePolicy Entity: Entity for data around an insurance policy.

InsuranceCompany Entity

The generated InsuranceCompany entities have the following data fields:

| Property Name | Type | Property Description |
|---|---|---|
| id | String | A unique identifier for the insurance company. |
| name | String | The official name of the insurance company. |
| code | String | A unique code assigned to the insurance company. |
| founded_year | String | The year the insurance company was founded. |
| headquarters | String | The location of the company's headquarters. |
| website | String | The official website of the insurance company. |

Example usage:

<generate name="insurance_company_list" count="10" target="CSV">
  <variable name="insurance_company" entity="InsuranceCompany" />
  <key name="id" script="insurance_company.id" />
  <key name="name" script="insurance_company.name" />
  <key name="code" script="insurance_company.code" />
  <key name="website" script="insurance_company.website"/>
</generate>

InsuranceProduct Entity

The generated InsuranceProduct entities have the following data fields:

| Property Name | Type | Description |
|---|---|---|
| id | String | A unique identifier for the insurance product. |
| type | String | The type/category of the insurance product (e.g., health, auto, life). |
| code | String | A unique code assigned to the insurance product. |
| description | String | A detailed description of the insurance product. |

Example usage:

<generate name="insurance_product_list" count="10" target="CSV">
  <variable name="insurance_product" entity="InsuranceProduct" />
  <key name="id" script="insurance_product.id" />
  <key name="type" script="insurance_product.type" />
  <key name="code" script="insurance_product.code" />
  <key name="description" script="insurance_product.description"/>
</generate>

InsuranceCoverage Entity

The generated InsuranceCoverage entities have the following data fields:

| Property Name | Type | Description |
|---|---|---|
| name | String | The name of the coverage (e.g., "Medical Expenses", "Accident Coverage"). |
| code | String | A unique identifier for the coverage. |
| product_code | String | The insurance product's code associated with this coverage. |
| description | String | A detailed description of the coverage. |
| min_coverage | String | The minimum coverage amount offered. |
| max_coverage | String | The maximum coverage amount available. |

Example usage:

<generate name="insurance_coverage_list" count="10" target="CSV">
  <variable name="insurance_coverage" entity="InsuranceCoverage" />
  <key name="name" script="insurance_coverage.name" />
  <key name="code" script="insurance_coverage.code" />
  <key name="product_code" script="insurance_coverage.product_code" />
  <key name="description" script="insurance_coverage.description" />
  <key name="min_coverage" script="insurance_coverage.min_coverage" />
  <key name="max_coverage" script="insurance_coverage.max_coverage" />
</generate>

InsurancePolicy Entity

The generated InsurancePolicy entities have the following data fields:

| Property Name | Type | Description |
|---|---|---|
| id | String | A unique identifier for the insurance policy. |
| company | InsuranceCompany | The insurance company issuing the policy. |
| product | InsuranceProduct | The insurance product associated with this policy. |
| policy_holder | Person | The person holding the insurance policy. |
| premium | float | The premium amount for the policy. |
| premium_frequency | String | The frequency of premium payments (e.g., "monthly", "yearly"). |
| start_date | date | The policy's start date. |
| end_date | date | The policy's end date. |
| status | String | The policy status ("active", "inactive", or "cancelled"). |
| created_date | datetime | The date and time when the policy was created. |

Example usage:

<generate name="insurance_policy_list" count="10" target="CSV">
  <variable name="insurance_policy" entity="InsurancePolicy" />
  <key name="id" script="insurance_policy.id" />
  <key name="premium" script="insurance_policy.premium" />
  <key name="premium_frequency" script="insurance_policy.premium_frequency" />
  <key name="status" script="insurance_policy.status" />
</generate>
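
The company, product and policy_holder fields are themselves entities. The following is a minimal sketch of exporting nested values. It assumes that the script attribute supports dotted access into nested entities, and that policy_holder exposes the given_name and family_name fields described for the Person entity. Neither assumption has been verified against a specific DATAMIMIC release.

```xml
<generate name="insurance_policy_holders" count="10" target="CSV">
  <variable name="insurance_policy" entity="InsurancePolicy" />
  <key name="id" script="insurance_policy.id" />
  <!-- Assumption: fields of a nested entity are reachable via dotted access -->
  <key name="holder_given_name" script="insurance_policy.policy_holder.given_name" />
  <key name="holder_family_name" script="insurance_policy.policy_holder.family_name" />
  <key name="start_date" script="insurance_policy.start_date" />
  <key name="end_date" script="insurance_policy.end_date" />
</generate>
```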

Public Sector

  • AdministrationOffice Entity: Generates data about administration offices.

  • EducationalInstitution Entity: Generates data about educational institutions.

  • PoliceOfficer Entity: Generates data about police officers.

AdministrationOffice

The generated AdministrationOffice entities have the following data fields:

Property Name Type Description
office_id String A unique identifier for the administration office.
address Address The physical address of the office.
name String The official name of the office.
type String The type of administration office (e.g., municipal, federal).
jurisdiction String The jurisdiction the office operates under.
founding_year int The year the office was established.
staff_count int The number of staff members working at the office.
annual_budget float The annual budget allocated to the office (in dollars).
hours_of_operation dict Office working hours mapped by day (e.g., "Monday": "9 AM - 5 PM").
website String The official website of the office.
email String The contact email address of the office.
phone String The contact phone number of the office.
services list A list of public services provided by the office.
departments list A list of departments within the office.
leadership dict Key leadership positions mapped to names (e.g., "Director": "John Doe").

Example usage:

<generate name="administration_office_list" count="10" target="CSV">
  <variable name="administration_office" entity="AdministrationOffice" />
  <key name="office_id" script="administration_office.office_id" />
  <key name="name" script="administration_office.name" />
  <key name="website" script="administration_office.website" />
  <key name="email" script="administration_office.email" />
  <key name="phone" script="administration_office.phone" />
</generate>

EducationalInstitution

The generated EducationalInstitution entities have the following data fields:

Property Name Type Description
institution_id String A unique identifier for the institution.
name String The name of the institution, generated based on location and type.
type String The type of institution (e.g., Public School, University, College).
level String The education level (e.g., Elementary, High School, Undergraduate).
founding_year int The year the institution was founded.
student_count int The number of students enrolled.
staff_count int The number of staff members.
website String The institution's website URL.
email String The institution's official email address.
phone String The institution's phone number.
programs list A list of educational programs offered.
accreditations list A list of accreditations the institution has received.
facilities list A list of available facilities at the institution.
address Address The institution's physical address.

Example usage:

<generate name="educational_institution_list" count="10" target="CSV">
  <variable name="institution" entity="EducationalInstitution" />
  <key name="institution_id" script="institution.institution_id" />
  <key name="name" script="institution.name" />
  <key name="founding_year" script="institution.founding_year" />
  <key name="website" script="institution.website" />
  <key name="email" script="institution.email" />
  <key name="phone" script="institution.phone" />
</generate>

PoliceOfficer

The generated PoliceOfficer entities have the following data fields:

Property Name Type Description
officer_id String Unique identifier for the police officer.
badge_number String Unique badge number assigned to the officer.
given_name String First name of the officer.
family_name String Last name of the officer.
full_name String Full name of the officer.
gender String Gender of the officer.
birthdate String Birthdate of the officer in YYYY-MM-DD format.
age int Age of the officer in years.
rank String Rank of the officer.
department String Department where the officer works.
unit String Specific unit assigned within the department.
hire_date String Date the officer was hired in YYYY-MM-DD format.
years_of_service int Number of years the officer has served.
certifications list Certifications held by the officer.
languages list Languages spoken by the officer.
shift String Officer’s assigned shift schedule.
email String Officer’s official email address.
phone String Contact phone number of the officer.
address Address Address details of the officer.

Example usage:

<generate name="police_officer_list" count="10" target="CSV">
  <variable name="police_officer" entity="PoliceOfficer" />
  <key name="officer_id" script="police_officer.officer_id" />
  <key name="badge_number" script="police_officer.badge_number" />
  <key name="department" script="police_officer.department" />
  <key name="years_of_service" script="police_officer.years_of_service" />
  <array name="certifications" script="police_officer.certifications" />
  <array name="languages" script="police_officer.languages" />
  <key name="shift" script="police_officer.shift" />
</generate>

BR

Provides objects specific to Brazil:

  • CNPJGenerator: Generates CNPJs (Cadastro Nacional da Pessoa Jurídica)

  • CPFGenerator: Generates CPFs (Cadastro de Pessoas Físicas)
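
A minimal sketch of using both generators in a model. It assumes they can be referenced by class name in the generator attribute without constructor arguments; the names br_identifiers, cnpj and cpf are illustrative only.

```xml
<generate name="br_identifiers" count="10" target="CSV">
  <!-- Assumption: both generators work without constructor arguments -->
  <key name="cnpj" generator="CNPJGenerator" />
  <key name="cpf" generator="CPFGenerator" />
</generate>
```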

US

Provides objects specific to the United States of America:

  • SSNGenerator: Generates Social Security Numbers
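
A minimal sketch of generating Social Security Numbers. It assumes SSNGenerator can be referenced by class name without constructor arguments; the names us_identifiers and ssn are illustrative only.

```xml
<generate name="us_identifiers" count="10" target="CSV">
  <!-- Assumption: SSNGenerator works without constructor arguments -->
  <key name="ssn" generator="SSNGenerator" />
</generate>
```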

Faker

The faker package provides a generator backed by the Python Faker library.

  • DataFakerGenerator: Generates data for many topics, such as bank, color, currency, file and geo.

Because this generator covers many topics, each with many properties, you must pass the Faker provider name as a parameter in the generator attribute, e.g. generator="DataFakerGenerator('faker_provider_name')".

Optionally, you can also set the locale, e.g. generator="DataFakerGenerator('faker_provider_name', locale='de_AT')".

You can use the DataFakerGenerator like this:

<generate name="faker_sample" count="5" target="CSV">
  <key name="job" generator="DataFakerGenerator('job', locale='en_US')" />
  <key name="name" generator="DataFakerGenerator('user_agent', locale='en_US')" />
  <key name="address" generator="DataFakerGenerator('hostname', locale='en_US')" />
</generate>

to get output similar to this:

job|name|address
Legal secretary|Mozilla/5.0 (compatible; MSIE 5.0; Windows NT 6.1; Trident/3.1)|lt-82.wilson-robbins.biz
Engineer, production|Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_6 rv:3.0; ka-GE) AppleWebKit/532.36.2 (KHTML, like Gecko) Version/4.0 Safari/532.36.2|db-21.gonzalez.com
Art therapist|Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/534.0 (KHTML, like Gecko) Chrome/50.0.825.0 Safari/534.0|db-99.weeks-diaz.net
Fast food restaurant manager|Mozilla/5.0 (Macintosh; PPC Mac OS X 10_12_8 rv:5.0; raj-IN) AppleWebKit/532.38.5 (KHTML, like Gecko) Version/4.1 Safari/532.38.5|db-12.martin.com
Publishing rights manager|Mozilla/5.0 (Linux; Android 4.3.1) AppleWebKit/531.0 (KHTML, like Gecko) Chrome/58.0.897.0 Safari/531.0|email-63.price.com

Supported topics:

DataFakerGenerator can generate data for many topics. For the full list of available providers, see the Faker Docs.

Supported Locales

The supported locales vary with the data being generated; not every locale is available for every dataset.

Language code
Bulgarian bg
Catalan ca, ca_CAT
Danish da_DK
German de, de_AT, de_CH
English en, en_AU, en_au_ocker, en_BORK, en_CA, en_GB, en_IND, en_MS, en_NEP, en_NG, en_NZ, en_PAK, en_SG, en_UG, en_US, en_ZA
Spanish es, es_MX
Finnish fi_FI
French fr
Hungarian hu
Indonesian in_ID
Italian it
Japanese ja
Korean ko
Norwegian Bokmål nb_NO
Dutch nl
Polish pl
Portuguese pt, pt_BR
Russian ru
Slovak sk
Swedish sv, sv_SE
Turkish tr
Ukrainian uk
Vietnamese vi
Chinese zh_CN, zh_TW
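
A minimal sketch of choosing a locale from the table above. It reuses the 'job' provider and the de_AT locale that appear earlier in this section, and assumes that the Faker methods name and city can be passed in the same way. Whether a given provider supports a given locale depends on the Faker library.

```xml
<generate name="faker_locale_sample" count="5" target="CSV">
  <key name="job" generator="DataFakerGenerator('job', locale='de_AT')" />
  <!-- Assumption: 'name' and 'city' are accepted like 'job' above -->
  <key name="full_name" generator="DataFakerGenerator('name', locale='de_AT')" />
  <key name="city" generator="DataFakerGenerator('city', locale='de_AT')" />
</generate>
```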