Auto-Generate Model from Database

Generating a DATAMIMIC model from a database enables you to quickly create data generation models based on your existing database schema. Follow the steps below to achieve this.

Steps

1. Ensure you have a project

Start with an existing project or create a new one (e.g., an empty project).

2. Add a new file in the Models section

Click the plus icon to add a new file in the Models section of your project.

3. Select "Model from Database" and input a file name

In the File Wizard, choose the option "Model from Database", provide a name for your new DATAMIMIC model file (for example, customer_table), and click Next.

4. Select the data set you want to generate from the database

Full Set: Choose this option to generate the model from the entire data set available in the chosen database. You then proceed to the next step, where you provide a file name for your DATAMIMIC model.
Sub Set: Choose this option to work with a subset of the data. You are redirected to another view where you can select the specific data sets and columns to include. Learn more about selecting subsets of data.

4.1 Selecting the Full Set of the Database

Configure Source and/or Target Database:
- Choose the database type for your source and target databases. Make sure these databases have already been created in Environment Settings.
  
Review and Confirm:
- Review the settings in the confirmation step and click Create to generate the DATAMIMIC model.
Review Created Model:
- A new DATAMIMIC model is created to generate the data from the specified database.
- Automatic PII Detection: During model generation, DATAMIMIC automatically scans your database metadata for fields that are likely to contain Personally Identifiable Information (PII), such as names, emails, phone numbers, addresses, and other sensitive data. These fields are replaced with the special marker #SENSITIVE in the generated model. This ensures that sensitive data is masked by default, helping you comply with data privacy regulations (such as GDPR, HIPAA, and others) and best practices for data protection.
- Why PII Detection Matters: Protecting PII helps organizations avoid data breaches, maintain customer trust, and meet regulatory requirements. DATAMIMIC's built-in PII detection and masking saves time and reduces the risk that sensitive fields are exposed in your synthetic data workflows. Because detection is based on database metadata, review the marked fields and adjust them as needed before generating or exporting your synthetic data.
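
To make the idea of metadata-based PII detection concrete, the following is a minimal Python sketch of how column names could be matched against name patterns and replaced with the #SENSITIVE marker. It is an illustration only, not DATAMIMIC's actual detection logic: the pattern list, the `mask_pii_columns` function, and the sample column names are all assumptions made for this example.

```python
import re

# Hypothetical name patterns for this sketch; the rules DATAMIMIC
# really applies are not described in this guide.
PII_PATTERNS = ["name", "e?mail", "phone", "address", "birth", "ssn"]
PII_REGEX = re.compile("|".join(PII_PATTERNS), re.IGNORECASE)

# The marker that this guide says is written into the generated model.
SENSITIVE_MARKER = "#SENSITIVE"


def mask_pii_columns(columns):
    """Map each column name to the marker if it looks like PII, else to None."""
    return {
        column: SENSITIVE_MARKER if PII_REGEX.search(column) else None
        for column in columns
    }


# Example metadata for an assumed customer table.
masked = mask_pii_columns(
    ["id", "first_name", "email", "phone_number", "created_at"]
)
for column, marker in masked.items():
    print(f"{column}: {marker or 'kept as-is'}")
```

Name-based heuristics like this can produce false positives (for example, a column called file_name) and can miss sensitive columns with unusual names, which is one reason to review the marked fields in the generated model.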

4.2 Selecting a Subset of the Database

When you choose to generate a model from a subset of your database, you are directed to the Database View. This view gives you a detailed overview of your database environment, allowing for precise customization of your data selection and model generation. Learn more about the DATAMIMIC Database View.

Follow these steps to generate a model from a database subset:

Select Environment and Columns:
- From the database environments you created in the project, select the one you want to generate the model from.
- Choose the specific columns to include in your DATAMIMIC model. This helps you focus on relevant data.
Customize (Optional):
- Select Scripts: Choose scripts to process the selected columns.
- Choose Generators: Apply custom generators to the selected columns for optimized data generation.
Create the Model File:
- Click the Create File button to proceed.
- In the dialog, enter a name for your new model file (e.g., customer_table) and click Next.
Configure Target Database (Optional):
- Select the database type for your target database. Ensure the database has been configured in Environment Settings.
Review and Confirm:
- Review your settings in the confirmation step.
- Click Create to generate the DATAMIMIC model.
Review Created Model:
- A new DATAMIMIC model file is created based on your selections from the database.
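
Conceptually, the environment and column selection in the steps above amounts to reading a schema and keeping only the chosen tables and columns. The Python sketch below illustrates that idea with an in-memory SQLite database. It does not use or reflect any DATAMIMIC API: the `read_schema` and `select_subset` helpers and the customer and orders tables are assumptions made for this example.

```python
import sqlite3


def read_schema(conn):
    """Return {table: [column, ...]} for every user table in a SQLite database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        rows = conn.execute(f'PRAGMA table_info("{table}")')
        schema[table] = [row[1] for row in rows]
    return schema


def select_subset(schema, selection):
    """Keep only the selected tables and columns, rejecting unknown names."""
    subset = {}
    for table, columns in selection.items():
        if table not in schema:
            raise KeyError(f"unknown table: {table}")
        unknown = [c for c in columns if c not in schema[table]]
        if unknown:
            raise KeyError(f"unknown columns in {table}: {unknown}")
        subset[table] = list(columns)
    return subset


# Assumed example database standing in for a configured environment.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT, notes TEXT)"
)
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

schema = read_schema(conn)
subset = select_subset(schema, {"customer": ["id", "email"]})
print(subset)
```

Validating the selection against the schema before building anything mirrors what the Database View does visually: only tables and columns that exist in the chosen environment can be included.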

This generated model can be further refined. You can adjust it to use DATAMIMIC generators for dynamic values, allowing for more complex and tailored data generation.