Auto-Generate Model from Database

Database View is the central workspace for creating model and weighting artifacts from scanned database metadata.

As of 3.2.0, the flow is organized in the Database workbench tabs (Planning, Weighting, Schema history) with dedicated create actions.

Steps

1. Ensure you have a project

Start with an existing project or create a new one (for example, an empty project).

2. Configure a database environment

  • Open Settings → Environments.
  • Add or edit your database environment.
  • Ensure credentials and connectivity are valid.

3. Scan metadata

  • Run Scan Metadata for the selected environment.
  • If metadata is outdated, run Reset Metadata and scan again.
  • Metadata scan includes all accessible schemas for the selected connection.

4. Open Database View and plan your subset

In Database View, work from left to right:

  • Zone 1 - Scope: choose environment, schema filter, and tables.
  • Columns panel: review and select columns.
  • Database workbench → Planning: run Plan subset, validate dependency closure, and check subset preflight.

[Figure: Database View: Zone 1 scope, subset planning, and dependency preflight]
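To make "dependency closure" concrete, here is a minimal Python sketch of the idea: starting from the selected tables, follow foreign keys to parent tables until nothing new is found. It is not DATAMIMIC's implementation. The `FK_PARENTS` mapping, the table names, and both function names are hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical foreign-key map: child table -> tables it references.
# In practice this information would come from the scanned metadata.
FK_PARENTS = {
    "sales.order_line": ["sales.order_header", "sales.product"],
    "sales.order_header": ["sales.customer"],
    "sales.product": [],
    "sales.customer": [],
}


def dependency_closure(selected, fk_parents):
    """Return the selected tables plus every table they reference, transitively."""
    closure = set(selected)
    queue = deque(selected)
    while queue:
        table = queue.popleft()
        for parent in fk_parents.get(table, ()):
            if parent not in closure:
                closure.add(parent)
                queue.append(parent)
    return closure


def missing_dependencies(selected, fk_parents):
    """List tables that the selection needs but does not yet include."""
    return sorted(dependency_closure(selected, fk_parents) - set(selected))


if __name__ == "__main__":
    print(missing_dependencies(["sales.order_line"], FK_PARENTS))
```

A subset whose `missing_dependencies` result is empty is closed under its foreign keys, which is roughly what a preflight check would want to confirm before generating artifacts.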

5. Create model artifacts from the workbench

Use Generate artifacts in the Planning tab.

A. Create Synthetic model (.xml)

  1. Click Create Synthetic.
  2. Enter a model file name.
  3. Set relationship source (database-backed or generated weighting files under data/).
  4. Optionally enable schema-qualified generated names.
  5. Optionally enable SQL schema script export (.scr.sql) if your target setup requires it.

[Figure: Create Synthetic: naming, relationship source, and optional SQL schema script]

B. Create Anonymize model (.xml)

  1. Click Create Anonymize.
  2. Confirm source database and select target database.
  3. Optionally apply PII preselect before creation and set its threshold.
  4. Continue to naming, then create the model.

[Figure: Create Anonymize: source and target settings with optional PII preselect]

C. Create ML training model (.xml)

  1. Click Create ML.
  2. Confirm source.
  3. Enter file name and optionally enable schema-qualified generated names.
  4. Create the model artifact.
  5. Execute the generated DSL model so <ml-train> can train and persist ML generator versions.

[Figure: Create ML: model naming and generation options]

Next step: after the training run completes, validate generator versions and quality metrics in ML Generator View, then reuse approved models with source="ml://..." as described in ML Generator from Database Metadata.

6. Create weighting artifacts in the Weighting tab

Switch to Database workbench → Weighting:

  1. Select exactly one table and one or more columns in the metadata scope.
  2. Review selected columns in the Weighting tab.
  3. Click Create weighting.
  4. Configure sampling:
     • sample_size (default 1000)
     • sampling_mode (deterministic or fresh)
     • include_nulls (true / false)
  5. Set file name and create.

Output:

  • One selected column → .wgt.csv
  • Multiple selected columns → .wgt.ent.csv

[Figure: Weighting tab: selected weighting scope and create action]

[Figure: Weighting wizard: sampling options and naming]
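The sampling options and output naming can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's actual algorithm: it assumes that `deterministic` means a fixed random seed and `fresh` means a new random sample on each run, and that weights are relative frequencies of the sampled values. The function names and the `seed` parameter are hypothetical.

```python
import random
from collections import Counter


def build_weights(rows, columns, sample_size=1000,
                  sampling_mode="deterministic", include_nulls=False, seed=0):
    """Estimate relative frequencies of value combinations from a row sample.

    Assumption: "deterministic" reuses a fixed seed, "fresh" draws a new sample.
    """
    rng = random.Random(seed) if sampling_mode == "deterministic" else random.Random()
    sample = rows if len(rows) <= sample_size else rng.sample(rows, sample_size)

    counts = Counter()
    for row in sample:
        key = tuple(row.get(column) for column in columns)
        if not include_nulls and any(value is None for value in key):
            continue  # skip combinations that contain a NULL
        counts[key] += 1

    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()} if total else {}


def weighting_file_name(base, columns):
    """One column gives .wgt.csv, several columns give .wgt.ent.csv."""
    suffix = ".wgt.csv" if len(columns) == 1 else ".wgt.ent.csv"
    return base + suffix


if __name__ == "__main__":
    rows = [{"site_id": "A"}] * 3 + [{"site_id": "B"}, {"site_id": None}]
    print(build_weights(rows, ["site_id"]))
    print(weighting_file_name("site_pref", ["site_id", "region_code"]))
```

Keeping the seed fixed in deterministic mode makes repeated runs against unchanged data produce the same weights, which helps when generated files are kept under version control.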

7. Review schema drift in Schema history

Use Database workbench → Schema history to review detected metadata changes across snapshots before regenerating artifacts.

[Figure: Schema history: drift detection and snapshot comparison]
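Conceptually, drift detection is a comparison of two metadata snapshots. The Python sketch below shows one way to express that comparison. The snapshot shape (table name mapped to a column-to-type dictionary) and the function name are assumptions made for illustration; they do not describe how snapshots are stored internally.

```python
def diff_snapshots(old, new):
    """Compare two snapshots shaped as {table: {column: type}} (hypothetical shape)."""
    drift = {
        "added_tables": sorted(new.keys() - old.keys()),
        "removed_tables": sorted(old.keys() - new.keys()),
        "column_changes": {},
    }
    for table in sorted(old.keys() & new.keys()):
        old_cols, new_cols = old[table], new[table]
        change = {
            "added": sorted(new_cols.keys() - old_cols.keys()),
            "removed": sorted(old_cols.keys() - new_cols.keys()),
            "retyped": sorted(
                column
                for column in old_cols.keys() & new_cols.keys()
                if old_cols[column] != new_cols[column]
            ),
        }
        if any(change.values()):
            drift["column_changes"][table] = change  # record only tables that changed
    return drift


if __name__ == "__main__":
    before = {"sales.order_header": {"order_id": "int", "tenant_id": "int"}}
    after = {
        "sales.order_header": {"order_id": "bigint", "tenant_id": "int", "status": "text"},
        "sales.site": {"site_id": "int"},
    }
    print(diff_snapshots(before, after))
```

Any non-empty entry in the result is a signal to re-check the affected model or weighting artifacts before regenerating them.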

8. Review and refine created artifacts

  • Model flows create a new DATAMIMIC model file (.xml).
  • Weighting flow creates a weighting file (.wgt.csv or .wgt.ent.csv).
  • Continue refinement in the Editor (keys, generators, scripts, converters, and targets).

Typical <reference> elements generated by the database metadata flow:

Database-backed relationship source:

<reference name="fk_line_order" source="sourceDB" sourceType="sales.order_header">
    <field target="order_id" sourceKey="ORDER_ID"/>
    <field target="tenant_id" sourceKey="TENANT_ID"/>
</reference>

Weighting-file relationship source:

<reference name="fk_site" source="data/site_pref.wgt.ent.csv" weightColumn="sample_weight">
    <field target="site_ref" sourceKey="site_id"/>
    <field target="region_ref" sourceKey="region_code"/>
</reference>
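
As a rough mental model of a weighting-file reference, the Python sketch below picks one row from a weighted CSV and maps each sourceKey to its target field, so composite values stay consistent with each other. It is an assumption-laden illustration: the CSV layout, the sample rows, and the function names are hypothetical, and the authoritative <reference> semantics are those in the DATAMIMIC documentation.

```python
import csv
import io
import random

# Hypothetical file content; the real .wgt.ent.csv layout may differ.
WEIGHT_CSV = """site_id,region_code,sample_weight
S1,EU,0.6
S2,EU,0.3
S3,US,0.1
"""

# target field -> sourceKey, mirroring the <field> elements of fk_site.
FIELD_MAP = {"site_ref": "site_id", "region_ref": "region_code"}


def load_weighted_rows(text, weight_column):
    """Parse CSV text and return the rows with their numeric weights."""
    rows = list(csv.DictReader(io.StringIO(text)))
    weights = [float(row[weight_column]) for row in rows]
    return rows, weights


def pick_reference(rows, weights, field_map, rng):
    """Draw one whole row by weight, then map source keys to target fields."""
    row = rng.choices(rows, weights=weights, k=1)[0]
    return {target: row[source_key] for target, source_key in field_map.items()}


if __name__ == "__main__":
    rows, weights = load_weighted_rows(WEIGHT_CSV, "sample_weight")
    print(pick_reference(rows, weights, FIELD_MAP, random.Random(42)))
```

Drawing a whole row at once, rather than each column independently, is what keeps pairs such as site and region from being combined in ways that never occur in the source data.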

9. Set realistic expectations for recommendations

  • Database View recommendations are assistive and still improving; treat them as a starting point rather than a final result.
  • For complex schemas and production anonymization scenarios, expect manual adjustment of generated models.
  • Use project-specific review criteria to validate PII decisions, relationship handling, and business constraints before rollout.
  • Plan iterative hardening cycles as the metadata or source schema evolves.

For complete <reference> behavior and mapping semantics, see Data Definition Model - Advanced Elements.