Architecture

Schema uses a three-layer architecture to describe the structure of data.

Three-Layer Architecture

Level      Corresponding Data  Description
Metadata   Datasets            Manages multiple tables
Schema     Table               Defines single table
Attribute  Field               Describes single field

Architecture Relationships

Metadata (Dataset)
├── Schema (Table 1)
│   ├── Attribute (Field 1)
│   ├── Attribute (Field 2)
│   └── Attribute (Field 3)
├── Schema (Table 2)
│   ├── Attribute (Field 1)
│   └── Attribute (Field 2)
└── Schema (Table 3)
    └── Attribute (...)
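
The hierarchy above can be sketched with plain Python dataclasses. This is only an illustration of the nesting, not the library's actual classes: the class names mirror the three levels, while the field layout and the `render_tree` helper are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Attribute:
    """Field level: describes a single field (column)."""
    name: str
    type: str


@dataclass
class Schema:
    """Table level: defines a single table as a set of attributes."""
    id: str
    attributes: dict[str, Attribute] = field(default_factory=dict)


@dataclass
class Metadata:
    """Dataset level: manages one or more tables."""
    id: str
    schemas: dict[str, Schema] = field(default_factory=dict)


def render_tree(metadata: Metadata) -> list[str]:
    """Hypothetical helper: flatten the hierarchy into indented lines."""
    lines = [f"Metadata ({metadata.id})"]
    for schema in metadata.schemas.values():
        lines.append(f"  Schema ({schema.id})")
        for attr in schema.attributes.values():
            lines.append(f"    Attribute ({attr.name}: {attr.type})")
    return lines


# One dataset containing one table with two fields.
users = Schema(
    id="users",
    attributes={
        "user_id": Attribute("user_id", "int"),
        "name": Attribute("name", "str"),
    },
)
dataset = Metadata(id="my_dataset", schemas={"users": users})
tree_lines = render_tree(dataset)
print("\n".join(tree_lines))
```

Each level only holds references to the level directly below it, which is why a single-table dataset is simply a Metadata object with one Schema entry.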

Actual Data Mapping

Schema Level  Python Data Type      Example
Metadata      dict[str, DataFrame]  {'users': df1, 'orders': df2}
Schema        pd.DataFrame          pd.DataFrame(...)
Attribute     pd.Series             df['user_id']
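
To make the mapping concrete, here is a minimal sketch using pandas. The table contents and the `orders` columns are made-up sample data for illustration; only the three types (dict, DataFrame, Series) come from the mapping above.

```python
import pandas as pd

# Schema level: one table is one DataFrame (sample data, for illustration only).
users = pd.DataFrame({"user_id": [1, 2, 3], "name": ["Ann", "Bob", "Cy"]})
orders = pd.DataFrame({"order_id": [10, 11], "user_id": [1, 3]})

# Metadata level: a dataset is a dict that maps table names to DataFrames.
dataset: dict[str, pd.DataFrame] = {"users": users, "orders": orders}

# Attribute level: one field is one Series, i.e. a single column of a table.
user_id = dataset["users"]["user_id"]

# Print the Python type found at each level.
print(type(dataset).__name__)
print(type(dataset["users"]).__name__)
print(type(user_id).__name__)
```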

Usage

Single Table Scenario (Common)

Most cases involve only one table:

id: my_dataset
schemas:
  users:              # Single table
    id: users
    attributes:
      user_id:
        type: int
      name:
        type: str

Multi-Table Scenario

When a dataset contains multiple related tables:

id: ecommerce
schemas:
  users:              # First table
    id: users
    attributes: {...}
  orders:             # Second table
    id: orders
    attributes: {...}
  products:           # Third table
    id: products
    attributes: {...}
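
Once loaded, a configuration like the one above is a nested mapping, so it can be walked with ordinary dictionary operations. The sketch below assumes the configuration has already been parsed into a Python dict. The `list_fields` helper is hypothetical, the `users` attributes are borrowed from the single-table example, and the attributes of `orders` and `products` are left empty because the configuration elides them.

```python
# Python equivalent of the multi-table configuration (already parsed).
config = {
    "id": "ecommerce",
    "schemas": {
        "users": {
            "id": "users",
            # Borrowed from the single-table example, for illustration.
            "attributes": {"user_id": {"type": "int"}, "name": {"type": "str"}},
        },
        # Attribute definitions are elided in the configuration, so none are listed.
        "orders": {"id": "orders", "attributes": {}},
        "products": {"id": "products", "attributes": {}},
    },
}


def list_fields(config: dict) -> dict[str, list[str]]:
    """Hypothetical helper: map each table name to its declared field names."""
    return {
        table: list(schema.get("attributes", {}))
        for table, schema in config["schemas"].items()
    }


fields_by_table = list_fields(config)
print(fields_by_table)
```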

Important Notes

  • In practice, you typically only need to define the Schema and its Attributes
  • In single-table scenarios, the system creates the Metadata level automatically
  • For detailed configuration parameters, see the Attribute Parameters documentation