Model


The Model metadata schema extends the Processing schema and is tailored to model weights and the other data and code artifacts underlying machine learning models. A model may be trained on one dataset and evaluated on others, and may be intended to undergo further training in future versions.

New evaluations and training steps can therefore be appended as new model versions are produced. This metadata should be documented for any model that sees widespread internal use or public release, to facilitate model reuse and document provenance.

Core file

Model

Description of a machine learning model including architecture, training, and evaluation details

| Field | Type | Title (Description) |
| --- | --- | --- |
| name | str | Name |
| version | str | Version |
| example_run_code | Code | Example run code (Code to run the model, possibly including example parameters/data) |
| architecture | ModelArchitecture | architecture (Model architecture / type of model) |
| software_framework | Optional[Software] | Software framework |
| architecture_parameters | Optional[dict] | Architecture parameters (Parameters of model architecture, such as input signature or number of layers.) |
| intended_use | str | Intended model use (Semantic description of intended use) |
| limitations | Optional[str] | Model limitations |
| training | List[ModelTraining or ModelPretraining] | Training |
| evaluations | List[ModelEvaluation] | Evaluations |
| notes | Optional[str] | Notes |
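
As a rough illustration of how evaluations can be appended when a new model version is recorded, the sketch below uses standard-library dataclasses as simplified stand-ins for the schema classes. It is not the schema's own implementation: only a subset of the fields listed above is modelled, `architecture` is reduced to a plain string, and the helper `with_new_evaluation`, the model name, and the metric values are hypothetical placeholders.

```python
from dataclasses import dataclass, field, replace
from typing import Any, List, Optional


@dataclass
class PerformanceMetric:
    name: str
    value: Any


@dataclass
class ModelEvaluation:
    # Stand-in with only two of the ModelEvaluation fields.
    name: str
    performance: List[PerformanceMetric]


@dataclass
class Model:
    # Simplified stand-in: a subset of the core fields, not the real class.
    name: str
    version: str
    architecture: str  # assumption: plain string instead of ModelArchitecture
    intended_use: str
    limitations: Optional[str] = None
    evaluations: List[ModelEvaluation] = field(default_factory=list)
    notes: Optional[str] = None


def with_new_evaluation(model: Model, version: str, evaluation: ModelEvaluation) -> Model:
    """Return a copy of the record for a new version with one more evaluation."""
    return replace(model, version=version, evaluations=[*model.evaluations, evaluation])


# Placeholder values for illustration only.
v1 = Model(
    name="example-segmenter",
    version="0.1.0",
    architecture="U-Net",
    intended_use="Segmenting cell bodies in example imaging data",
    evaluations=[ModelEvaluation("held-out set A", [PerformanceMetric("f1", 0.5)])],
)
v2 = with_new_evaluation(
    v1, "0.2.0", ModelEvaluation("held-out set B", [PerformanceMetric("f1", 0.75)])
)
```

Building a new list rather than mutating `v1.evaluations` keeps the earlier version's record unchanged, which is one way to preserve provenance across versions.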

Model definitions

ModelEvaluation

Description of model evaluation

| Field | Type | Title (Description) |
| --- | --- | --- |
| process_type | ProcessName | |
| performance | List[PerformanceMetric] | Evaluation performance |
| name | str | Name (Unique name of the processing step. If not provided, the type will be used as the name.) |
| stage | ProcessStage | Processing stage |
| code | Code | Code (Code used for processing) |
| experimenters | List[str] | Experimenters (People responsible for processing) |
| pipeline_name | Optional[str] | Pipeline name (Pipeline names must exist in Processing.pipelines) |
| start_date_time | datetime (timezone-aware) | Start date time |
| end_date_time | Optional[datetime (timezone-aware)] | End date time |
| output_path | Optional[AssetPath] | Output path (Path to processing outputs, if stored.) |
| output_parameters | Optional[dict] | Outputs (Output parameters) |
| notes | Optional[str] | Notes |
| resources | Optional[ResourceUsage] | Process resource usage |
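
`start_date_time` and `end_date_time` are typed as timezone-aware datetimes. The sketch below shows one way to check that property with the standard library before filling in a record. The function names `is_timezone_aware` and `check_process_times` are hypothetical, and the schema's own validation is not shown here.

```python
from datetime import datetime, timezone
from typing import Optional


def is_timezone_aware(dt: datetime) -> bool:
    """True when dt carries a UTC offset (the standard library's notion of 'aware')."""
    return dt.tzinfo is not None and dt.utcoffset() is not None


def check_process_times(start: datetime, end: Optional[datetime] = None) -> None:
    """Raise ValueError if a supplied timestamp is naive; end may be omitted."""
    if not is_timezone_aware(start):
        raise ValueError("start_date_time must be timezone-aware")
    if end is not None and not is_timezone_aware(end):
        raise ValueError("end_date_time must be timezone-aware")


# An aware timestamp passes; a naive one (no tzinfo) is rejected.
aware_start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
naive_start = datetime(2024, 1, 1, 9, 0)
check_process_times(aware_start)
```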

ModelPretraining

Description of model pretraining

| Field | Type | Title (Description) |
| --- | --- | --- |
| source_url | str | Pretrained source URL (URL for pretrained weights) |
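
Because the core `training` field accepts either ModelTraining or ModelPretraining entries, a single list can record that a model started from pretrained weights and was then trained further. The sketch below uses dataclass stand-ins rather than the real classes; `TrainingStep`, `pretrained_sources`, the step name, and the URL are placeholders introduced for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class ModelPretraining:
    source_url: str


@dataclass
class ModelTraining:
    # Stand-in carrying only one of the ModelTraining fields.
    name: str


# Hypothetical alias for the union accepted by the `training` list.
TrainingStep = Union[ModelTraining, ModelPretraining]

training: List[TrainingStep] = [
    ModelPretraining(source_url="https://example.com/weights.pt"),  # placeholder URL
    ModelTraining(name="fine-tuning"),
]


def pretrained_sources(steps: List[TrainingStep]) -> List[str]:
    """Collect the source URLs of all pretraining entries, in order."""
    return [s.source_url for s in steps if isinstance(s, ModelPretraining)]
```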

ModelTraining

Description of model training

| Field | Type | Title (Description) |
| --- | --- | --- |
| process_type | ProcessName | |
| train_performance | List[PerformanceMetric] | Training performance (Performance on training set) |
| test_performance | Optional[List[PerformanceMetric]] | Test performance (Performance on test data, evaluated during training) |
| test_evaluation_method | Optional[str] | Test evaluation method (Approach to cross-validation or Train/test splitting) |
| name | str | Name (Unique name of the processing step. If not provided, the type will be used as the name.) |
| stage | ProcessStage | Processing stage |
| code | Code | Code (Code used for processing) |
| experimenters | List[str] | Experimenters (People responsible for processing) |
| pipeline_name | Optional[str] | Pipeline name (Pipeline names must exist in Processing.pipelines) |
| start_date_time | datetime (timezone-aware) | Start date time |
| end_date_time | Optional[datetime (timezone-aware)] | End date time |
| output_path | Optional[AssetPath] | Output path (Path to processing outputs, if stored.) |
| output_parameters | Optional[dict] | Outputs (Output parameters) |
| notes | Optional[str] | Notes |
| resources | Optional[ResourceUsage] | Process resource usage |
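
`train_performance` is required while `test_performance` is optional, so code that reads a training record needs to handle a missing test list. The sketch below pairs metrics by name under that assumption. `TrainingSummary` is a hypothetical stand-in holding only the three training-specific fields, `paired_metrics` is an illustrative helper, and the metric values are placeholders.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass
class PerformanceMetric:
    name: str
    value: Any


@dataclass
class TrainingSummary:
    # Hypothetical subset of ModelTraining, not the real class.
    train_performance: List[PerformanceMetric]
    test_performance: Optional[List[PerformanceMetric]] = None
    test_evaluation_method: Optional[str] = None


def paired_metrics(summary: TrainingSummary) -> Dict[str, Tuple[Any, Any]]:
    """Map each training metric name to (train value, test value or None)."""
    test = {m.name: m.value for m in (summary.test_performance or [])}
    return {m.name: (m.value, test.get(m.name)) for m in summary.train_performance}


# Placeholder values for illustration only.
summary = TrainingSummary(
    train_performance=[PerformanceMetric("accuracy", 0.9), PerformanceMetric("loss", 0.25)],
    test_performance=[PerformanceMetric("accuracy", 0.8)],
    test_evaluation_method="80/20 train/test split",
)
pairs = paired_metrics(summary)
```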

PerformanceMetric

Description of a performance metric

| Field | Type | Title (Description) |
| --- | --- | --- |
| name | str | Metric name |
| value | typing.Any | Metric value |
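
Since `value` is typed as `typing.Any`, a metric can hold a scalar, a nested list, or a mapping. The sketch below, again using a dataclass stand-in rather than the schema class, shows that such values survive a JSON round trip as long as they are built from JSON-compatible types. The metric names and numbers are placeholders.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class PerformanceMetric:
    name: str
    value: Any


# Placeholder metrics with three differently shaped values.
metrics = [
    PerformanceMetric("accuracy", 0.5),
    PerformanceMetric("confusion_matrix", [[3, 1], [0, 4]]),
    PerformanceMetric("per_class_recall", {"cell": 0.75, "background": 1.0}),
]

# Serialise to JSON text and read it back as plain dicts.
encoded = json.dumps([asdict(m) for m in metrics])
decoded = json.loads(encoded)
```

Values that are not JSON-compatible (for example, arbitrary Python objects) would need an explicit conversion before serialising.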