Quality control¶
Quality control is a collection of metrics evaluated on a data asset.
QCMetric objects should be generated during pipelines: from raw data, during processing, and during analysis by researchers.
Every QCMetric has an aind_data_schema.quality_control.State, which takes the value of the metric and compares it to some rule. Metrics can only pass or fail; metrics that require manual evaluation are set to pending until they are evaluated.
Details¶
Metrics¶
Each QCMetric is a single value or array of values that can be computed, or observed, about one modality in a data asset. These can have any type. Metrics should be significant, i.e. whether they pass or fail should matter for the modality. Metrics need to be human understandable. If you find yourself generating more than fifty metrics for a modality, you should group them together (i.e. make the value a dictionary combining similar metrics and the rule an evaluation of multiple fields in the dictionary).
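As a minimal sketch of the grouping advice above, the snippet below folds several related measurements into one dictionary value and evaluates a single rule over its fields. The field names, the thresholds, and the helper evaluate_noise_rule are hypothetical and are not part of aind-data-schema.

```python
# Hypothetical grouped metric value: three related measurements in one dictionary.
noise_value = {
    "rms_noise_uv": 9.5,
    "max_noise_uv": 31.0,
    "fraction_bad_channels": 0.02,
}


def evaluate_noise_rule(value: dict) -> str:
    """Return "Pass" only if every field satisfies its (assumed) threshold."""
    checks = [
        value["rms_noise_uv"] < 15.0,
        value["max_noise_uv"] < 50.0,
        value["fraction_bad_channels"] < 0.05,
    ]
    return "Pass" if all(checks) else "Fail"


noise_status = evaluate_noise_rule(noise_value)
print(noise_status)
```

The rule applied to the dictionary is what the metric's description should spell out, so that a reader can see why the grouped value passed or failed.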
Each QCMetric has a Status. The Status should depend directly on QCMetric.value, either by a simple function: “value>5”, or by a qualitative rule: “Field of view includes visual areas”. The QCMetric.description field should describe the rule used to set the status. Metrics can be evaluated multiple times, in which case the new status should be appended to QCMetric.status_history.
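The sketch below illustrates the append-only status history described above. It uses small stand-in classes (Status, StatusEntry, Metric) rather than the real schema models, and the helper status_on is hypothetical. It assumes that the status in effect on a given date is the most recent entry at or before that date.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class Status(Enum):
    """Stand-in for the schema's Status enum."""
    PASS = "Pass"
    FAIL = "Fail"
    PENDING = "Pending"


@dataclass
class StatusEntry:
    """Stand-in for QCStatus: who set the status, what it is, and when."""
    evaluator: str
    status: Status
    timestamp: datetime


@dataclass
class Metric:
    """Stand-in for QCMetric, reduced to the fields used here."""
    name: str
    value: float
    description: str
    status_history: List[StatusEntry] = field(default_factory=list)


def status_on(metric: Metric, date: datetime) -> Optional[Status]:
    """Return the most recent status set at or before `date`, if any."""
    earlier = [s for s in metric.status_history if s.timestamp <= date]
    if not earlier:
        return None
    return max(earlier, key=lambda s: s.timestamp).status


metric = Metric(name="Drift", value=7.2, description="Pass when value > 5")

# Automated evaluation of the simple rule "value > 5".
t1 = datetime(2022, 11, 22, tzinfo=timezone.utc)
auto_status = Status.PASS if metric.value > 5 else Status.FAIL
metric.status_history.append(StatusEntry("Automated", auto_status, t1))

# A later manual re-evaluation is appended; the earlier entry is kept.
t2 = datetime(2022, 12, 1, tzinfo=timezone.utc)
metric.status_history.append(StatusEntry("Hypothetical Reviewer", Status.FAIL, t2))
```

Because earlier entries are never overwritten, the status of the metric can still be reconstructed for any past date.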
Each QCMetric is annotated with three pieces of additional metadata: the Stage during which it was evaluated, the Modality of the evaluated data, and tags.
Curations¶
If you find yourself computing a value for something smaller than an entire modality of data in an asset, you are performing curation, i.e. you are determining the status of a subset of a modality in the data asset. We provide the CurationMetric for this purpose. You should put a dictionary in the CurationMetric.value field that contains a mapping between the subsets (usually neurons, ROIs, channels, etc.) and their values.
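For illustration, a curation value might look like the dictionary below, which maps each subset (here, units) to its curated value. The unit names and labels are made up, and this plain dictionary only shows the shape of the mapping, not how CurationMetric itself is constructed.

```python
# Hypothetical curation value: one entry per unit (neuron) in the modality.
curation_value = {
    "unit_0": "good",
    "unit_1": "noise",
    "unit_2": "good",
    "unit_3": "multi-unit",
}

# Downstream code can then select the subset that passed curation.
good_units = [unit for unit, label in curation_value.items() if label == "good"]
print(good_units)  # ['unit_0', 'unit_2']
```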
QualityControl.evaluate_status()¶
You can evaluate the state of a set of metrics filtered by any combination of modalities, stages, and tags on a specific date (by default, today). When evaluating the Status of a group of metrics the following rules apply:
First, any metric that has a tag value in the QualityControl.allow_tag_failures list is ignored. This allows you to specify that certain metrics are not critical to a data asset.
Then, given the status of all the remaining metrics in the group:
If any metric is still failing, the evaluation fails.
If any metric is pending and the rest pass, the evaluation is pending.
If all metrics pass, the evaluation passes.
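The rules above can be sketched in a few lines of plain Python. This is not the library's implementation: the Status enum here is a stand-in, evaluate_group is a hypothetical helper, and each metric is reduced to its current status and its tags. The sketch also assumes that an empty group passes, which the rules do not state.

```python
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class Status(Enum):
    """Stand-in for the schema's Status enum."""
    PASS = "Pass"
    FAIL = "Fail"
    PENDING = "Pending"


# Each metric is reduced to (current status, tags) for this sketch.
MetricSummary = Tuple[Status, Dict[str, str]]


def evaluate_group(metrics: Iterable[MetricSummary], allow_tag_failures: List[str]) -> Status:
    """Apply the group evaluation rules to already-filtered metrics."""
    allowed = set(allow_tag_failures)
    # Rule 1: ignore any metric carrying a tag value that is allowed to fail.
    remaining = [
        status for status, tags in metrics
        if not allowed.intersection(tags.values())
    ]
    # Rule 2: any remaining failure fails the group.
    if any(status == Status.FAIL for status in remaining):
        return Status.FAIL
    # Rule 3: otherwise, any pending metric leaves the group pending.
    if any(status == Status.PENDING for status in remaining):
        return Status.PENDING
    # Rule 4: everything that remains has passed.
    return Status.PASS


group = [
    (Status.PASS, {"probe": "Probe A"}),
    (Status.PENDING, {"probe": "Probe B"}),
    (Status.FAIL, {"video": "Video 2"}),
]
result = evaluate_group(group, allow_tag_failures=["Video 2"])
print(result.value)
```

With "Video 2" allowed to fail, the failing video metric is ignored and the pending probe metric decides the outcome; without it, the same group fails.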
Q: What is a metric reference?
Each QCMetric should include a QCMetric.reference. References should be publicly accessible images, figures, multi-panel figures, and videos that support the metric value/status or provide the information necessary for manual annotation.
It’s good practice to share a single multi-panel figure across multiple references to simplify viewing the quality control.
Q: What are the status options for metrics?
In our quality control a metric’s status is always PASS, PENDING (waiting for manual annotation), or FAIL.
We enforce this minimal set of states to prevent ambiguity and make it easier to build tools that can interpret the status of a data asset.
Multi-asset QC¶
During analysis there are many situations where multiple data assets need to be pulled together, often for comparison. For example, FOVs across imaging sessions or recording sessions from a chronic probe might need to be matched up across days. When a QCMetric is calculated from multiple assets, it should be tagged with Stage.MULTI_ASSET, and it needs to track the assets that were used to generate it in its evaluated_assets list.
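As a rough sketch of this bookkeeping, the snippet below represents metrics as plain dictionaries and checks that every multi-asset metric lists the assets it was computed from. The asset names, the metric values, and the helper missing_asset_tracking are hypothetical, and the dictionaries only stand in for real QCMetric objects.

```python
# Hypothetical asset names for two imaging sessions of the same field of view.
assets = ["session_A_2024-01-01", "session_A_2024-01-08"]

metrics = [
    {
        "name": "FOV match across sessions",
        "stage": "MULTI_ASSET",  # stands in for the multi-asset stage
        "value": 0.93,  # made-up correlation between session mean images
        "evaluated_assets": assets,
    },
    {
        "name": "Chronic probe unit match",
        "stage": "MULTI_ASSET",
        "value": 0.71,
        "evaluated_assets": None,  # forgot to record the source assets
    },
    {
        "name": "Frame count",
        "stage": "RAW",  # single-asset metrics leave evaluated_assets unset
        "value": 662,
        "evaluated_assets": None,
    },
]


def missing_asset_tracking(metric_list):
    """Return names of multi-asset metrics that do not list their source assets."""
    return [
        m["name"]
        for m in metric_list
        if m["stage"] == "MULTI_ASSET" and not m.get("evaluated_assets")
    ]


untracked = missing_asset_tracking(metrics)
print(untracked)  # ['Chronic probe unit match']
```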
Example¶
"""Example quality control processing"""

from datetime import datetime, timezone
import argparse

from aind_data_schema_models.modalities import Modality

from aind_data_schema.core.quality_control import QualityControl, QCMetric, Stage, Status, QCStatus

t = datetime(2022, 11, 22, 0, 0, 0, tzinfo=timezone.utc)

s = QCStatus(evaluator="Automated", status=Status.PASS, timestamp=t)
sp = QCStatus(evaluator="", status=Status.PENDING, timestamp=t)

# Example of how to use a dictionary to provide options for a metric in the QC portal
drift_value_with_options = {
    "value": "",
    "options": ["Low", "Medium", "High"],
    "status": [
        "Pass",
        "Fail",
        "Fail",
    ],  # when set, this field will be used to automatically parse the status, blank forces manual update
    "type": "dropdown",
}

# Example of how to use a dictionary to provide multiple checkable flags, some of which will fail the metric
drift_value_with_flags = {
    "value": "",
    "options": [
        "No Drift",
        "Drift visible in part of acquisition",
        "Drift visible in entire acquisition",
        "Sudden movement event",
    ],
    "status": ["Pass", "Pass", "Fail", "Fail"],
    "type": "checkbox",
}

metrics = [
    QCMetric(
        name="Probe A drift",
        modality=Modality.ECEPHYS,
        stage=Stage.RAW,
        description="Pass when drift map shows minimal movement",
        value=drift_value_with_options,
        reference="ecephys-drift-map",
        status_history=[sp],
        tags={
            "probe": "Probe A",
        },
    ),
    QCMetric(
        name="Probe B drift",
        modality=Modality.ECEPHYS,
        stage=Stage.RAW,
        description="Pass when drift map shows minimal movement",
        value=drift_value_with_flags,
        reference="ecephys-drift-map",
        status_history=[sp],
        tags={
            "probe": "Probe B",
        },
    ),
    QCMetric(
        name="Probe C drift",
        modality=Modality.ECEPHYS,
        stage=Stage.RAW,
        description="Pass when drift map shows minimal movement",
        value="Low",
        reference="ecephys-drift-map",
        status_history=[s],
        tags={
            "probe": "Probe C",
        },
    ),
    QCMetric(
        name="ProbeA",
        modality=Modality.ECEPHYS,
        stage=Stage.RAW,
        description="Pass when probe is present in the recording",
        value=True,
        status_history=[s],
        tags={
            "probe": "Probe A",
        },
    ),
    QCMetric(
        name="ProbeB",
        modality=Modality.ECEPHYS,
        stage=Stage.RAW,
        description="Pass when probe is present in the recording",
        value=True,
        status_history=[s],
        tags={
            "probe": "Probe B",
        },
    ),
    QCMetric(
        name="ProbeC",
        modality=Modality.ECEPHYS,
        stage=Stage.RAW,
        description="Pass when probe is present in the recording",
        value=True,
        status_history=[s],
        tags={
            "probe": "Probe C",
        },
    ),
    QCMetric(
        name="Video 1 frame count",
        modality=Modality.BEHAVIOR_VIDEOS,
        stage=Stage.RAW,
        description="Pass when frame count matches expected",
        value=662,
        status_history=[s],
        tags={
            "video": "Video 1",
        },
    ),
    QCMetric(
        name="Video 2 num frames",
        modality=Modality.BEHAVIOR_VIDEOS,
        stage=Stage.RAW,
        description="Pass when frame count matches expected",
        value=662,
        status_history=[s],
        tags={
            "video": "Video 2",
        },
    ),
]

q = QualityControl(
    metrics=metrics,
    # in visualizations split first by modality, then by probe / video tags
    default_grouping=["modality", ("probe", "video")],
    # allow any metrics with tag video: Video 2 to fail without failing overall QC
    allow_tag_failures=["Video 2"],
)

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--output-dir", default=None, help="Output directory for generated JSON file")
    args = parser.parse_args()

    serialized = q.model_dump_json()
    deserialized = QualityControl.model_validate_json(serialized)
    q.write_standard_file(output_directory=args.output_dir)
Core file¶
QualityControl¶
Collection of quality control metrics evaluated on a data asset to determine pass/fail status
| Field | Type | Title (Description) |
|---|---|---|
| metrics | List[QCMetric or CurationMetric] | Evaluations |
| | | Key experimenters (Experimenters who are responsible for quality control of this data asset) |
| | | Notes |
| default_grouping | | Default grouping (Tag keys that should be used to group metrics hierarchically for visualization) |
| allow_tag_failures | | Allow tag failures (List of tag values that are allowed to fail without failing the overall QC) |
| | | Status mapping (Mapping of tags, modalities, and stages to their evaluated status, automatically computed) |
Model definitions¶
CurationHistory¶
Schema to track curator name and timestamp for curation events
| Field | Type | Title (Description) |
|---|---|---|
| | | Curator |
| | | Timestamp |
CurationMetric¶
Description of a curation metric
| Field | Type | Title (Description) |
|---|---|---|
| value | | Curation value |
| | | Curation type |
| | List[CurationHistory] | Curation history |
| name | | Metric name |
| modality | | Modality |
| stage | | Evaluation stage |
| status_history | List[QCStatus] | Metric status history |
| description | | Metric description (Describes the measured value and the rule that links the value and status.) |
| reference | | Metric reference image URL or plot type |
| tags | | Tags (Tags group QCMetric objects. Unique keys define groups of tags, for example {‘probe’: ‘probeA’}.) |
| evaluated_assets | | List of asset names that this metric depends on (Set to None except when a metric’s calculation required data coming from a different data asset.) |
QCMetric¶
Description of a single quality control metric
| Field | Type | Title (Description) |
|---|---|---|
| name | | Metric name |
| modality | | Modality |
| stage | | Evaluation stage |
| value | | Metric value |
| status_history | List[QCStatus] | Metric status history |
| description | | Metric description (Describes the measured value and the rule that links the value and status.) |
| reference | | Metric reference image URL or plot type |
| tags | | Tags (Tags group QCMetric objects. Unique keys define groups of tags, for example {‘probe’: ‘probeA’}.) |
| evaluated_assets | | List of asset names that this metric depends on (Set to None except when a metric’s calculation required data coming from a different data asset.) |
QCStatus¶
Description of a QC status, set by an evaluator
| Field | Type | Title (Description) |
|---|---|---|
| evaluator | | Status evaluator full name |
| status | | Status |
| timestamp | | Status date |
Stage¶
Quality control stage
When during data processing the QC metrics were derived.
| Name | Value |
|---|---|
| | |
| | |
| | |
| | |
Status¶
QC Status
| Name | Value |
|---|---|
| | |
| | |
| | |