mex.common.models.base package¶
Submodules¶
mex.common.models.base.entity module¶
- class mex.common.models.base.entity.BaseEntity¶
Bases:
BaseModel
Abstract base model for extracted data, merged item and rule set classes.
This class gives type hints for an identifier field, the frozen entityType field and the frozen class variable stemType. Subclasses should implement all three fields while setting the correct identifier type as well as the correct literal values for the entity and stem types.
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
mex.common.models.base.extracted_data module¶
- class mex.common.models.base.extracted_data.ExtractedData(*, hadPrimarySource: MergedPrimarySourceIdentifier, identifierInPrimarySource: Annotated[str, MinLen(min_length=1), MaxLen(max_length=1000), _PydanticGeneralMetadata(pattern='^[^\\n\\r]+$')])¶
Bases:
BaseEntity
Base model for all extracted data classes.
This class adds two important attributes for metadata provenance: hadPrimarySource and identifierInPrimarySource, which are used to uniquely identify an item in its original primary source. The attribute stableTargetId has to be set by each concrete subclass, like ExtractedPerson, because it needs to have the correct type, e.g. MergedPersonIdentifier.
This class also adds a validator to automatically set identifiers for provenance. See below for a full description.
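The composite key described above can be sketched as follows. This is a minimal illustration, assuming an in-memory lookup table in place of the real identity provider; `InMemoryIdentityProvider`, its `assign` method and the UUID-based identifiers are hypothetical and not part of mex.common.

```python
import uuid


class InMemoryIdentityProvider:
    """Hypothetical stand-in for an identity provider (not the mex.common API)."""

    def __init__(self) -> None:
        # Maps (hadPrimarySource, identifierInPrimarySource) to assigned ids.
        self._identities: dict[tuple[str, str], dict[str, str]] = {}

    def assign(
        self, had_primary_source: str, identifier_in_primary_source: str
    ) -> dict[str, str]:
        """Return the same identifiers whenever the same composite key is seen."""
        key = (had_primary_source, identifier_in_primary_source)
        if key not in self._identities:
            # Assumption: random UUIDs stand in for the real identifier format.
            self._identities[key] = {
                "identifier": uuid.uuid4().hex,
                "stableTargetId": uuid.uuid4().hex,
            }
        return self._identities[key]


provider = InMemoryIdentityProvider()
first = provider.assign("primary-source-a", "item-501")
again = provider.assign("primary-source-a", "item-501")
other = provider.assign("primary-source-b", "item-501")
```

Here `first` and `again` share one identity because the composite key matches, while `other` gets a new one: the same `identifierInPrimarySource` under a different `hadPrimarySource` is a different item.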
- _get_identifier(identifier_type: type[_ExtractedIdentifierT]) _ExtractedIdentifierT ¶
Consult the identity provider to get the identifier for this item.
- Parameters:
identifier_type – ExtractedIdentifier-subclass to cast the identifier to
- Returns:
Identifier of the correct type
- _get_stable_target_id(identifier_type: type[_MergedIdentifierT]) _MergedIdentifierT ¶
Consult the identity provider to get the stableTargetId for this item.
- Parameters:
identifier_type – MergedIdentifier-subclass to cast the identifier to
- Returns:
StableTargetId of the correct type
- entityType: str¶
- hadPrimarySource: Annotated[MergedPrimarySourceIdentifier, FieldInfo(annotation=NoneType, required=True, description='The stableTargetId of the primary source, that this item was extracted from. This field is mandatory for all extracted items to aid with data provenance. Extracted primary sources also have this field and are all extracted from a static primary source for MEx. The extracted primary source for MEx has its own merged item as a primary source.', frozen=True)]¶
- identifierInPrimarySource: Annotated[str, FieldInfo(annotation=NoneType, required=True, description='This is the identifier the original item had in its source system. It is only unique amongst items coming from the same system, because identifier formats are likely to overlap between systems. The value for `identifierInPrimarySource` is therefore only unique in composition with `hadPrimarySource`. MEx uses this composite key to assign a stable and globally unique `identifier` per extracted item.', examples=['123456', 'item-501', 'D7/x4/zz.final3'], frozen=True, metadata=[MinLen(min_length=1), MaxLen(max_length=1000), _PydanticGeneralMetadata(pattern='^[^\\n\\r]+$')])]¶
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {'hadPrimarySource': FieldInfo(annotation=MergedPrimarySourceIdentifier, required=True, description='The stableTargetId of the primary source, that this item was extracted from. This field is mandatory for all extracted items to aid with data provenance. Extracted primary sources also have this field and are all extracted from a static primary source for MEx. The extracted primary source for MEx has its own merged item as a primary source.', frozen=True), 'identifierInPrimarySource': FieldInfo(annotation=str, required=True, description='This is the identifier the original item had in its source system. It is only unique amongst items coming from the same system, because identifier formats are likely to overlap between systems. The value for `identifierInPrimarySource` is therefore only unique in composition with `hadPrimarySource`. MEx uses this composite key to assign a stable and globally unique `identifier` per extracted item.', examples=['123456', 'item-501', 'D7/x4/zz.final3'], frozen=True, metadata=[MinLen(min_length=1), MaxLen(max_length=1000), _PydanticGeneralMetadata(pattern='^[^\\n\\r]+$')])}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- stemType: ClassVar¶
mex.common.models.base.field_info module¶
mex.common.models.base.filter module¶
- class mex.common.models.base.filter.EntityFilter(*, fieldInPrimarySource: str, locationInPrimarySource: str | None = None, examplesInPrimarySource: list[str] | None = None, mappingRules: Annotated[list[EntityFilterRule], MinLen(min_length=1)], comment: str | None = None)¶
Bases:
BaseModel
Entity filter model.
- comment: str | None¶
- examplesInPrimarySource: list[str] | None¶
- fieldInPrimarySource: str¶
- locationInPrimarySource: str | None¶
- mappingRules: Annotated[list[EntityFilterRule], FieldInfo(annotation=NoneType, required=True, metadata=[MinLen(min_length=1)])]¶
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {'comment': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'examplesInPrimarySource': FieldInfo(annotation=Union[list[str], NoneType], required=False, default=None), 'fieldInPrimarySource': FieldInfo(annotation=str, required=True), 'locationInPrimarySource': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'mappingRules': FieldInfo(annotation=list[EntityFilterRule], required=True, metadata=[MinLen(min_length=1)])}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- class mex.common.models.base.filter.EntityFilterRule(*, forValues: list[str] | None = None, rule: str | None = None)¶
Bases:
BaseModel
Entity filter rule model.
- forValues: list[str] | None¶
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {'forValues': FieldInfo(annotation=Union[list[str], NoneType], required=False, default=None), 'rule': FieldInfo(annotation=Union[str, NoneType], required=False, default=None)}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- rule: str | None¶
- mex.common.models.base.filter.generate_entity_filter_schema(extracted_model: type[AnyExtractedModel]) type[BaseModel] ¶
Create a mapping schema for an entity filter for an extracted model class.
Example entity filter: If activity starts before 2016: do not extract.
- Parameters:
extracted_model – a pydantic model for an extracted model class
- Returns:
model of the mapping schema for an entity filter
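To make the shape of an entity filter concrete, here is a minimal sketch using plain dataclasses that mirror the documented `EntityFilter` and `EntityFilterRule` fields. It is not the pydantic implementation, and the value `"start"` for `fieldInPrimarySource` is a made-up example.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class EntityFilterRule:
    """Stand-in mirroring the documented EntityFilterRule fields."""

    forValues: list[str] | None = None
    rule: str | None = None


@dataclass
class EntityFilter:
    """Stand-in mirroring the documented EntityFilter fields."""

    fieldInPrimarySource: str
    mappingRules: list[EntityFilterRule]
    locationInPrimarySource: str | None = None
    examplesInPrimarySource: list[str] | None = None
    comment: str | None = None

    def __post_init__(self) -> None:
        # Mirrors the MinLen(min_length=1) constraint on mappingRules.
        if len(self.mappingRules) < 1:
            raise ValueError("mappingRules needs at least one rule")


# Hypothetical filter expressing the example rule from the docstring.
activity_filter = EntityFilter(
    fieldInPrimarySource="start",
    mappingRules=[
        EntityFilterRule(rule="If activity starts before 2016: do not extract.")
    ],
)
```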
mex.common.models.base.mapping module¶
- class mex.common.models.base.mapping.GenericField(*, fieldInPrimarySource: str, locationInPrimarySource: str | None = None, examplesInPrimarySource: list[str] | None = None, mappingRules: Annotated[list[GenericRule], MinLen(min_length=1)], comment: str | None = None)¶
Bases:
BaseModel
Generic Field model.
- comment: str | None¶
- examplesInPrimarySource: list[str] | None¶
- fieldInPrimarySource: str¶
- locationInPrimarySource: str | None¶
- mappingRules: Annotated[list[GenericRule], FieldInfo(annotation=NoneType, required=True, metadata=[MinLen(min_length=1)])]¶
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {'comment': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'examplesInPrimarySource': FieldInfo(annotation=Union[list[str], NoneType], required=False, default=None), 'fieldInPrimarySource': FieldInfo(annotation=str, required=True), 'locationInPrimarySource': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'mappingRules': FieldInfo(annotation=list[GenericRule], required=True, metadata=[MinLen(min_length=1)])}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- class mex.common.models.base.mapping.GenericRule(*, forValues: list[str] | None = None, setValues: list[Any] | None = None, rule: str | None = None)¶
Bases:
BaseModel
Generic mapping rule model.
- forValues: list[str] | None¶
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid'}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {'forValues': FieldInfo(annotation=Union[list[str], NoneType], required=False, default=None), 'rule': FieldInfo(annotation=Union[str, NoneType], required=False, default=None), 'setValues': FieldInfo(annotation=Union[list[Any], NoneType], required=False, default=None)}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- rule: str | None¶
- setValues: list[Any] | None¶
- mex.common.models.base.mapping.generate_mapping_schema(extracted_model: type[AnyExtractedModel]) type[BaseModel] ¶
Create a mapping schema for the MEx extracted model class.
Pydantic models are dynamically created for the given entity type, depending on its fields and their types.
- Parameters:
extracted_model – a pydantic model for an extracted model class
- Returns:
dynamic mapping model for the provided extracted model class
mex.common.models.base.merged_item module¶
- class mex.common.models.base.merged_item.MergedItem¶
Bases:
BaseEntity
Base model for all merged item classes.
- entityType: str¶
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- stemType: ClassVar¶
mex.common.models.base.model module¶
- class mex.common.models.base.model.BaseModel¶
Bases:
pydantic.BaseModel
Common base class for all MEx model classes.
- classmethod _convert_list_to_non_list(field_name: str, value: list[Any]) Any ¶
Convert a list value to a non-list value by unpacking it if possible.
- classmethod _convert_non_list_to_list(field_name: str, value: Any) list[Any] | None ¶
Convert a non-list value to a list value by wrapping it in a list.
- classmethod _fix_value_listyness_for_field(field_name: str, value: Any) Any ¶
Check actual and desired shape of a value and fix it if necessary.
- classmethod _get_alias_lookup() dict[str, str] ¶
Build a cached mapping from field aliases to field names.
- classmethod _get_field_names_allowing_none() list[str] ¶
Build a cached list of fields that can be set to None.
- classmethod _get_list_field_names() list[str] ¶
Build a cached list of fields that look like lists.
- checksum() str ¶
Calculate md5 checksum for this model.
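As a rough illustration of such a checksum, the sketch below hashes a canonical JSON dump of a plain dictionary. How the real method serializes the model is not stated here, so the sorted-key JSON form and the standalone `checksum` function are assumptions.

```python
import hashlib
import json
from typing import Any


def checksum(payload: dict[str, Any]) -> str:
    """Return the MD5 hex digest of a canonical JSON dump of the payload."""
    # Assumption: sorting keys yields a stable serialization, so that two
    # payloads with equal content produce the same digest.
    canonical = json.dumps(payload, sort_keys=True, ensure_ascii=False)
    # MD5 is used for change detection only, not for security.
    return hashlib.md5(canonical.encode("utf-8"), usedforsecurity=False).hexdigest()


same_a = checksum({"identifier": "abc", "title": "Example"})
same_b = checksum({"title": "Example", "identifier": "abc"})
different = checksum({"identifier": "abc", "title": "Other"})
```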
- classmethod fix_listyness(data: Any, handler: ValidatorFunctionWrapHandler) Any ¶
Adjust the listyness of to-be-parsed data to match the desired shape.
If the data is a Mapping and the model defines a list[T] field but the raw data contains just a value of type T, it will be wrapped into a list. If the raw data contains a literal None, but the list field is defined as required, we substitute an empty list.
If the model does not expect a list, but the raw data contains a list with no entries, it will be substituted with None. If the raw data contains exactly one entry, then it will be unpacked from the list. If it contains more than one entry however, an error is raised, because we would not know which to choose.
- Parameters:
data – Raw data or instance to be parsed
handler – Validator function wrap handler
- Returns:
data with fixed list shapes
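The rules above can be sketched as a standalone function. This is not the validator itself: `fix_listyness_sketch` and its `list_fields` and `required_list_fields` parameters are hypothetical stand-ins for information the real classmethod derives from the model's field definitions.

```python
from collections.abc import Mapping
from typing import Any


def fix_listyness_sketch(
    data: Any, list_fields: set[str], required_list_fields: set[str]
) -> Any:
    """Adjust list shapes of raw data following the documented rules."""
    if not isinstance(data, Mapping):
        return data  # only mappings are adjusted
    fixed = dict(data)
    for name, value in data.items():
        if name in list_fields:
            if value is None:
                # A required list field gets an empty list instead of None.
                if name in required_list_fields:
                    fixed[name] = []
            elif not isinstance(value, list):
                fixed[name] = [value]  # wrap a single value into a list
        elif isinstance(value, list):
            if len(value) == 0:
                fixed[name] = None  # empty list becomes None
            elif len(value) == 1:
                fixed[name] = value[0]  # unpack the only entry
            else:
                # More than one entry: we cannot know which one to choose.
                raise ValueError(f"got multiple values for non-list field {name!r}")
    return fixed
```

For example, `fix_listyness_sketch({"title": ["A"], "tags": "x"}, {"tags"}, set())` yields `{"title": "A", "tags": ["x"]}`.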
- classmethod get_all_fields() dict[str, GenericFieldInfo] ¶
Return a combined dict of defined and computed fields.
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- classmethod model_json_schema(by_alias: bool = True, ref_template: str = '#/$defs/{model}', schema_generator: type[~pydantic.json_schema.GenerateJsonSchema] = <class 'mex.common.models.base.schema.JsonSchemaGenerator'>, mode: ~typing.Literal['validation', 'serialization'] = 'validation') dict[str, Any] ¶
Generates a JSON schema for a model class.
- Parameters:
by_alias – Whether to use attribute aliases or not.
ref_template – The reference template.
schema_generator – Overriding the logic used to generate the JSON schema
mode – The mode in which to generate the schema.
- Returns:
The JSON schema for the given model class.
- classmethod verify_computed_field_consistency(data: Any, handler: ValidatorFunctionWrapHandler) Any ¶
Validate that parsed values for computed fields are consistent.
Parsing a dictionary with a value for a computed field that is consistent with what that field would have computed anyway is allowed. Omitting values for computed fields is perfectly valid as well. However, if the parsed value is different from the computed value, a validation error is raised.
- Parameters:
data – Raw data or instance to be parsed
handler – Validator function wrap handler
- Returns:
data with consistent computed fields.
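A minimal sketch of this consistency check, assuming computed fields can be modeled as plain functions of the other fields. The function name `check_computed_fields`, the `computed` mapping and the `fullName` example are hypothetical and only illustrate the three cases described above.

```python
from typing import Any, Callable


def check_computed_fields(
    data: dict[str, Any], computed: dict[str, Callable[[dict[str, Any]], Any]]
) -> dict[str, Any]:
    """Drop consistent computed values from the data, reject inconsistent ones."""
    # Regular fields are everything that is not a computed field.
    regular = {key: value for key, value in data.items() if key not in computed}
    for name, compute in computed.items():
        # Omitted computed values are fine; provided ones must match.
        if name in data and data[name] != compute(regular):
            raise ValueError(f"inconsistent value for computed field {name!r}")
    return regular


# Hypothetical computed field derived from two regular fields.
computed_fields = {"fullName": lambda d: f"{d['givenName']} {d['familyName']}"}

consistent = check_computed_fields(
    {"givenName": "Ada", "familyName": "Lovelace", "fullName": "Ada Lovelace"},
    computed_fields,
)
omitted = check_computed_fields(
    {"givenName": "Ada", "familyName": "Lovelace"}, computed_fields
)
```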
mex.common.models.base.rules module¶
- class mex.common.models.base.rules.AdditiveRule¶
Bases:
BaseEntity
Base rule to add values to merged items.
- entityType: str¶
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- stemType: ClassVar¶
- class mex.common.models.base.rules.PreventiveRule¶
Bases:
BaseEntity
Base rule to prevent primary sources for fields of merged items.
- entityType: str¶
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- stemType: ClassVar¶
- class mex.common.models.base.rules.RuleSet¶
Bases:
BaseEntity
Base class for a set of an additive, subtractive and preventive rule.
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- class mex.common.models.base.rules.SubtractiveRule¶
Bases:
BaseEntity
Base rule to subtract values from merged items.
- entityType: str¶
- model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}¶
A dictionary of computed field names and their corresponding ComputedFieldInfo objects.
- model_config: ClassVar[ConfigDict] = {'extra': 'forbid', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}¶
Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- model_fields: ClassVar[Dict[str, FieldInfo]] = {}¶
Metadata about the fields defined on the model, mapping of field names to pydantic.fields.FieldInfo objects.
This replaces Model.__fields__ from Pydantic V1.
- stemType: ClassVar¶
mex.common.models.base.schema module¶
- class mex.common.models.base.schema.JsonSchemaGenerator(by_alias: bool = True, ref_template: str = '#/$defs/{model}')¶
Bases:
GenerateJsonSchema
Customization of the pydantic class for generating JSON schemas.
- handle_ref_overrides(json_schema: Dict[str, Any]) Dict[str, Any] ¶
Disable pydantic behavior to wrap top-level $ref keys in an allOf.
For example, pydantic would convert
{"$ref": "#/$defs/APIType", "examples": ["api-type-1"]}
into
{"allOf": [{"$ref": "#/$defs/APIType"}], "examples": ["api-type-1"]}
which is in fact recommended by JSON schema, but we need to disable this to stay compatible with mex-editor and mex-model.
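To show how the two shapes relate, the sketch below collapses a wrapped reference back into a top-level `$ref`. It assumes the wrapped form carries its `allOf` members in a list, as JSON Schema requires; `unwrap_single_ref` is a hypothetical helper for illustration, not what the generator does internally.

```python
from typing import Any


def unwrap_single_ref(schema: dict[str, Any]) -> dict[str, Any]:
    """Turn {"allOf": [{"$ref": ...}], ...} back into {"$ref": ..., ...}."""
    all_of = schema.get("allOf")
    is_single_ref = (
        isinstance(all_of, list)
        and len(all_of) == 1
        and isinstance(all_of[0], dict)
        and set(all_of[0]) == {"$ref"}
        and "$ref" not in schema
    )
    if not is_single_ref:
        return schema  # leave every other schema untouched
    siblings = {key: value for key, value in schema.items() if key != "allOf"}
    return {"$ref": all_of[0]["$ref"], **siblings}


wrapped = {"allOf": [{"$ref": "#/$defs/APIType"}], "examples": ["api-type-1"]}
flat = unwrap_single_ref(wrapped)
```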