mex.extractors.voxco package

Submodules

mex.extractors.voxco.extract module

mex.extractors.voxco.extract.extract_ldap_persons_voxco(voxco_resource_mappings: list[Any]) → list[LDAPPerson]

Extract LDAP persons for voxco.

Parameters:

voxco_resource_mappings – list of resource mapping models with default values

Returns:

list of LDAP persons

mex.extractors.voxco.extract.extract_voxco_organizations(voxco_resource_mappings: list[Any]) → dict[str, WikidataOrganization]

Search wikidata for voxco organizations and extract them.

Parameters:

voxco_resource_mappings – voxco resource mapping models

Returns:

dict mapping each organization label to its WikidataOrganization

mex.extractors.voxco.extract.extract_voxco_variables() → dict[str, list[VoxcoVariable]]

Extract voxco variables by loading data from mex-drop source json file.

Returns:

lists of voxco variables by json file name
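The return shape (lists of variables keyed by JSON file name) can be illustrated with a minimal, stdlib-only sketch. It does not use the mex-drop connector. `VariableStandIn`, `load_variables_by_file_name`, and `demo` are hypothetical names, and both the file layout (a top-level JSON list per file) and the use of the file stem as key are assumptions made for illustration.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class VariableStandIn:
    """Plain stand-in for VoxcoVariable (the real class is a pydantic model)."""

    Id: int
    DataType: str
    Type: str
    QuestionText: str
    Choices: list[str]
    Text: str


def load_variables_by_file_name(directory: Path) -> dict[str, list[VariableStandIn]]:
    """Read every *.json file in `directory` and group variables by file stem."""
    variables_by_file: dict[str, list[VariableStandIn]] = {}
    for json_file in sorted(directory.glob("*.json")):
        # Assumption: each file holds a top-level JSON list of variable objects.
        rows = json.loads(json_file.read_text(encoding="utf-8"))
        variables_by_file[json_file.stem] = [VariableStandIn(**row) for row in rows]
    return variables_by_file


def demo() -> dict[str, list[VariableStandIn]]:
    """Write one sample file to a temporary directory and load it back."""
    sample = [
        {
            "Id": 1,
            "DataType": "Text",
            "Type": "Discrete",
            "QuestionText": "Age group?",
            "Choices": ["18-29", "30+"],
            "Text": "age",
        }
    ]
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "survey_a.json").write_text(json.dumps(sample), encoding="utf-8")
        return load_variables_by_file_name(Path(tmp))
```

Calling `demo()` yields a dictionary with the single key `survey_a`, whose value is a one-element list of variables.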

mex.extractors.voxco.main module

mex.extractors.voxco.model module

class mex.extractors.voxco.model.VoxcoVariable(*, Id: int, DataType: str, Type: str, QuestionText: Annotated[str, MinLen(min_length=0)], Choices: list[str], Text: Annotated[str, MinLen(min_length=0)])

Bases: BaseModel

Model class for Voxco Variable.

Choices: list[str]
DataType: str
Id: int
QuestionText: str
Text: str
Type: str
model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'Choices': FieldInfo(annotation=list[str], required=True), 'DataType': FieldInfo(annotation=str, required=True), 'Id': FieldInfo(annotation=int, required=True), 'QuestionText': FieldInfo(annotation=str, required=True, metadata=[MinLen(min_length=0)]), 'Text': FieldInfo(annotation=str, required=True, metadata=[MinLen(min_length=0)]), 'Type': FieldInfo(annotation=str, required=True)}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.

mex.extractors.voxco.settings module

class mex.extractors.voxco.settings.VoxcoSettings(*, mapping_path: AssetsPath = AssetsPath('mappings/__final__/voxco'))

Bases: BaseModel

Settings submodel for the Voxco extractor.

mapping_path: AssetsPath
model_computed_fields: ClassVar[Dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_default': True}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

model_fields: ClassVar[Dict[str, FieldInfo]] = {'mapping_path': FieldInfo(annotation=AssetsPath, required=False, default=AssetsPath("mappings/__final__/voxco"), description='Path to the directory with the voxco mapping files containing the default values, absolute path or relative to `assets_dir`.')}

Metadata about the fields defined on the model: a mapping of field names to pydantic.fields.FieldInfo objects.

This replaces Model.__fields__ from Pydantic V1.
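The field description for `mapping_path` states that the path is either absolute or relative to `assets_dir`. The sketch below illustrates only that documented rule. It is not the `AssetsPath` implementation, `resolve_against_assets_dir` is a hypothetical helper, and `/srv/assets` is a made-up directory. `PurePosixPath` is used so the result does not depend on the host operating system.

```python
from pathlib import PurePosixPath


def resolve_against_assets_dir(path: str, assets_dir: str) -> PurePosixPath:
    """Return `path` unchanged if it is absolute, else join it onto `assets_dir`."""
    candidate = PurePosixPath(path)
    if candidate.is_absolute():
        return candidate
    return PurePosixPath(assets_dir) / candidate


# The default value from the settings model, resolved against a made-up assets_dir.
default_mapping_dir = resolve_against_assets_dir("mappings/__final__/voxco", "/srv/assets")
```

With these inputs, `default_mapping_dir` is `/srv/assets/mappings/__final__/voxco`, while an absolute input such as `/etc/voxco` is returned as given.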

mex.extractors.voxco.transform module

mex.extractors.voxco.transform.transform_voxco_resource_mappings_to_extracted_resources(voxco_resource_mappings: list[Any], organization_stable_target_id_by_query_voxco: dict[str, MergedOrganizationIdentifier], extracted_mex_persons_voxco: list[ExtractedPerson], unit_stable_target_ids_by_synonym: dict[str, MergedOrganizationalUnitIdentifier], extracted_organization_rki: ExtractedOrganization, extracted_primary_source_voxco: ExtractedPrimarySource, extracted_international_projects_activities: list[ExtractedActivity]) → dict[str, ExtractedResource]

Transform voxco resource mappings to extracted resources.

Parameters:
  • voxco_resource_mappings – voxco resource mapping models

  • organization_stable_target_id_by_query_voxco – extracted voxco organizations dict

  • extracted_mex_persons_voxco – extracted voxco mex persons

  • unit_stable_target_ids_by_synonym – merged organizational units by name

  • extracted_organization_rki – extracted rki organization

  • extracted_primary_source_voxco – extracted voxco primary source

  • extracted_international_projects_activities – list of international projects

Returns:

dict of extracted voxco resources by identifier in primary source

mex.extractors.voxco.transform.transform_voxco_variable_mappings_to_extracted_variables(extracted_voxco_resources: dict[str, ExtractedResource], voxco_variables: dict[str, list[VoxcoVariable]], extracted_primary_source_voxco: ExtractedPrimarySource) → list[ExtractedVariable]

Transform voxco variable mappings to extracted variables.

Parameters:
  • extracted_voxco_resources – extracted voxco resources

  • voxco_variables – lists of voxco variables by associated resource

  • extracted_primary_source_voxco – extracted voxco primary source

Returns:

list of extracted variables
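How a dictionary of resources and a dictionary of variable lists might be combined into a flat list can be sketched as follows. This is an illustration under stated assumptions, not the actual transform: all three dataclasses are hypothetical stand-ins, and the rule that both dictionaries share the same keys is assumed rather than taken from the source.

```python
from dataclasses import dataclass


@dataclass
class ResourceStandIn:
    """Stand-in for ExtractedResource with only the fields this sketch needs."""

    identifier_in_primary_source: str
    stable_target_id: str


@dataclass
class VariableStandIn:
    """Stand-in for VoxcoVariable, reduced to two fields."""

    Id: int
    Text: str


@dataclass
class ExtractedVariableStandIn:
    """Stand-in for ExtractedVariable; field names are illustrative only."""

    identifier_in_primary_source: str
    label: str
    used_in: str


def link_variables_to_resources(
    resources: dict[str, ResourceStandIn],
    variables_by_resource: dict[str, list[VariableStandIn]],
) -> list[ExtractedVariableStandIn]:
    """Flatten the per-resource variable lists, attaching each resource's id."""
    linked: list[ExtractedVariableStandIn] = []
    for resource_key, variables in variables_by_resource.items():
        # Assumption: both dictionaries are keyed by the same identifier.
        resource = resources.get(resource_key)
        if resource is None:
            # Variables without a matching resource are skipped in this sketch.
            continue
        for variable in variables:
            linked.append(
                ExtractedVariableStandIn(
                    identifier_in_primary_source=str(variable.Id),
                    label=variable.Text,
                    used_in=resource.stable_target_id,
                )
            )
    return linked
```

Skipping unmatched keys is one possible design choice. Raising an error instead would surface mapping gaps earlier, at the cost of aborting the whole run.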

Module contents