mex.extractors.open_data package
Subpackages
- mex.extractors.open_data.models package
 
Submodules
mex.extractors.open_data.connector module
- class mex.extractors.open_data.connector.OpenDataConnector
 Bases: HTTPConnector
 Connector class to handle requesting the Zenodo API.
- _send_request(method: str, url: str, params: Mapping[str, list[str] | str | None] | None, **kwargs: Any) -> Response
 Override HTTPConnector._send_request with more waiting time.
- _set_url() -> None
 Set url of the host.
- get_files_for_resource_version(version_id: int) -> list[OpenDataVersionFiles]
 Load files for each version of a resource by querying the Zenodo API.
- Parameters:
 version_id – id of a resource version
- Returns:
 Zenodo resource version files
- get_oldest_resource_version_creation_date(resource_id: int) -> str | None
 Load the creation date of the oldest (first) version of a resource by querying the Zenodo API.
- Parameters:
 resource_id – id of any resource version
- Returns:
 creation date of the oldest Zenodo resource version, or None
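As a rough illustration of picking the oldest version's creation date, the sketch below works on plain dictionaries instead of real Zenodo responses. The function name oldest_creation_date and the "id" and "created" keys are assumptions made for this example; they are not taken from the connector or from the Zenodo API.

```python
from __future__ import annotations

from datetime import datetime


def oldest_creation_date(versions: list[dict[str, str]]) -> str | None:
    """Return the 'created' timestamp of the oldest version, or None if empty."""
    if not versions:
        return None
    # Parse the ISO 8601 timestamps so the comparison is chronological.
    oldest = min(versions, key=lambda v: datetime.fromisoformat(v["created"]))
    return oldest["created"]


# Hypothetical version records; the keys are assumed for illustration only.
versions = [
    {"id": "3", "created": "2024-05-01T10:00:00+00:00"},
    {"id": "1", "created": "2023-01-15T08:30:00+00:00"},
    {"id": "2", "created": "2023-09-30T12:00:00+00:00"},
]
print(oldest_creation_date(versions))
```

Returning None for an empty list mirrors the str | None return type in the signature above.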
- get_parent_resources() -> list[OpenDataParentResource]
 Load parent resources by querying the Zenodo API.
Gets the parent resources (~ latest version) of all the resources of the configured Zenodo community.
- Returns:
 list of parent resources
mex.extractors.open_data.extract module
- mex.extractors.open_data.extract.extract_files_for_parent_resource(version_id: int) -> list[OpenDataVersionFiles]
 Fetch all files of a resource version.
- Parameters:
 version_id – id of record version as integer
- Returns:
 list of OpenDataVersionFiles
- mex.extractors.open_data.extract.extract_oldest_record_version_creationdate(record_id: int) -> str | None
 Fetch the creation date of the oldest version of a parent resource.
- Parameters:
 record_id – id of record version as integer
- Returns:
 creation date of the oldest record version, or None
- mex.extractors.open_data.extract.extract_open_data_persons_from_open_data_parent_resources(open_data_parent_resource: list[OpenDataParentResource]) -> list[OpenDataCreatorsOrContributors]
 Extract unique Open Data persons from open data parent resources.
- Parameters:
 open_data_parent_resource – list of open data parent resources
- Returns:
 list of extracted open data persons (creators or contributors)
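One way such de-duplication could work is sketched below. Person and ParentResource are hypothetical stand-ins for OpenDataCreatorsOrContributors and OpenDataParentResource, and their fields are assumptions for illustration; the actual models and the actual uniqueness criterion may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Person:
    # Hypothetical stand-in for OpenDataCreatorsOrContributors.
    name: str
    affiliation: str | None = None


@dataclass
class ParentResource:
    # Hypothetical stand-in for OpenDataParentResource.
    creators: list[Person] = field(default_factory=list)
    contributors: list[Person] = field(default_factory=list)


def unique_persons(resources: list[ParentResource]) -> list[Person]:
    """Collect creators and contributors once each, keeping first-seen order."""
    seen: set[Person] = set()
    result: list[Person] = []
    for resource in resources:
        for person in [*resource.creators, *resource.contributors]:
            if person not in seen:
                seen.add(person)
                result.append(person)
    return result


resources = [
    ParentResource(
        creators=[Person("Doe, Jane", "RKI")],
        contributors=[Person("Roe, Richard")],
    ),
    ParentResource(creators=[Person("Doe, Jane", "RKI")]),
]
persons = unique_persons(resources)
print([p.name for p in persons])
```

Frozen dataclasses are hashable, which is what allows the set-based membership check; two persons count as equal here only if both name and affiliation match.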
- mex.extractors.open_data.extract.extract_parent_resources() -> list[OpenDataParentResource]
 Load Open Data resources by querying the Zenodo API.
Get all resources of the configured Zenodo community. These are called ‘parent resources’.
- Returns:
 list of parent resources
mex.extractors.open_data.main module
mex.extractors.open_data.settings module
- class mex.extractors.open_data.settings.OpenDataSettings(*, url: str = 'https://zenodo', community_rki: str = 'robertkochinstitut', mapping_path: AssetsPath = AssetsPath('mappings/open-data'))
 Bases: BaseModel
 Zenodo settings submodel definition for the Open Data extractor.
- community_rki: str
 
- mapping_path: AssetsPath
 
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True, 'validate_default': True}
 Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
- url: str
 
mex.extractors.open_data.transform module
- mex.extractors.open_data.transform.lookup_person_in_ldap_and_transform(person: OpenDataCreatorsOrContributors, units_by_identifier_in_primary_source: dict[str, ExtractedOrganizationalUnit], extracted_organization_rki: ExtractedOrganization) -> ExtractedPerson | None
 Look up person in LDAP and transform to ExtractedPerson.
- Parameters:
 person – Open Data person (creator or contributor)
units_by_identifier_in_primary_source – dict of organizational units by identifier in primary source
extracted_organization_rki – ExtractedOrganization of RKI
- Returns:
 ExtractedPerson if matched or None if match fails
- mex.extractors.open_data.transform.transform_open_data_distributions(open_data_parent_resources: list[OpenDataParentResource], distribution_mapping: DistributionMapping) -> list[ExtractedDistribution]
 Transform open data resource versions to extracted distributions.
- Parameters:
 open_data_parent_resources – list of open data parent resources
distribution_mapping – distribution mapping model with default values
- Returns:
 List of ExtractedDistribution instances
- mex.extractors.open_data.transform.transform_open_data_parent_resource_to_mex_resource(open_data_parent_resource: list[OpenDataParentResource], open_data_persons: list[ExtractedPerson], unit_stable_target_ids_by_synonym: dict[str, MergedOrganizationalUnitIdentifier], open_data_distribution: list[ExtractedDistribution], resource_mapping: ResourceMapping, extracted_organization_rki: ExtractedOrganization, open_data_extracted_contact_points: list[ExtractedContactPoint]) -> list[ExtractedResource]
 Transform open_data parent resources to extracted resources.
- Parameters:
 open_data_parent_resource – open data parent resources
open_data_persons – list of ExtractedPerson
unit_stable_target_ids_by_synonym – Unit stable target ids by synonym
open_data_distribution – list of Extracted open data Distributions
resource_mapping – resource mapping model with default values
extracted_organization_rki – ExtractedOrganization
open_data_extracted_contact_points – list[ExtractedContactPoint]
- Returns:
 list of ExtractedResource instances
- mex.extractors.open_data.transform.transform_open_data_person_affiliations_to_organizations(open_data_creators_contributors: list[OpenDataCreatorsOrContributors]) -> dict[str, MergedOrganizationIdentifier]
 Search Wikidata or create own organizations, load them to the sink and create a dictionary.
- Parameters:
 open_data_creators_contributors – list of creators and contributors
- Returns:
 dictionary of merged organization identifiers by affiliation name
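The "search first, create otherwise" idea behind the affiliation dictionary can be sketched with plain callables. The names map_affiliations, search and create, as well as the identifier strings, are hypothetical; the real function talks to Wikidata and a sink, which this sketch deliberately leaves out.

```python
from __future__ import annotations

from typing import Callable


def map_affiliations(
    affiliations: list[str],
    search: Callable[[str], str | None],
    create: Callable[[str], str],
) -> dict[str, str]:
    """Build a dictionary of organization ids keyed by affiliation name."""
    ids_by_name: dict[str, str] = {}
    for name in affiliations:
        if name in ids_by_name:
            continue  # each affiliation name is resolved only once
        found = search(name)
        # Fall back to creating an own entry when the search finds nothing.
        ids_by_name[name] = found if found is not None else create(name)
    return ids_by_name


# Hypothetical lookup table standing in for an external search.
known = {"Robert Koch-Institut": "external:rki"}
ids = map_affiliations(
    ["Robert Koch-Institut", "Example University", "Robert Koch-Institut"],
    search=known.get,
    create=lambda name: "local:" + name,
)
print(ids)
```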
- mex.extractors.open_data.transform.transform_open_data_persons(open_data_creators_contributors: list[OpenDataCreatorsOrContributors], extracted_organizational_units: list[ExtractedOrganizationalUnit], extracted_organization_rki: ExtractedOrganization, open_data_organization_ids_by_str: dict[str, MergedOrganizationIdentifier]) -> list[ExtractedPerson]
 Look up persons in LDAP or create ExtractedPerson if match fails.
- Parameters:
 open_data_creators_contributors – list of creators or contributors
extracted_organizational_units – list of extracted organizational units
extracted_organization_rki – ExtractedOrganization of RKI
open_data_organization_ids_by_str – dictionary with ID by affiliation name
- Returns:
 list of Extracted Persons
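The lookup-with-fallback flow described here can be sketched as follows, assuming a lookup that returns None when no match is found (as lookup_person_in_ldap_and_transform does according to its signature). PersonRecord, transform_persons and the in-memory directory are hypothetical stand-ins, not the package's models or an LDAP client.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PersonRecord:
    # Hypothetical stand-in for ExtractedPerson.
    full_name: str
    source: str


def transform_persons(
    names: list[str],
    lookup: Callable[[str], PersonRecord | None],
    fallback: Callable[[str], PersonRecord],
) -> list[PersonRecord]:
    """Use the directory match when there is one, else build a record locally."""
    records: list[PersonRecord] = []
    for name in names:
        matched = lookup(name)
        records.append(matched if matched is not None else fallback(name))
    return records


# Hypothetical in-memory directory standing in for an LDAP query.
directory = {"Doe, Jane": PersonRecord("Doe, Jane", "ldap")}
records = transform_persons(
    ["Doe, Jane", "Roe, Richard"],
    lookup=directory.get,
    fallback=lambda name: PersonRecord(name, "open-data"),
)
print([r.source for r in records])
```

Passing the lookup and the fallback as callables keeps the sketch independent of any directory service, so the control flow can be tried out without network access.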
- mex.extractors.open_data.transform.transform_open_data_persons_not_in_ldap(person: OpenDataCreatorsOrContributors, extracted_organization_rki: ExtractedOrganization, open_data_organization_ids_by_str: dict[str, MergedOrganizationIdentifier]) -> ExtractedPerson
 Create ExtractedPerson for a person not matched in LDAP.
- Parameters:
 person – Open Data person (creator or contributor)
extracted_organization_rki – ExtractedOrganization of RKI
open_data_organization_ids_by_str – dictionary with ID by affiliation name
- Returns:
 ExtractedPerson