mex.extractors.international_projects package
Subpackages
- mex.extractors.international_projects.models package
- Submodules
 - mex.extractors.international_projects.models.source module
  - InternationalProjectsSource
   - InternationalProjectsSource.activity1
   - InternationalProjectsSource.activity2
   - InternationalProjectsSource.additional_rki_units
   - InternationalProjectsSource.end_date
   - InternationalProjectsSource.full_project_name
   - InternationalProjectsSource.funding_program
   - InternationalProjectsSource.funding_source
   - InternationalProjectsSource.funding_type
   - InternationalProjectsSource.get_end_year()
   - InternationalProjectsSource.get_funding_sources()
   - InternationalProjectsSource.get_identifier_in_primary_source()
   - InternationalProjectsSource.get_partners()
   - InternationalProjectsSource.get_project_lead_persons()
   - InternationalProjectsSource.get_project_lead_rki_units()
   - InternationalProjectsSource.get_start_year()
   - InternationalProjectsSource.get_units()
   - InternationalProjectsSource.model_config
   - InternationalProjectsSource.partner_organization
   - InternationalProjectsSource.project_abbreviation
   - InternationalProjectsSource.project_lead_person
   - InternationalProjectsSource.project_lead_rki_unit
   - InternationalProjectsSource.rki_internal_project_number
   - InternationalProjectsSource.start_date
   - InternationalProjectsSource.topic1
   - InternationalProjectsSource.topic2
   - InternationalProjectsSource.website
 - Module contents
 
 
Submodules
mex.extractors.international_projects.extract module
- mex.extractors.international_projects.extract.extract_international_projects_funding_sources(international_projects_sources: Iterable[InternationalProjectsSource]) → dict[str, MergedOrganizationIdentifier]
 Search for and extract funding organizations from Wikidata.
- Parameters:
 international_projects_sources – Iterable of international-project sources
- Returns:
 Dict mapping organization label to MergedOrganizationIdentifier
- mex.extractors.international_projects.extract.extract_international_projects_partner_organizations(international_projects_sources: Iterable[InternationalProjectsSource]) → dict[str, MergedOrganizationIdentifier]
 Search for and extract partner organizations from Wikidata.
- Parameters:
 international_projects_sources – Iterable of international-project sources
- Returns:
 Dict mapping organization label to MergedOrganizationIdentifier
- mex.extractors.international_projects.extract.extract_international_projects_project_leaders(international_projects_sources: Iterable[InternationalProjectsSource]) → Generator[LDAPPersonWithQuery, None, None]
 Extract LDAP persons with their query string for project leaders.
- Parameters:
 international_projects_sources – international projects sources
- Returns:
 Generator for LDAP persons with query
- mex.extractors.international_projects.extract.extract_international_projects_source(row: pd.Series[Any]) → InternationalProjectsSource | None
 Extract one international projects source from a row of the Excel data.
- Parameters:
 row – pandas Series representing one source
- Returns:
 international projects source, or None
- mex.extractors.international_projects.extract.extract_international_projects_sources() → list[InternationalProjectsSource]
 Extract international projects sources by loading data from an MS Excel file.
- Returns:
 list of international projects sources
- mex.extractors.international_projects.extract.get_clean_organizations_names(organizations_str: str) → list[str]
 Get clean names for partner organizations.
- Parameters:
 organizations_str (str) – string containing all organization names
- Returns:
 list of clean organization names
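To illustrate what cleaning such a string can involve, the sketch below splits a cell value on separators, collapses whitespace and drops duplicates. The helper name `split_organization_names` and the choice of separators (semicolons and line breaks) are assumptions made for illustration; they are not taken from `get_clean_organizations_names`.

```python
import re


def split_organization_names(organizations_str: str) -> list[str]:
    """Split a raw cell value into trimmed, unique organization names.

    Hypothetical helper: the separators (";" and newlines) are assumed,
    not read from the extractor's source code.
    """
    names: list[str] = []
    for part in re.split(r"[;\n]", organizations_str):
        name = " ".join(part.split())  # trim and collapse inner whitespace
        if name and name not in names:  # skip empty parts and duplicates
            names.append(name)
    return names
```

The resulting names can then serve as lookup keys, for example when searching Wikidata for each partner organization.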
- mex.extractors.international_projects.extract.get_temporal_entity_from_cell(cell_value: Any) → TemporalEntity | YearMonthDay | None
 Try to extract a temporal entity from a cell.
- Parameters:
 cell_value – Value of a cell, could be int, string or datetime
- Returns:
 TemporalEntity, YearMonthDay, or None
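The sketch below shows one way a cell holding an int, a string or a datetime could be normalized. It is a stand-in under stated assumptions: the function name `parse_cell_date` is hypothetical, it returns a standard-library `date` rather than a `TemporalEntity`, and the accepted string formats are guesses rather than the formats handled by `get_temporal_entity_from_cell`.

```python
from datetime import date, datetime
from typing import Any, Optional


def parse_cell_date(cell_value: Any) -> Optional[date]:
    """Try to read a date from a spreadsheet cell (hypothetical helper)."""
    if isinstance(cell_value, datetime):
        return cell_value.date()
    if isinstance(cell_value, date):
        return cell_value
    if isinstance(cell_value, int) and not isinstance(cell_value, bool):
        # Assumption: a bare integer is a year; map it to January 1st.
        return date(cell_value, 1, 1) if 1 <= cell_value <= 9999 else None
    if isinstance(cell_value, str):
        # Assumption: ISO and German day-first formats are the ones in use.
        for fmt in ("%Y-%m-%d", "%d.%m.%Y"):
            try:
                return datetime.strptime(cell_value.strip(), fmt).date()
            except ValueError:
                continue
    return None
```

Unlike this sketch, a temporal type that records its own precision does not need to pad a bare year with a made-up month and day.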
mex.extractors.international_projects.main module
mex.extractors.international_projects.settings module
- class mex.extractors.international_projects.settings.InternationalProjectsSettings(*, file_path: AssetsPath = AssetsPath('raw-data/international-projects/international_projects.xlsx'), mapping_path: AssetsPath = AssetsPath('mappings/international-projects'))
 Bases: BaseModel
 Settings submodel definition for the international projects extractor.
- file_path: AssetsPath
 
- mapping_path: AssetsPath
 
- model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True, 'validate_default': True}
 Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.
mex.extractors.international_projects.transform module
- mex.extractors.international_projects.transform.get_or_create_partner_organization(partner_organization: list[str], extracted_organizations: dict[str, MergedOrganizationIdentifier]) → list[MergedOrganizationIdentifier]
 Get the merged IDs of partner organizations.
- Parameters:
 partner_organization – partner organizations from the source
extracted_organizations – merged organization identifiers extracted from Wikidata
- Returns:
 list of matched or created merged organization identifiers
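The get-or-create pattern named here can be sketched as a dictionary lookup with a fallback. In this sketch plain strings stand in for `MergedOrganizationIdentifier`, and both the function name `resolve_partner_ids` and the `create_identifier` callback are hypothetical; how the real function creates missing organizations is not documented in this reference.

```python
from typing import Callable


def resolve_partner_ids(
    partner_organization: list[str],
    extracted_organizations: dict[str, str],
    create_identifier: Callable[[str], str],
) -> list[str]:
    """Return one identifier per partner name, creating missing ones.

    Hypothetical sketch: identifiers are plain strings and creation is
    delegated to a caller-supplied callback.
    """
    resolved: list[str] = []
    for name in partner_organization:
        identifier = extracted_organizations.get(name)
        if identifier is None:
            # No match among the extracted organizations: create a new entry.
            identifier = create_identifier(name)
        resolved.append(identifier)
    return resolved
```

Passing the creation step in as a callback keeps the lookup logic testable without a backend.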
- mex.extractors.international_projects.transform.get_theme_for_activity_or_topic(theme: list[MappingField[list[Theme]]], activity1: str | None, activity2: str | None, topic1: str | None, topic2: str | None) → list[Theme]
 Get theme identifier for activities and topics.
- Parameters:
 theme – theme extracted from mapping
activity1 – activity 1 from the international-projects raw data file
activity2 – activity 2 from the international-projects raw data file
topic1 – topic 1 from the international-projects raw data file
topic2 – topic 2 from the international-projects raw data file
- Returns:
 Sorted list of Theme
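The idea of collecting themes for several optional raw values and returning them sorted can be sketched as follows. The flat `rules` dictionary is a simplification assumed for illustration; it does not reproduce the structure of `MappingField`, and `themes_for_values` is a hypothetical name.

```python
from typing import Optional


def themes_for_values(
    rules: dict[str, list[str]],
    *values: Optional[str],
) -> list[str]:
    """Collect the themes mapped to each given value, sorted and unique.

    Hypothetical sketch: `rules` maps a raw activity or topic string to
    theme names; values that are None or unmapped contribute nothing.
    """
    themes: set[str] = set()
    for value in values:
        if value is not None:
            themes.update(rules.get(value, []))
    return sorted(themes)
```

Using a set before sorting ensures that a theme shared by an activity and a topic appears only once in the result.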
- mex.extractors.international_projects.transform.transform_international_projects_source_to_extracted_activity(source: InternationalProjectsSource, international_projects_activity: ActivityMapping, person_stable_target_ids_by_query_string: dict[str, list[MergedPersonIdentifier]], unit_stable_target_id_by_synonym: dict[str, MergedOrganizationalUnitIdentifier], funding_sources_stable_target_id_by_query: dict[str, MergedOrganizationIdentifier], partner_organizations_stable_target_id_by_query: dict[str, MergedOrganizationIdentifier]) → ExtractedActivity | None
 Transform an international projects source to an extracted activity.
- Parameters:
 source – international projects source
international_projects_activity – activity mapping model with default values
person_stable_target_ids_by_query_string – Mapping from author query to person stable target ID
unit_stable_target_id_by_synonym – Mapping from unit acronyms and labels to unit stable target ID
funding_sources_stable_target_id_by_query – Mapping from funding sources to organization stable target ID
partner_organizations_stable_target_id_by_query – Mapping from partner orgs to their stable target ID
- Returns:
 ExtractedActivity or None if it was filtered out
- mex.extractors.international_projects.transform.transform_international_projects_sources_to_extracted_activities(international_projects_sources: Iterable[InternationalProjectsSource], international_projects_activity: ActivityMapping, person_stable_target_ids_by_query_string: dict[str, list[MergedPersonIdentifier]], unit_stable_target_id_by_synonym: dict[str, MergedOrganizationalUnitIdentifier], funding_sources_stable_target_id_by_query: dict[str, MergedOrganizationIdentifier], partner_organizations_stable_target_id_by_query: dict[str, MergedOrganizationIdentifier]) → Generator[ExtractedActivity, None, None]
 Transform international projects sources to extracted activities.
- Parameters:
 international_projects_sources – international projects sources
international_projects_activity – activity mapping model with default values
person_stable_target_ids_by_query_string – Mapping from author query to person stable target ID
unit_stable_target_id_by_synonym – Mapping from unit acronyms and labels to unit stable target ID
funding_sources_stable_target_id_by_query – Mapping from funding sources to organization stable target ID
partner_organizations_stable_target_id_by_query – Mapping from partner orgs to their stable target ID
- Returns:
 Generator for ExtractedActivity instances
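Since the single-source transform may return None for filtered-out sources while the plural variant yields only ExtractedActivity instances, one plausible relationship between the two is a generator that skips None results. The sketch below shows that pattern generically; `transform_all` and `transform_one` are hypothetical names, and whether the real function delegates in this way is an assumption.

```python
from collections.abc import Iterable, Iterator
from typing import Callable, Optional, TypeVar

S = TypeVar("S")
T = TypeVar("T")


def transform_all(
    sources: Iterable[S],
    transform_one: Callable[[S], Optional[T]],
) -> Iterator[T]:
    """Lazily transform each source and drop results that are None."""
    for source in sources:
        result = transform_one(source)
        if result is not None:  # None marks a filtered-out source
            yield result
```

Because the function is a generator, nothing is transformed until the caller iterates over the result.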