mex.extractors.publisher package

Submodules

mex.extractors.publisher.extract module

mex.extractors.publisher.extract.get_publishable_merged_items(*, entity_type: list[str] | None = None, referenced_identifier: list[str] | None = None, reference_field: str | None = None) → list[MergedAccessPlatform | MergedActivity | MergedBibliographicResource | MergedConsent | MergedContactPoint | MergedDistribution | MergedOrganization | MergedOrganizationalUnit | MergedPerson | MergedPrimarySource | MergedResource | MergedVariable | MergedVariableGroup]

Read publishable merged items from backend.

mex.extractors.publisher.fields module

mex.extractors.publisher.filter module

mex.extractors.publisher.filter.filter_persons_with_consent(person_items: list[MergedPerson], consent_items: list[MergedConsent]) → list[MergedPerson]

Filter person items for having consent.

Parameters:

person_items – list of persons
consent_items – list of consents

Returns:

list of persons for whom consent exists.
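A minimal sketch of such a consent filter, assuming it keeps the persons that have consent, as the function name suggests. The classes `PersonSketch` and `ConsentSketch` and the fields `data_subject` and `is_valid` are hypothetical stand-ins, not the actual `MergedPerson` and `MergedConsent` models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonSketch:
    """Hypothetical stand-in for MergedPerson."""

    identifier: str


@dataclass(frozen=True)
class ConsentSketch:
    """Hypothetical stand-in for MergedConsent."""

    data_subject: str  # assumed: identifier of the person the consent refers to
    is_valid: bool  # assumed flag; the real model may encode the status differently


def keep_persons_with_consent(
    person_items: list[PersonSketch],
    consent_items: list[ConsentSketch],
) -> list[PersonSketch]:
    """Return only the persons for whom a valid consent item exists."""
    consenting_ids = {c.data_subject for c in consent_items if c.is_valid}
    return [p for p in person_items if p.identifier in consenting_ids]


persons = [PersonSketch("p1"), PersonSketch("p2"), PersonSketch("p3")]
consents = [ConsentSketch("p1", True), ConsentSketch("p2", False)]
kept = keep_persons_with_consent(persons, consents)
print([p.identifier for p in kept])  # ['p1']
```

Collecting the consenting identifiers into a set first keeps the lookup per person constant-time, which matters once the lists grow.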

mex.extractors.publisher.main module

mex.extractors.publisher.settings module

class mex.extractors.publisher.settings.PublisherSettings(*, skip_entity_types: list[str] = ['MergedPrimarySource', 'MergedConsent'], allowed_person_primary_sources: list[str] = ['endnote'])

Bases: BaseModel

Settings submodel definition for the publishing pipeline.

allowed_person_primary_sources: list[str]

model_config: ClassVar[ConfigDict] = {'extra': 'ignore', 'populate_by_name': True, 'str_max_length': 100000, 'str_min_length': 1, 'str_strip_whitespace': True, 'use_enum_values': True, 'validate_assignment': True, 'validate_by_alias': True, 'validate_by_name': True, 'validate_default': True}

Configuration for the model, should be a dictionary conforming to pydantic.config.ConfigDict.

skip_entity_types: list[str]
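The following sketch illustrates how the documented default of `skip_entity_types` could be applied. It assumes the setting lists entity types to exclude from publishing, which this page does not state. `PublisherSettingsSketch` is a hypothetical dataclass stand-in for the real pydantic model, and `entity_types_to_publish` is a hypothetical helper.

```python
from dataclasses import dataclass, field


@dataclass
class PublisherSettingsSketch:
    """Hypothetical stand-in that mirrors the documented defaults."""

    skip_entity_types: list[str] = field(
        default_factory=lambda: ["MergedPrimarySource", "MergedConsent"]
    )
    allowed_person_primary_sources: list[str] = field(
        default_factory=lambda: ["endnote"]
    )


def entity_types_to_publish(
    all_types: list[str], settings: PublisherSettingsSketch
) -> list[str]:
    """Drop every entity type listed in skip_entity_types (assumed semantics)."""
    skipped = set(settings.skip_entity_types)
    return [t for t in all_types if t not in skipped]


settings = PublisherSettingsSketch()
all_types = ["MergedPerson", "MergedConsent", "MergedResource", "MergedPrimarySource"]
to_publish = entity_types_to_publish(all_types, settings)
print(to_publish)  # ['MergedPerson', 'MergedResource']
```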

mex.extractors.publisher.transform module

For each Person get their unit IDs if the unit has an email address.

Parameters:

publisher_merged_persons – Merged Persons with primary source ldap
publisher_contact_points_and_units – Items container of units + contact points

Returns:

dictionary of unit identifiers by person identifier
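The lookup described above can be sketched as follows. The classes `UnitSketch` and `LdapPersonSketch`, the fields `email` and `member_of`, and the function name `unit_ids_by_person_id` are hypothetical stand-ins. Whether persons without a matching unit appear with an empty list or are left out is also an assumption here.

```python
from dataclasses import dataclass, field


@dataclass
class UnitSketch:
    """Hypothetical stand-in for a merged organizational unit."""

    identifier: str
    email: list[str] = field(default_factory=list)


@dataclass
class LdapPersonSketch:
    """Hypothetical stand-in for a merged person."""

    identifier: str
    member_of: list[str] = field(default_factory=list)  # assumed: unit identifiers


def unit_ids_by_person_id(
    persons: list[LdapPersonSketch], units: list[UnitSketch]
) -> dict[str, list[str]]:
    """Map each person identifier to the units of that person that have an email."""
    units_with_email = {u.identifier for u in units if u.email}
    return {
        p.identifier: [uid for uid in p.member_of if uid in units_with_email]
        for p in persons
    }


units = [UnitSketch("u1", ["unit1@example.org"]), UnitSketch("u2")]
people = [LdapPersonSketch("p1", ["u1", "u2"]), LdapPersonSketch("p2", ["u2"])]
mapping = unit_ids_by_person_id(people, units)
print(mapping)  # {'p1': ['u1'], 'p2': []}
```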

Update references to actors, where needed.

We filter all fields that allow Person references so that they contain only references to publishable actors. For fields that also allow organizational units, non-consenting persons can be replaced by their organizational unit if the unit provides an email address. Fields that allow contact points but contain no valid references are set to a fallback contact point. If a field is required, does not allow contact points, and still contains no valid references, we keep the broken references to stay compliant with mex-model. If we skipped those items instead, we might break other items that rely on them and start a recursive de-publication process, which we want to avoid.
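The rules above can be sketched for a single reference field as follows. This is an illustration under stated assumptions, not the module's actual code: `FieldRules`, `update_actor_references`, and all parameter names are hypothetical, and the unit mapping is assumed to contain only units that provide an email address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRules:
    """Hypothetical description of what one reference field allows."""

    allows_units: bool
    allows_contact_points: bool
    required: bool


def update_actor_references(
    references: list[str],
    rules: FieldRules,
    publishable_ids: set[str],
    unit_ids_by_person: dict[str, list[str]],
    fallback_contact_point: str,
) -> list[str]:
    """Apply the documented rules to the references of one field."""
    updated: list[str] = []
    for ref in references:
        if ref in publishable_ids:
            # Publishable actors are kept unchanged.
            updated.append(ref)
        elif rules.allows_units:
            # Non-publishable persons are replaced by their units, if any.
            updated.extend(unit_ids_by_person.get(ref, []))
    # Remove duplicates while preserving the original order.
    updated = list(dict.fromkeys(updated))
    if updated:
        return updated
    if rules.allows_contact_points:
        return [fallback_contact_point]
    if rules.required:
        # Keep the broken references so the item stays model-compliant.
        return list(references)
    return []


publishable = {"p1"}
units_by_person = {"p2": ["u1"]}

with_units = update_actor_references(
    ["p1", "p2"], FieldRules(True, False, False), publishable, units_by_person, "cp0"
)
with_fallback = update_actor_references(
    ["p2"], FieldRules(False, True, False), publishable, units_by_person, "cp0"
)
kept_broken = update_actor_references(
    ["p2"], FieldRules(False, False, True), publishable, units_by_person, "cp0"
)
print(with_units)  # ['p1', 'u1']
print(with_fallback)  # ['cp0']
print(kept_broken)  # ['p2']
```

Returning the original references in the last required case mirrors the stated trade-off: a broken reference is preferred over dropping the item and triggering de-publication of the items that depend on it.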
