mex.artificial package

Submodules

mex.artificial.constants module

mex.artificial.helpers module

mex.artificial.helpers.create_artificial_extracted_items(locale: str | Sequence[str] | dict[str, int | float] | None = ['de_DE', 'en_US'], seed: int | float | str | bytes | bytearray | None = 0, chattiness: int = 10, stem_types: Sequence[str] = ['AccessPlatform', 'Activity', 'BibliographicResource', 'ContactPoint', 'Distribution', 'Organization', 'OrganizationalUnit', 'Person', 'Resource', 'Variable', 'VariableGroup'], count: int = 100) list[ExtractedAccessPlatform | ExtractedActivity | ExtractedBibliographicResource | ExtractedConsent | ExtractedContactPoint | ExtractedDistribution | ExtractedOrganization | ExtractedOrganizationalUnit | ExtractedPerson | ExtractedPrimarySource | ExtractedResource | ExtractedVariable | ExtractedVariableGroup]

Create a list of extracted items for the given settings.

mex.artificial.helpers.create_artificial_items_and_rule_sets(locale: str | Sequence[str] | dict[str, int | float] | None = ['de_DE', 'en_US'], seed: int | float | str | bytes | bytearray | None = 0, chattiness: int = 10, stem_types: Sequence[str] = ['AccessPlatform', 'Activity', 'BibliographicResource', 'ContactPoint', 'Distribution', 'Organization', 'OrganizationalUnit', 'Person', 'Resource', 'Variable', 'VariableGroup'], count: int = 100) list[ExtractedItemAndRuleSet]

Create a list of artificial extracted items and rule-sets.

mex.artificial.helpers.create_artificial_merged_items(locale: str | Sequence[str] | dict[str, int | float] | None = ['de_DE', 'en_US'], seed: int | float | str | bytes | bytearray | None = 0, chattiness: int = 10, stem_types: Sequence[str] = ['AccessPlatform', 'Activity', 'BibliographicResource', 'ContactPoint', 'Distribution', 'Organization', 'OrganizationalUnit', 'Person', 'Resource', 'Variable', 'VariableGroup'], count: int = 100) list[MergedAccessPlatform | MergedActivity | MergedBibliographicResource | MergedConsent | MergedContactPoint | MergedDistribution | MergedOrganization | MergedOrganizationalUnit | MergedPerson | MergedPrimarySource | MergedResource | MergedVariable | MergedVariableGroup]

Create a list of merged items for the given settings.

mex.artificial.helpers.create_faker(locale: str | Sequence[str] | dict[str, int | float] | None = ['de_DE', 'en_US'], seed: int | float | str | bytes | bytearray | None = 0, chattiness: int = 10) Faker

Create and initialize a new faker instance with the given settings.

mex.artificial.helpers.generate_artificial_extracted_items(locale: str | Sequence[str] | dict[str, int | float] | None = ['de_DE', 'en_US'], seed: int | float | str | bytes | bytearray | None = 0, chattiness: int = 10, stem_types: Sequence[str] = ['AccessPlatform', 'Activity', 'BibliographicResource', 'ContactPoint', 'Distribution', 'Organization', 'OrganizationalUnit', 'Person', 'Resource', 'Variable', 'VariableGroup']) Generator[ExtractedAccessPlatform | ExtractedActivity | ExtractedBibliographicResource | ExtractedConsent | ExtractedContactPoint | ExtractedDistribution | ExtractedOrganization | ExtractedOrganizationalUnit | ExtractedPerson | ExtractedPrimarySource | ExtractedResource | ExtractedVariable | ExtractedVariableGroup, None, None]

Infinitely generate extracted items for the given settings.
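Because the generate_* helpers are documented as infinite, a caller has to bound the stream. The following is a minimal standard-library sketch of that pattern. The names generate_items and create_items are hypothetical stand-ins, not functions of this package, and it is only an assumption that the create_* helpers relate to the generate_* helpers in a similar way.

```python
import itertools
import random
from collections.abc import Iterator


def generate_items(seed: int = 0) -> Iterator[dict[str, int]]:
    """Yield an endless, reproducible stream of toy items (hypothetical)."""
    rng = random.Random(seed)  # private RNG, so the same seed gives the same stream
    index = 0
    while True:
        yield {"index": index, "value": rng.randint(0, 999)}
        index += 1


def create_items(seed: int = 0, count: int = 100) -> list[dict[str, int]]:
    """Materialize only the first `count` items of the endless stream."""
    return list(itertools.islice(generate_items(seed), count))


first = create_items(seed=0, count=5)
second = create_items(seed=0, count=5)
```

With a fixed seed both calls produce the same five items, which is the property a seed parameter is usually meant to provide.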

mex.artificial.helpers.generate_artificial_items_and_rule_sets(locale: str | Sequence[str] | dict[str, int | float] | None = ['de_DE', 'en_US'], seed: int | float | str | bytes | bytearray | None = 0, chattiness: int = 10, stem_types: Sequence[str] = ['AccessPlatform', 'Activity', 'BibliographicResource', 'ContactPoint', 'Distribution', 'Organization', 'OrganizationalUnit', 'Person', 'Resource', 'Variable', 'VariableGroup']) Generator[ExtractedItemAndRuleSet, None, None]

Infinitely generate artificial extracted items and rule-sets.

mex.artificial.helpers.generate_artificial_merged_items(locale: str | Sequence[str] | dict[str, int | float] | None = ['de_DE', 'en_US'], seed: int | float | str | bytes | bytearray | None = 0, chattiness: int = 10, stem_types: Sequence[str] = ['AccessPlatform', 'Activity', 'BibliographicResource', 'ContactPoint', 'Distribution', 'Organization', 'OrganizationalUnit', 'Person', 'Resource', 'Variable', 'VariableGroup']) Generator[MergedAccessPlatform | MergedActivity | MergedBibliographicResource | MergedConsent | MergedContactPoint | MergedDistribution | MergedOrganization | MergedOrganizationalUnit | MergedPerson | MergedPrimarySource | MergedResource | MergedVariable | MergedVariableGroup, None, None]

Infinitely generate merged items for the given settings.

mex.artificial.helpers.write_merged_items(items: Iterable[MergedAccessPlatform | MergedActivity | MergedBibliographicResource | MergedConsent | MergedContactPoint | MergedDistribution | MergedOrganization | MergedOrganizationalUnit | MergedPerson | MergedPrimarySource | MergedResource | MergedVariable | MergedVariableGroup], count: int = 100, out_path: PathLike[str] | None = None) None

Write the desired number of items from the incoming stream to an NDJSON file.
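As an illustration of writing a bounded number of items from a possibly endless stream to NDJSON (one JSON object per line), here is a small standard-library sketch. The helper write_ndjson and the toy dictionaries are hypothetical; the package's own function works on merged item models and may serialize them differently.

```python
import itertools
import json
import tempfile
from collections.abc import Iterable
from pathlib import Path


def write_ndjson(items: Iterable[dict], count: int, out_path: Path) -> int:
    """Write at most `count` items as one JSON object per line; return how many."""
    written = 0
    with out_path.open("w", encoding="utf-8") as handle:
        # islice stops after `count` items, so an endless iterable is safe here
        for item in itertools.islice(items, count):
            handle.write(json.dumps(item, sort_keys=True) + "\n")
            written += 1
    return written


endless = ({"n": n} for n in itertools.count())
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "items.ndjson"
    written = write_ndjson(endless, count=3, out_path=target)
    lines = target.read_text(encoding="utf-8").splitlines()
```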

mex.artificial.main module

mex.artificial.main.artificial(count: Annotated[int, <typer.models.OptionInfo object>]=100, chattiness: Annotated[int, <typer.models.OptionInfo object>]=10, seed: Annotated[int, <typer.models.OptionInfo object>]=0, locale: Annotated[list[str] | None, <typer.models.OptionInfo object>]=None, models: Annotated[list[str] | None, <typer.models.OptionInfo object>]=None, path: Annotated[Path | None, <typer.models.OptionInfo object>]=None) None

Generate merged artificial items.

mex.artificial.main.main() None

Wrap entrypoint in typer.

mex.artificial.models module

class mex.artificial.models.ExtractedItemAndRuleSet(*, extracted_item: ExtractedAccessPlatform | ExtractedActivity | ExtractedBibliographicResource | ExtractedConsent | ExtractedContactPoint | ExtractedDistribution | ExtractedOrganization | ExtractedOrganizationalUnit | ExtractedPerson | ExtractedPrimarySource | ExtractedResource | ExtractedVariable | ExtractedVariableGroup | None = None, rule_set: AccessPlatformRuleSetResponse | ActivityRuleSetResponse | BibliographicResourceRuleSetResponse | ConsentRuleSetResponse | ContactPointRuleSetResponse | DistributionRuleSetResponse | OrganizationRuleSetResponse | OrganizationalUnitRuleSetResponse | PersonRuleSetResponse | PrimarySourceRuleSetResponse | ResourceRuleSetResponse | VariableRuleSetResponse | VariableGroupRuleSetResponse | None = None)

Bases: BaseModel

A combination of an extracted item and a rule-set (both optional).

extracted_item: ExtractedAccessPlatform | ExtractedActivity | ExtractedBibliographicResource | ExtractedConsent | ExtractedContactPoint | ExtractedDistribution | ExtractedOrganization | ExtractedOrganizationalUnit | ExtractedPerson | ExtractedPrimarySource | ExtractedResource | ExtractedVariable | ExtractedVariableGroup | None
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

rule_set: AccessPlatformRuleSetResponse | ActivityRuleSetResponse | BibliographicResourceRuleSetResponse | ConsentRuleSetResponse | ContactPointRuleSetResponse | DistributionRuleSetResponse | OrganizationRuleSetResponse | OrganizationalUnitRuleSetResponse | PersonRuleSetResponse | PrimarySourceRuleSetResponse | ResourceRuleSetResponse | VariableRuleSetResponse | VariableGroupRuleSetResponse | None

class mex.artificial.models.RandomFieldInfo(*, inner_type: Any, numerify_patterns: list[str] = [], regex_patterns: list[str] = [], examples: list[str | int | float | None | bool | dict[str, str | int | float | None | bool]] = [])

Bases: BaseModel

Randomized pick of matching inner type and patterns for a field.

examples: list[str | int | float | None | bool | dict[str, str | int | float | None | bool]]
inner_type: Any
model_config = {}

Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

numerify_patterns: list[str]
regex_patterns: list[str]

mex.artificial.provider module

class mex.artificial.provider.BuilderProvider(generator: Any)

Bases: Provider

Faker provider that deals with interpreting pydantic model fields.

additive_rule(stem_type: str, ids_by_type: Mapping[str, Collection[MergedAccessPlatformIdentifier | MergedActivityIdentifier | MergedBibliographicResourceIdentifier | MergedConsentIdentifier | MergedContactPointIdentifier | MergedDistributionIdentifier | MergedOrganizationalUnitIdentifier | MergedOrganizationIdentifier | MergedPersonIdentifier | MergedPrimarySourceIdentifier | MergedResourceIdentifier | MergedVariableGroupIdentifier | MergedVariableIdentifier]], *, value_probability: float = 0.33) AdditiveAccessPlatform | AdditiveActivity | AdditiveBibliographicResource | AdditiveConsent | AdditiveContactPoint | AdditiveDistribution | AdditiveOrganization | AdditiveOrganizationalUnit | AdditivePerson | AdditivePrimarySource | AdditiveResource | AdditiveVariable | AdditiveVariableGroup

Generate an artificial additive rule.

extracted_item(stem_types: Sequence[str], ids_by_type: Mapping[str, Collection[MergedAccessPlatformIdentifier | MergedActivityIdentifier | MergedBibliographicResourceIdentifier | MergedConsentIdentifier | MergedContactPointIdentifier | MergedDistributionIdentifier | MergedOrganizationalUnitIdentifier | MergedOrganizationIdentifier | MergedPersonIdentifier | MergedPrimarySourceIdentifier | MergedResourceIdentifier | MergedVariableGroupIdentifier | MergedVariableIdentifier]], *, _attempts_left: int = 10) ExtractedAccessPlatform | ExtractedActivity | ExtractedBibliographicResource | ExtractedConsent | ExtractedContactPoint | ExtractedDistribution | ExtractedOrganization | ExtractedOrganizationalUnit | ExtractedPerson | ExtractedPrimarySource | ExtractedResource | ExtractedVariable | ExtractedVariableGroup

Generate a single extracted item from the given stem types.

field_value(field: FieldInfo, ids_by_type: Mapping[str, Collection[MergedAccessPlatformIdentifier | MergedActivityIdentifier | MergedBibliographicResourceIdentifier | MergedConsentIdentifier | MergedContactPointIdentifier | MergedDistributionIdentifier | MergedOrganizationalUnitIdentifier | MergedOrganizationIdentifier | MergedPersonIdentifier | MergedPrimarySourceIdentifier | MergedResourceIdentifier | MergedVariableGroupIdentifier | MergedVariableIdentifier]]) Sequence[Any]

Get a list of artificial values for the given field and identity.

field_value_factory(field_info: RandomFieldInfo, ids_by_type: Mapping[str, Collection[MergedAccessPlatformIdentifier | MergedActivityIdentifier | MergedBibliographicResourceIdentifier | MergedConsentIdentifier | MergedContactPointIdentifier | MergedDistributionIdentifier | MergedOrganizationalUnitIdentifier | MergedOrganizationIdentifier | MergedPersonIdentifier | MergedPrimarySourceIdentifier | MergedResourceIdentifier | MergedVariableGroupIdentifier | MergedVariableIdentifier]]) Callable[[], Any]

Get a factory for creating a single value for the given field.

get_random_field_info(field: FieldInfo) RandomFieldInfo

Randomly pick a matching type and patterns for a given field.

min_max_for_field(field: FieldInfo) tuple[int, int]

Return a min and max item count for a field.

preventive_rule(extracted_item: ExtractedAccessPlatform | ExtractedActivity | ExtractedBibliographicResource | ExtractedConsent | ExtractedContactPoint | ExtractedDistribution | ExtractedOrganization | ExtractedOrganizationalUnit | ExtractedPerson | ExtractedPrimarySource | ExtractedResource | ExtractedVariable | ExtractedVariableGroup, *, value_probability: float = 0.33) PreventiveAccessPlatform | PreventiveActivity | PreventiveBibliographicResource | PreventiveConsent | PreventiveContactPoint | PreventiveDistribution | PreventiveOrganization | PreventiveOrganizationalUnit | PreventivePerson | PreventivePrimarySource | PreventiveResource | PreventiveVariable | PreventiveVariableGroup

Generate an artificial preventive rule.

rule_set_for_item(extracted_item: ExtractedAccessPlatform | ExtractedActivity | ExtractedBibliographicResource | ExtractedConsent | ExtractedContactPoint | ExtractedDistribution | ExtractedOrganization | ExtractedOrganizationalUnit | ExtractedPerson | ExtractedPrimarySource | ExtractedResource | ExtractedVariable | ExtractedVariableGroup, ids_by_type: Mapping[str, Collection[MergedAccessPlatformIdentifier | MergedActivityIdentifier | MergedBibliographicResourceIdentifier | MergedConsentIdentifier | MergedContactPointIdentifier | MergedDistributionIdentifier | MergedOrganizationalUnitIdentifier | MergedOrganizationIdentifier | MergedPersonIdentifier | MergedPrimarySourceIdentifier | MergedResourceIdentifier | MergedVariableGroupIdentifier | MergedVariableIdentifier]], *, value_probability: float = 0.33, _attempts_left: int = 10) AccessPlatformRuleSetResponse | ActivityRuleSetResponse | BibliographicResourceRuleSetResponse | ConsentRuleSetResponse | ContactPointRuleSetResponse | DistributionRuleSetResponse | OrganizationRuleSetResponse | OrganizationalUnitRuleSetResponse | PersonRuleSetResponse | PrimarySourceRuleSetResponse | ResourceRuleSetResponse | VariableRuleSetResponse | VariableGroupRuleSetResponse

Generate a single rule-set for the given extracted item.

standalone_rule_set(stem_types: Sequence[str], ids_by_type: Mapping[str, Collection[MergedAccessPlatformIdentifier | MergedActivityIdentifier | MergedBibliographicResourceIdentifier | MergedConsentIdentifier | MergedContactPointIdentifier | MergedDistributionIdentifier | MergedOrganizationalUnitIdentifier | MergedOrganizationIdentifier | MergedPersonIdentifier | MergedPrimarySourceIdentifier | MergedResourceIdentifier | MergedVariableGroupIdentifier | MergedVariableIdentifier]], identifier_seed: int = 0, *, value_probability: float = 0.33, _attempts_left: int = 10) AccessPlatformRuleSetResponse | ActivityRuleSetResponse | BibliographicResourceRuleSetResponse | ConsentRuleSetResponse | ContactPointRuleSetResponse | DistributionRuleSetResponse | OrganizationRuleSetResponse | OrganizationalUnitRuleSetResponse | PersonRuleSetResponse | PrimarySourceRuleSetResponse | ResourceRuleSetResponse | VariableRuleSetResponse | VariableGroupRuleSetResponse

Generate a single standalone rule-set.

subtractive_rule(extracted_item: ExtractedAccessPlatform | ExtractedActivity | ExtractedBibliographicResource | ExtractedConsent | ExtractedContactPoint | ExtractedDistribution | ExtractedOrganization | ExtractedOrganizationalUnit | ExtractedPerson | ExtractedPrimarySource | ExtractedResource | ExtractedVariable | ExtractedVariableGroup, *, value_probability: float = 0.33) SubtractiveAccessPlatform | SubtractiveActivity | SubtractiveBibliographicResource | SubtractiveConsent | SubtractiveContactPoint | SubtractiveDistribution | SubtractiveOrganization | SubtractiveOrganizationalUnit | SubtractivePerson | SubtractivePrimarySource | SubtractiveResource | SubtractiveVariable | SubtractiveVariableGroup

Generate an artificial subtractive rule.

class mex.artificial.provider.LinkProvider(generator: Any)

Bases: Provider, Provider

Faker provider that can return links with optional title and language.

Return a link with optional title and language.

class mex.artificial.provider.NumerifyPatternsProvider(generator: Any)

Bases: Provider

Faker provider that tries to numerify a pattern until it matches a regex.

numerify_patterns(numerify_patterns: list[str], regex_patterns: list[str]) str | None

Try up to 10 times to numerify a pattern until the result validates, or bail out.

class mex.artificial.provider.ReferenceProvider(generator: Any)

Bases: BaseProvider

Faker provider that creates references to other items.

reference(inner_type: type[Identifier], ids_by_type: Mapping[str, Collection[MergedAccessPlatformIdentifier | MergedActivityIdentifier | MergedBibliographicResourceIdentifier | MergedConsentIdentifier | MergedContactPointIdentifier | MergedDistributionIdentifier | MergedOrganizationalUnitIdentifier | MergedOrganizationIdentifier | MergedPersonIdentifier | MergedPrimarySourceIdentifier | MergedResourceIdentifier | MergedVariableGroupIdentifier | MergedVariableIdentifier]]) MergedAccessPlatformIdentifier | MergedActivityIdentifier | MergedBibliographicResourceIdentifier | MergedConsentIdentifier | MergedContactPointIdentifier | MergedDistributionIdentifier | MergedOrganizationalUnitIdentifier | MergedOrganizationIdentifier | MergedPersonIdentifier | MergedPrimarySourceIdentifier | MergedResourceIdentifier | MergedVariableGroupIdentifier | MergedVariableIdentifier | None

Return random merged item identifier picked from available mapping.

class mex.artificial.provider.TemporalEntityProvider(generator: Any)

Bases: Provider

Faker provider that can return a custom TemporalEntity with random precision.

temporal_entity(allowed_precision_levels: list[TemporalEntityPrecision]) TemporalEntity

Return a custom temporal entity with random date, time and precision.
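To illustrate what "random precision" can mean, the sketch below draws a random point in time and renders it at a randomly chosen precision. The precision names, the FORMATS table, and the date range are hypothetical choices for this example; the package's TemporalEntity and TemporalEntityPrecision types are not used here.

```python
import random
from datetime import datetime, timedelta

# Hypothetical precision levels mapped to strftime formats
FORMATS = {
    "year": "%Y",
    "month": "%Y-%m",
    "day": "%Y-%m-%d",
    "second": "%Y-%m-%dT%H:%M:%S",
}


def random_temporal_string(allowed: list[str], rng: random.Random) -> tuple[str, str]:
    """Return a (precision, text) pair for a random moment after 1970-01-01."""
    precision = rng.choice(allowed)
    offset = timedelta(seconds=rng.randrange(50 * 365 * 24 * 3600))  # about 50 years
    moment = datetime(1970, 1, 1) + offset
    return precision, moment.strftime(FORMATS[precision])


precision, text = random_temporal_string(["year", "month"], random.Random(0))
```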

class mex.artificial.provider.TextProvider(factory: Generator, chattiness: int)

Bases: Provider

Faker provider that handles custom text related requirements.

__init__(factory: Generator, chattiness: int) None

Configure the chattiness of generated text.

text_object() Text

Return a random text paragraph with an auto-detected language.

text_string() str

Return a randomized sequence of words as a string.

mex.artificial.types module

Module contents