mex.extractors.pipeline.checks package

Subpackages

Submodules

mex.extractors.pipeline.checks.main module

mex.extractors.pipeline.checks.main.check_historical_rule(rule_name: str, current_number_of_extracted_items: int, historic_count: int, rule: dict[str, Any]) bool

Check rules that compare current state with historical data.

Parameters:
  • rule_name – Name of the historical rule.

  • current_number_of_extracted_items – Current count of extracted items.

  • historic_count – Historical count for comparison.

  • rule – Rule configuration from YAML.

Returns True if check passes, False if check fails.
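A historical comparison of this kind could be sketched as follows. This is only an illustration, not the module's implementation: the function name `historical_rule_sketch` and the rule key `max_drop_percent` are hypothetical, since the actual keys of the YAML rule configuration are not documented here.

```python
from typing import Any


def historical_rule_sketch(
    current_number_of_extracted_items: int,
    historic_count: int,
    rule: dict[str, Any],
) -> bool:
    """Return True if the current count has not dropped too far below the historic count."""
    # "max_drop_percent" is a hypothetical key, assumed for this sketch only.
    max_drop_percent = rule["max_drop_percent"]
    if historic_count == 0:
        # Assumption: with no historic items there is nothing to compare against.
        return True
    drop_percent = (
        (historic_count - current_number_of_extracted_items) / historic_count * 100
    )
    return drop_percent <= max_drop_percent
```

Under these assumptions, a drop from 100 to 95 items passes a rule with `max_drop_percent: 10`, while a drop from 100 to 80 items fails it.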

mex.extractors.pipeline.checks.main.check_item_count_rule(context: AssetCheckExecutionContext, rule_name: str, asset_key: AssetKey, extractor: str, entity_type: str) bool

Check that extracted items comply with the given rule and threshold.

Parameters:
  • context – The Dagster asset execution context for this check.

  • rule_name – Name of the rule to check.

  • asset_key – Dagster AssetKey object.

  • extractor – Name of the extractor that produced the asset.

  • entity_type – Entity type for the asset check.

Returns True if check passes, raises ValueError if check fails.

mex.extractors.pipeline.checks.main.check_static_rule(rule_name: str, current_number_of_extracted_items: int, rule: dict[str, Any]) bool

Check rules that validate current state (no historical data needed).

Parameters:
  • rule_name – Name of the static rule.

  • current_number_of_extracted_items – Current count of extracted items.

  • rule – Rule configuration from YAML.

Returns True if check passes, False if check fails.
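A static check needs only the current count and the rule. The sketch below is an illustration under stated assumptions, not the module's implementation: the function name `static_rule_sketch` and the rule key `min_items` are hypothetical, since the actual rule configuration keys are not documented here.

```python
from typing import Any


def static_rule_sketch(
    current_number_of_extracted_items: int,
    rule: dict[str, Any],
) -> bool:
    """Return True if the current count meets a fixed minimum."""
    # "min_items" is a hypothetical key, assumed for this sketch only.
    # A missing key is treated as "no minimum", which is also an assumption.
    minimum = rule.get("min_items", 0)
    return current_number_of_extracted_items >= minimum
```

Under these assumptions, 50 extracted items pass a rule with `min_items: 10`, and 5 extracted items fail it.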

mex.extractors.pipeline.checks.main.get_historic_count(historic_events: dict[datetime, int], time_frame: datetime) int

Get the count for the closest timestamp at or before time_frame, or, if there is none, for the next closest timestamp after time_frame.

mex.extractors.pipeline.checks.main.get_historical_events(events: list[EventLogRecord]) dict[datetime, int]

Load all past events and convert them to a dict.

mex.extractors.pipeline.checks.main.get_rule(rule: str, extractor: str, entity_type: str) dict[str, Any]

Load rule model from YAML file for given rule type.

mex.extractors.pipeline.checks.main.load_asset_check_from_settings(extractor: str, entity_type: str) AssetCheck

Load AssetCheck model from YAML for a given extractor and entity type.

mex.extractors.pipeline.checks.main.parse_time_frame(time_frame: str) timedelta

Parse time frame string into timedelta.
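The accepted string format is not documented here. As a minimal sketch, assuming strings such as "7d", "12h", or "30m" (an integer followed by a unit letter), parsing into a timedelta could look like this. The name `parse_time_frame_sketch` and the unit letters are hypothetical.

```python
import re
from datetime import timedelta

# Assumed unit letters; the real function may accept a different format.
_UNITS = {"d": "days", "h": "hours", "m": "minutes"}


def parse_time_frame_sketch(time_frame: str) -> timedelta:
    """Parse strings like "7d" or "12h" into a timedelta."""
    match = re.fullmatch(r"(\d+)([dhm])", time_frame.strip())
    if match is None:
        raise ValueError(f"unsupported time frame: {time_frame!r}")
    value, unit = match.groups()
    # Map the unit letter to the matching timedelta keyword argument.
    return timedelta(**{_UNITS[unit]: int(value)})


print(parse_time_frame_sketch("7d"))  # 7 days, 0:00:00
```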

Module contents