CalistaTable

class calista.table.CalistaTable(engine: LazyEngine)

Bases: object

analyze(rule_name: str, rule: Condition) Metrics

Computes Metrics for a single named condition.

Args:

rule_name (str): The name of the rule.

rule (Condition): The Condition to evaluate.

Returns:

Metrics: The metrics resulting from the analysis.

Raises:

Any exceptions raised by the analyze_rules method.

analyze_rules(rules: Dict[str, Condition]) List[Metrics]

Computes a list of Metrics from a set of named rules.

Args:

rules (Dict[RuleName, Condition]): Mapping from each rule name to the Condition to execute.

Returns:

List[Metrics]: The metrics resulting from the analysis.

Raises:

Any exceptions raised by the engine’s execute_conditions method.
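The Raises notes above suggest that analyze hands a single rule to analyze_rules. The sketch below illustrates that relationship with a plain-Python stand-in. It does not import Calista: SketchTable, the callable conditions, and the metric field names ("rule", "total", "valid") are hypothetical and are not the library's actual types.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Condition = Callable[[Row], bool]  # hypothetical stand-in for Calista's Condition


class SketchTable:
    """Plain-Python stand-in used only to illustrate the method contracts."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows

    def analyze_rules(self, rules: Dict[str, Condition]) -> List[dict]:
        # One metrics record per named rule, in the order the rules were given.
        results = []
        for name, condition in rules.items():
            valid = sum(1 for row in self.rows if condition(row))
            results.append({"rule": name, "total": len(self.rows), "valid": valid})
        return results

    def analyze(self, rule_name: str, rule: Condition) -> dict:
        # Single-rule convenience: wrap the rule in a dict, then unwrap the result.
        return self.analyze_rules({rule_name: rule})[0]


def is_adult(row: Row) -> bool:
    return row["age"] >= 18


def is_senior(row: Row) -> bool:
    return row["age"] >= 65


table = SketchTable([{"age": 17}, {"age": 32}, {"age": 45}])
single = table.analyze("is_adult", is_adult)
batch = table.analyze_rules({"is_adult": is_adult, "is_senior": is_senior})
```

In this stand-in, any exception raised while evaluating a condition propagates unchanged through both methods, which mirrors the Raises descriptions above.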

apply_rule(rule: Condition, rule_name: str | None = None) DataFrameType

Returns the dataset with a new boolean column for the given rule.

Args:

rule (Condition): The Condition to execute.

rule_name (str, optional): Name of the rule. Defaults to None.

Returns:

DataFrameType: The dataset with the new column resulting from the analysis.

apply_rules(rules: Dict[str, Condition]) DataFrameType

Returns the dataset with one new boolean column for each rule.

Args:

rules (Dict[RuleName, Condition]): Mapping from each rule name to the Condition to execute.

Returns:

DataFrameType: The dataset with new columns resulting from the analysis.
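To make the shape of the result concrete, here is a minimal sketch of the "one boolean column per rule" idea using plain lists of dicts instead of a real DataFrameType. The function apply_rules_sketch and the callable conditions are hypothetical illustrations, not part of Calista.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Condition = Callable[[Row], bool]  # hypothetical stand-in for Calista's Condition


def apply_rules_sketch(rows: List[Row], rules: Dict[str, Condition]) -> List[Row]:
    # Copy each row and add one boolean entry per rule, keyed by the rule name.
    return [
        {**row, **{name: bool(condition(row)) for name, condition in rules.items()}}
        for row in rows
    ]


rows = [{"price": 5}, {"price": 50}]
flagged = apply_rules_sketch(
    rows,
    {
        "is_cheap": lambda row: row["price"] < 10,
        "is_positive": lambda row: row["price"] > 0,
    },
)
```

The original rows are left untouched in this sketch; whether the library returns a copy or a view depends on the underlying engine and is not stated here.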

filter(condition: Condition) CalistaTable

Filters rows using the given condition. Alias for where().

get_invalid_rows(rule: Condition) DataFrameType

Returns the rows of the dataset that do not satisfy the rule.

Args:

rule (Condition): The Condition to evaluate.

Returns:

DataFrameType: The dataset restricted to the rows where the rule is not satisfied.

get_valid_rows(rule: Condition) DataFrameType

Returns the rows of the dataset that satisfy the rule.

Args:

rule (Condition): The Condition to evaluate.

Returns:

DataFrameType: The dataset restricted to the rows where the rule is satisfied.
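get_valid_rows and get_invalid_rows select complementary subsets for the same rule. The sketch below shows that relationship on plain lists of dicts. The two helper functions are hypothetical stand-ins, and the sketch says nothing about how the real engines treat missing values.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Condition = Callable[[Row], bool]  # hypothetical stand-in for Calista's Condition


def get_valid_rows_sketch(rows: List[Row], rule: Condition) -> List[Row]:
    # Keep only the rows for which the rule holds.
    return [row for row in rows if rule(row)]


def get_invalid_rows_sketch(rows: List[Row], rule: Condition) -> List[Row]:
    # Keep only the rows for which the rule does not hold.
    return [row for row in rows if not rule(row)]


def has_email(row: Row) -> bool:
    return bool(row["email"])


rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "c@example.com"},
]
valid = get_valid_rows_sketch(rows, has_email)
invalid = get_invalid_rows_sketch(rows, has_email)
```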

group_by(*cols: str) GroupedTable

Groups the CalistaTable by the specified columns so that aggregation conditions can be evaluated on the groups. See GroupedTable for the functions available after calling group_by.

Args:

cols (str): Names of the columns to group by, passed as separate string arguments.

property schema: dict[str, str]

Returns the schema of the underlying dataset.

Returns:

Dict[ColumnName, PythonType]: Dict representing the schema of the underlying dataset.

show(n: int = 10) None

Prints the first n rows to the console.

Args:

n (int, optional): Number of rows to show. Defaults to 10.

where(condition: Condition) CalistaTable

Filters rows using the given condition.

filter() is an alias for where().

Args:

condition (Condition): The Condition used to select rows.

Returns:

CalistaTable: Filtered CalistaTable.
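Because where returns a table rather than a raw dataset, calls can be chained, and filter can be used interchangeably with it. The sketch below illustrates both points with a plain-Python stand-in. SketchTable and the callable conditions are hypothetical, and binding filter to where as a class attribute is one common way to define an alias, not necessarily how Calista does it.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Condition = Callable[[Row], bool]  # hypothetical stand-in for Calista's Condition


class SketchTable:
    """Plain-Python stand-in used only to illustrate chaining and aliasing."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self.rows: List[Row] = list(rows)

    def where(self, condition: Condition) -> "SketchTable":
        # Return a new table; the receiver is left unchanged.
        return SketchTable(row for row in self.rows if condition(row))

    # Alias: both names refer to the same function object.
    filter = where


table = SketchTable(
    [
        {"age": 17, "country": "FR"},
        {"age": 32, "country": "FR"},
        {"age": 45, "country": "DE"},
    ]
)
adults_in_fr = table.where(lambda row: row["age"] >= 18).filter(
    lambda row: row["country"] == "FR"
)
```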