CalistaTable
- class calista.table.CalistaTable(engine: LazyEngine)
Bases: object
- analyze(rule_name: str, rule: Condition) -> Metrics
Compute Metrics based on a condition.
- Args:
rule_name (str): The name of the rule.
rule (Condition): The Condition to evaluate.
- Returns:
Metrics: The metrics resulting from the analysis.
- Raises:
Any exceptions raised by the analyze_rules method.
- analyze_rules(rules: Dict[str, Condition]) -> List[Metrics]
Compute List[Metrics] based on rules.
- Args:
rules (dict[RuleName, Condition]): The names of the rules and the conditions to execute.
- Returns:
List[Metrics]: The metrics resulting from the analysis.
- Raises:
Any exceptions raised by the engine’s execute_conditions method.
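The Raises note on analyze suggests that it delegates to analyze_rules with a single named rule. The sketch below models that relationship in plain Python and does not use calista at all: ToyMetrics, toy_analyze_rules and toy_analyze are hypothetical names, rows are plain dicts, conditions are plain callables, and the ToyMetrics fields are assumptions rather than the real Metrics fields.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, object]
ToyCondition = Callable[[Row], bool]  # hypothetical stand-in for Condition


@dataclass
class ToyMetrics:
    # Hypothetical fields; the real Metrics class may expose different ones.
    rule: str
    total_row_count: int
    valid_row_count: int


def toy_analyze_rules(rows: List[Row], rules: Dict[str, ToyCondition]) -> List[ToyMetrics]:
    """Evaluate every named rule over all rows; return one metrics object per rule."""
    return [
        ToyMetrics(
            rule=name,
            total_row_count=len(rows),
            valid_row_count=sum(1 for row in rows if cond(row)),
        )
        for name, cond in rules.items()
    ]


def toy_analyze(rows: List[Row], rule_name: str, rule: ToyCondition) -> ToyMetrics:
    """Single-rule wrapper: delegate to the multi-rule function and unwrap the result."""
    return toy_analyze_rules(rows, {rule_name: rule})[0]


rows = [{"id": 1, "salary": 1200}, {"id": 2, "salary": 0}, {"id": 3, "salary": 3100}]
single = toy_analyze(rows, "salary_is_positive", lambda r: r["salary"] > 0)
many = toy_analyze_rules(
    rows,
    {
        "salary_is_positive": lambda r: r["salary"] > 0,
        "id_is_even": lambda r: r["id"] % 2 == 0,
    },
)
```

In this toy model, single equals the first element of many, which is the behaviour one would expect if analyze is a thin wrapper around analyze_rules.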
- apply_rule(rule: Condition, rule_name: str | None = None) -> DataFrameType
Returns the dataset with a new boolean column for the given rule.
- Args:
rule (Condition): The Condition to execute.
rule_name (str, optional): Name of the rule. Defaults to None.
- Returns:
DataFrameType: The dataset with the new column resulting from the analysis.
- apply_rules(rules: Dict[str, Condition]) -> DataFrameType
Returns the dataset with one new boolean column for each given rule.
- Args:
rules (Dict[RuleName, Condition]): The names of the rules and the conditions to execute.
- Returns:
DataFrameType: The dataset with new columns resulting from the analysis.
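To make "new columns of booleans" concrete, here is a minimal plain-Python sketch of the idea. It is not calista code: toy_apply_rules is a hypothetical helper, rows are dicts, conditions are callables, and naming each new column after its rule is an assumption.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def toy_apply_rules(rows: List[Row], rules: Dict[str, Callable[[Row], bool]]) -> List[Row]:
    """Return copies of the rows, each extended with one boolean column per rule."""
    return [
        {**row, **{name: bool(cond(row)) for name, cond in rules.items()}}
        for row in rows
    ]


rows = [{"id": 1, "salary": 1200}, {"id": 2, "salary": 0}, {"id": 3, "salary": 3100}]
flagged = toy_apply_rules(
    rows,
    {
        "salary_is_positive": lambda r: r["salary"] > 0,
        "id_is_even": lambda r: r["id"] % 2 == 0,
    },
)
```

Every input row is kept; only the extra boolean columns are added, and the original rows are left unchanged.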
- filter(condition: Condition) -> CalistaTable
Alias for where().
- get_invalid_rows(rule: Condition) -> DataFrameType
Returns the dataset filtered to the rows that do not satisfy the rule.
- Args:
rule (Condition): The Condition to evaluate.
- Returns:
DataFrameType: The dataset filtered with the rows where the rule is not satisfied.
- get_valid_rows(rule: Condition) -> DataFrameType
Returns the dataset filtered to the rows that satisfy the rule.
- Args:
rule (Condition): The Condition to evaluate.
- Returns:
DataFrameType: The dataset filtered with the rows where the rule is satisfied.
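get_valid_rows and get_invalid_rows read as complementary filters over the same rule. The plain-Python sketch below illustrates that reading. It is not calista code: toy_split_rows is a hypothetical helper, and it assumes the rule evaluates to a plain True or False for every row; how a real engine treats null values is not covered here.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


def toy_split_rows(rows: List[Row], rule: Callable[[Row], bool]) -> Tuple[List[Row], List[Row]]:
    """Split rows into those that satisfy the rule and those that do not."""
    valid = [row for row in rows if rule(row)]
    invalid = [row for row in rows if not rule(row)]
    return valid, invalid


rows = [{"id": 1, "salary": 1200}, {"id": 2, "salary": 0}, {"id": 3, "salary": 3100}]
valid_rows, invalid_rows = toy_split_rows(rows, lambda r: r["salary"] > 0)
```

Under that assumption, each row lands in exactly one of the two results, so their sizes add up to the size of the dataset.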
- group_by(*cols: str) -> GroupedTable
Groups the CalistaTable using the specified columns, so that aggregation conditions can be executed on them. See GroupedTable for all the functions available after calling group_by.
- Args:
cols (list, str): Columns to group by. Each element should be a column name (string).
- property schema: dict[str, str]
Returns the schema of the underlying dataset.
- Returns:
Dict[ColumnName, PythonType]: Dict representing the schema of the underlying dataset.
- show(n: int = 10) -> None
Prints the first n rows to the console.
- Args:
n (int, optional): Number of rows to show. Defaults to 10.
- where(condition: Condition) -> CalistaTable
Filters rows using the given condition. filter() is an alias for where().
- Args:
condition (Condition): The Condition used to filter rows.
- Returns:
CalistaTable: Filtered CalistaTable.
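Because where() returns a table rather than a dataframe, calls can be chained, and filter() can be used interchangeably. The sketch below shows one way such an alias can be modelled in plain Python. ToyTable is a hypothetical stand-in, not the calista implementation, and its conditions are plain callables instead of Condition objects.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class ToyTable:
    """Hypothetical in-memory table used only to illustrate where()/filter()."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = list(rows)

    def where(self, condition: Callable[[Row], bool]) -> "ToyTable":
        # Return a new table; the original table is left untouched.
        return ToyTable([row for row in self.rows if condition(row)])

    # Binding the same function under a second name makes filter an alias.
    filter = where


table = ToyTable(
    [{"id": 1, "salary": 1200}, {"id": 2, "salary": 0}, {"id": 3, "salary": 3100}]
)
result = table.where(lambda r: r["salary"] > 0).filter(lambda r: r["id"] > 1)
```

Here result holds only the row with id 3, while table still holds all three rows.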