CalistaEngine
- class calista.table.CalistaEngine(engine: str, config: Dict[str, Any] | None = None)
Bases: object
For now, you can execute data quality checks using the following engines or platforms: spark, pandas, polars, snowflake, bigquery.
- load(path: str | None = None, file_format: str | None = None, data: Dict[str, List] | None = None, table: str | None = None, schema: str | None = None, database: str | None = None, dataframe: Any | None = None, options: Dict[str, Any] | None = None) CalistaTable
Load data from a dataset into a CalistaTable.
- Parameters:
path ((str, optional)) – The path if you’re loading a file.
file_format ((str, optional)) – The format of the file (e.g., ‘csv’, ‘parquet’).
data ((dict, optional)) – A dictionary mapping column names to lists of values, containing the data of the table.
table ((str, optional)) – The name of the table if you’re not loading a file.
schema ((str, optional)) – The schema containing the table.
database ((str, optional)) – The database containing the table.
dataframe ((Any, optional)) – An existing dataframe.
options ((Dict[str, Any], optional)) – Additional options for reading the file (e.g., delimiter, header).
- Returns:
CalistaTable: The loaded table.
- Raises:
Any exceptions raised by the engine’s read_dataset method.
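load accepts several alternative sources (a file path, a dictionary, a database table, or an existing dataframe), and this reference does not state how it chooses among them. The following is a minimal sketch, assuming exactly one source should be supplied per call and that each source corresponds to one of the dedicated loaders documented below. `pick_loader` is a hypothetical helper for illustration only; it is not part of calista.

```python
from typing import Any, Dict, List, Optional


def pick_loader(
    path: Optional[str] = None,
    file_format: Optional[str] = None,
    data: Optional[Dict[str, List]] = None,
    table: Optional[str] = None,
    dataframe: Optional[Any] = None,
) -> str:
    """Return the name of the dedicated loader matching the supplied source.

    Hypothetical helper: calista's real dispatch logic is not documented here.
    """
    sources = {
        "load_from_path": path is not None,
        "load_from_dict": data is not None,
        "load_from_database": table is not None,
        "load_from_dataframe": dataframe is not None,
    }
    chosen = [name for name, given in sources.items() if given]
    if len(chosen) != 1:
        raise ValueError(f"expected exactly one data source, got {len(chosen)}")
    # load_from_path declares file_format as a required argument.
    if chosen[0] == "load_from_path" and file_format is None:
        raise ValueError("file_format is required when loading from a path")
    return chosen[0]


print(pick_loader(path="my_csv.csv", file_format="csv"))
print(pick_loader(data={"ID": [1, 2, 3, 4]}))
```

Checking the arguments up front like this can make a mistaken call fail early with a clear message rather than inside the engine's read_dataset method.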
- load_from_database(table: Any, schema: str | None = None, database: str | None = None) CalistaTable
Load data from a table into a CalistaTable.
- Parameters:
table ((str)) – The name of the table.
schema ((str, optional)) – The schema containing the table.
database ((str, optional)) – The database containing the table.
- Returns:
CalistaTable: The loaded table.
- Raises:
Any exceptions raised by the engine’s read_dataset method.
Example
>>> from calista import CalistaEngine
>>>
>>> calista_table = CalistaEngine(engine="snowflake").load_from_database(
...     database="my_database",
...     schema="my_schema",
...     table="my_table",
... )
- load_from_dataframe(dataframe: Any) CalistaTable
Load data from a dataframe into a CalistaTable.
- Parameters:
dataframe ((Any)) – An existing dataframe.
- Returns:
CalistaTable: The loaded table.
- Raises:
Any exceptions raised by the engine’s read_dataset method.
Example
>>> import pandas as pd
>>> from calista import CalistaEngine
>>>
>>> data = {'col_1': [3, 2, 1, 0], 'col_2': ['a', 'b', 'c', 'd']}
>>> df = pd.DataFrame.from_dict(data)
>>> calista_table = CalistaEngine(engine="pandas").load_from_dataframe(df)
>>> calista_table.show()
   col_1 col_2
0      3     a
1      2     b
2      1     c
3      0     d
- load_from_dict(data: Dict[str, List]) CalistaTable
Load data from a dictionary into a CalistaTable.
- Parameters:
data ((dict)) – The dictionary containing the data of the table.
- Returns:
CalistaTable: The loaded table.
- Raises:
Any exceptions raised by the engine’s read_dataset method.
Example
>>> from calista import CalistaEngine
>>>
>>> calista_table = CalistaEngine(engine="spark").load_from_dict({"ID": [1, 2, 3, 4]})
>>> calista_table.show()
+---+
| ID|
+---+
|  1|
|  2|
|  3|
|  4|
+---+
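The examples suggest that data maps each column name to a list of that column's values. Assuming the engine expects every column to have the same number of rows (pandas, for one, rejects lists of unequal length), a small pre-check can catch malformed input before it reaches the engine. `check_columns` is a hypothetical helper, not part of calista.

```python
from typing import Dict, List


def check_columns(data: Dict[str, List]) -> int:
    """Return the common row count, or raise if the columns differ in length."""
    lengths = {name: len(values) for name, values in data.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"columns have different lengths: {lengths}")
    # An empty dictionary has no columns and therefore no rows.
    return next(iter(lengths.values()), 0)


rows = check_columns({"ID": [1, 2, 3, 4], "NAME": ["a", "b", "c", "d"]})
print(rows)
```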
- load_from_path(path: str, file_format: str, options: Dict[str, Any] | None = None) CalistaTable
Load data from a path into a CalistaTable.
- Parameters:
path ((str)) – The path of the file containing your table.
file_format ((str)) – The format of the file (e.g., ‘csv’, ‘parquet’).
options ((Dict[str, Any], optional)) – Additional options for reading the file (e.g., delimiter, header).
- Returns:
CalistaTable: The loaded table.
- Raises:
Any exceptions raised by the engine’s read_dataset method.
Example
>>> from calista import CalistaEngine
>>>
>>> csv_options = {
...     "delimiter": ",",
...     "header": "True"
... }
>>> calista_table = CalistaEngine(engine="spark").load_from_path(path='my_csv.csv', file_format="csv", options=csv_options)
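To try load_from_path locally, a file matching the options is needed first. The sketch below writes a small comma-delimited file with a header row using only the standard library, then wraps the documented call in a function. The column names, the sample rows, and the helpers `write_sample_csv` and `load_sample` are assumptions made for illustration. `load_sample` is defined but not called here, since it requires calista and a working Spark installation.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_csv(directory: Path) -> Path:
    """Write a small comma-delimited file with a header row and return its path."""
    path = directory / "my_csv.csv"
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle, delimiter=",")
        writer.writerow(["ID", "NAME"])  # header row, matching "header": "True"
        writer.writerows([[1, "a"], [2, "b"], [3, "c"]])
    return path


def load_sample(path: Path):
    """Load the file with calista; requires calista and a Spark installation."""
    from calista import CalistaEngine  # imported lazily so the rest runs without it

    csv_options = {"delimiter": ",", "header": "True"}
    return CalistaEngine(engine="spark").load_from_path(
        path=str(path), file_format="csv", options=csv_options
    )


with tempfile.TemporaryDirectory() as tmp:
    sample = write_sample_csv(Path(tmp))
    print(sample.read_text())
```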