kaskada.table

Module Contents

Classes

TableSource

PulsarTableSource

Functions

get_table_name(table)

Gets the table name from a table protobuf, a create table response, a get table response, or a string.

list_tables([search, client])

Lists all tables the user has access to

get_table(table[, client])

Gets a table by name

create_table(table_name, time_column_name, ...[, ...])

Creates a table

delete_table(table[, client, force])

Deletes a table referenced by name

load(table_name, file[, client])

Loads a local file to a table. The type of file is inferred from the extension.

load_dataframe(table_name, dataframe[, client, engine])

Loads a dataframe to a table.

Attributes

logger

logger

class TableSource

Bases: object

class PulsarTableSource(broker_service_url, admin_service_url, auth_plugin, auth_params, tenant, namespace, topic_name)

Bases: TableSource

Parameters:
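  • broker_service_url (str) – The Pulsar broker service URL

  • admin_service_url (str) – The Pulsar admin service URL

  • auth_plugin (str) – The Pulsar authentication plugin

  • auth_params (str) – The parameters passed to the authentication plugin

  • tenant (str) – The Pulsar tenant

  • namespace (str) – The Pulsar namespace

  • topic_name (str) – The Pulsar topic name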

get_table_name(table)

Gets the table name from a table protobuf, a create table response, a get table response, or a string.

Parameters:

table (Union[table_pb.Table, table_pb.CreateTableResponse, table_pb.GetTableResponse, str]) – The target table object

Returns:

The table name, or None if the input matches none of the supported types

Return type:

Optional[str]
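The type dispatch described above can be illustrated with a minimal sketch. The dataclasses below are hypothetical stand-ins for the table_pb messages, and the field names (table_name, table) are assumptions made for illustration; this is not the library's implementation.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-ins for table_pb.Table, table_pb.CreateTableResponse,
# and table_pb.GetTableResponse. The field names are assumptions.
@dataclass
class Table:
    table_name: str


@dataclass
class CreateTableResponse:
    table: Table


@dataclass
class GetTableResponse:
    table: Table


def resolve_table_name(table) -> Optional[str]:
    """Return the table name for any supported input, or None if nothing matches."""
    if isinstance(table, str):
        # A plain string is already the table name.
        return table
    if isinstance(table, Table):
        return table.table_name
    if isinstance(table, (CreateTableResponse, GetTableResponse)):
        # Both response types wrap a table message.
        return table.table.table_name
    return None
```

Accepting all four input shapes lets callers pass the result of create_table or get_table straight into later calls without unpacking it first.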

list_tables(search=None, client=None)

Lists all tables the user has access to

Parameters:
  • search (str, optional) – The search parameter to filter list by. Defaults to None.

  • client (Client, optional) – The Kaskada Client. Defaults to kaskada.KASKADA_DEFAULT_CLIENT.

Returns:

Response from the API

Return type:

table_pb.ListTablesResponse

get_table(table, client=None)

Gets a table by name

Parameters:
  • table (Union[Table, CreateTableResponse, GetTableResponse, str]) – The target table object

  • client (Client, optional) – The Kaskada Client. Defaults to kaskada.KASKADA_DEFAULT_CLIENT.

Returns:

Response from the API

Return type:

table_pb.GetTableResponse

create_table(table_name, time_column_name, entity_key_column_name, subsort_column_name=None, grouping_id=None, source=None, client=None)

Creates a table

Parameters:
  • table_name (str) – The name of the table

  • time_column_name (str) – The time column

  • entity_key_column_name (str) – The entity key column

  • subsort_column_name (str, optional) – The subsort column. Defaults to None and Kaskada will generate a subsort column for the data.

  • grouping_id (str, optional) – The grouping id. Defaults to None.

  • source (TableSource, optional) – A configurable table source. Defaults to None.

  • client (Client, optional) – The Kaskada Client. Defaults to kaskada.KASKADA_DEFAULT_CLIENT.

Returns:

Response from the API

Return type:

table_pb.CreateTableResponse
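A minimal usage sketch of the create_table signature follows. The table and column names ("Purchases", "purchase_time", "customer_id") are illustrative assumptions, and the function receives the API as a table_api argument so the example runs without a Kaskada service; in real use the caller would presumably pass the kaskada.table module.

```python
def create_purchases_table(table_api, client=None):
    """Create a hypothetical "Purchases" table through the given table API.

    table_api is expected to expose create_table with the signature
    documented above (for example, the kaskada.table module).
    """
    return table_api.create_table(
        table_name="Purchases",                 # illustrative name
        time_column_name="purchase_time",       # illustrative time column
        entity_key_column_name="customer_id",   # illustrative entity key column
        # subsort_column_name, grouping_id, and source keep their None defaults.
        client=client,  # None falls back to the default client
    )
```

Only the three required arguments are set here; the optional ones are left at their documented defaults.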

delete_table(table, client=None, force=False)

Deletes a table referenced by name

Parameters:
  • table (Union[Table, CreateTableResponse, GetTableResponse, str]) – The target table object

  • client (Client, optional) – The Kaskada Client. Defaults to kaskada.KASKADA_DEFAULT_CLIENT.

  • force (bool, optional) – Whether to force the deletion. Defaults to False.

Returns:

Response from the API

Return type:

table_pb.DeleteTableResponse

load(table_name, file, client=None)

Loads a local file to a table. The type of file is inferred from the extension.

Parameters:
  • table_name (str) – The name of the target table

  • file (str) – The path to a local file (absolute or relative), or an S3 / GCS / Azure Blob URI

  • client (Optional[Client], optional) – The Kaskada Client. Defaults to None.

Returns:

Response from the API

Return type:

table_pb.LoadDataResponse
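Inferring the file type from the extension could look roughly like the sketch below. The helper name infer_file_type and the extension mapping (Parquet and CSV) are assumptions for illustration; the formats the service actually accepts may differ.

```python
import os
from urllib.parse import urlparse

# Assumed mapping from extension to file type, for illustration only.
SUPPORTED_EXTENSIONS = {".parquet": "parquet", ".csv": "csv"}


def infer_file_type(file: str) -> str:
    """Infer the file type of a local path or object-store URI from its extension."""
    # For URIs such as s3://bucket/key.parquet, only the path part carries the extension.
    path = urlparse(file).path if "://" in file else file
    extension = os.path.splitext(path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Cannot infer a supported file type from {file!r}")
    return SUPPORTED_EXTENSIONS[extension]
```

Because the type comes only from the extension, a file with a missing or misleading extension would need to be renamed before loading.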

load_dataframe(table_name, dataframe, client=None, engine='pyarrow')

Loads a dataframe to a table.

This first converts the dataframe to a Parquet file and then loads that file. If the dataframe was itself read from a Parquet file (or another supported format), load that file directly with load() instead.

Parameters:
  • table_name (str) – The name of the target table

  • dataframe (pd.DataFrame) – The target dataframe to load

  • client (Optional[Client], optional) – The Kaskada Client. Defaults to None.

  • engine (str, optional) – The engine used to convert the dataframe to Parquet. Defaults to 'pyarrow'.

Returns:

Response from the API.

Return type:

table_pb.LoadDataResponse
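The convert-then-load behavior described for load_dataframe can be sketched as follows. This is an illustration under assumptions, not the library's code: load_dataframe_sketch is a hypothetical name, and load_fn stands in for kaskada.table.load so the example runs without a Kaskada service. The only pandas call relied on is DataFrame.to_parquet(path, engine=...).

```python
import os
import tempfile


def load_dataframe_sketch(table_name, dataframe, load_fn, engine="pyarrow"):
    """Write the dataframe to a temporary Parquet file, then load that file.

    load_fn stands in for kaskada.table.load(table_name, file).
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, f"{table_name}.parquet")
        # pandas.DataFrame.to_parquet accepts a target path and an engine name.
        dataframe.to_parquet(path, engine=engine)
        # The file must still exist while it is being loaded, so the call
        # happens inside the temporary-directory context.
        return load_fn(table_name, path)
```

The extra write to disk is the reason loading an existing Parquet file directly with load() avoids unnecessary work.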