perspective

#  ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
#  ┃ ██████ ██████ ██████       █      █      █      █      █ █▄  ▀███ █       ┃
#  ┃ ▄▄▄▄▄█ █▄▄▄▄▄ ▄▄▄▄▄█  ▀▀▀▀▀█▀▀▀▀▀ █ ▀▀▀▀▀█ ████████▌▐███ ███▄  ▀█ █ ▀▀▀▀▀ ┃
#  ┃ █▀▀▀▀▀ █▀▀▀▀▀ █▀██▀▀ ▄▄▄▄▄ █ ▄▄▄▄▄█ ▄▄▄▄▄█ ████████▌▐███ █████▄   █ ▄▄▄▄▄ ┃
#  ┃ █      ██████ █  ▀█▄       █ ██████      █      ███▌▐███ ███████▄ █       ┃
#  ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
#  ┃ Copyright (c) 2017, the Perspective Authors.                              ┃
#  ┃ ╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌ ┃
#  ┃ This file is part of the Perspective library, distributed under the terms ┃
#  ┃ of the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0). ┃
#  ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

__version__ = "3.7.2"
__all__ = [
    "_jupyter_labextension_paths",
    "Server",
    "Client",
    "Table",
    "View",
    "PerspectiveError",
    "ProxySession",
    "AsyncClient",
    "AsyncServer",
    "num_cpus",
    "set_num_cpus",
]

import functools

from .perspective import (
    Client,
    PerspectiveError,
    ProxySession,
    Server,
    AsyncServer,
    AsyncClient,
    # NOTE: these are classes without constructors,
    # so we import them just for type hinting
    Table,  # noqa: F401
    View,  # noqa: F401
    num_cpus,
    set_num_cpus,
)


# A process-wide Server and a local Client bound to it; the module-level
# helpers below delegate to this Client.
GLOBAL_SERVER = Server()
GLOBAL_CLIENT = GLOBAL_SERVER.new_local_client()


@functools.wraps(Client.table)
def table(*args, **kwargs):
    return GLOBAL_CLIENT.table(*args, **kwargs)


@functools.wraps(Client.open_table)
def open_table(*args, **kwargs):
    # Delegate to open_table (not table) so an existing Table is opened by name.
    return GLOBAL_CLIENT.open_table(*args, **kwargs)


@functools.wraps(Client.get_hosted_table_names)
def get_hosted_table_names(*args, **kwargs):
    return GLOBAL_CLIENT.get_hosted_table_names(*args, **kwargs)


def _jupyter_labextension_paths():
    """
    Read by `jupyter labextension develop`
    @private
    """
    return [{"src": "labextension", "dest": "@finos/perspective-jupyterlab"}]

class Server:

An instance of a Perspective server. Each Server instance is separate, and does not share Table (or other) data with other Servers.

def new_local_client(self, /):

Create a new Client instance bound to this Server directly.

def new_session(self, /, response_cb):

Create a new Session bound to this Server.

Server.new_session only needs to be called if you've implemented a custom Perspective Client/Server transport.

def poll(self, /):

Flush pending updates to this Server, including notifications to View.on_update callbacks.

Server.poll only needs to be called if you've implemented a custom Perspective Client/Server transport and provided the on_poll_request constructor keyword argument.

class Client:

An instance of a Client is a connection to a single Server, whether locally in-memory or remote over some transport like a WebSocket.

Client and Perspective objects derived from it have _synchronous_ APIs, suitable for use in a repl or script context where this is the _only_ Client connected to its Server. If you want to integrate with a Web framework or otherwise connect multiple clients, use AsyncClient.

def from_server(server):

Create a new Client instance bound to a specific in-process Server (e.g. generally _not_ the global Server).

def handle_response(self, /, response):

Handle a message from the external message queue. Client.handle_response is part of the low-level message-handling API necessary to implement new transports for a Client connection to a local-or-remote Server, and doesn't generally need to be called directly by "users" of a Client once connected.

def table(self, /, input, limit=None, index=None, name=None, format=None):

Creates a new Table from either a _schema_ or _data_.

The Client.table factory function can be initialized with either a _schema_ (see Table.schema), or data in one of these formats:

  • Apache Arrow
  • CSV
  • JSON row-oriented
  • JSON column-oriented
  • NDJSON
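
As a rough illustration of how the text formats in this list relate to each other, the sketch below encodes the same two rows as row-oriented JSON, column-oriented JSON, NDJSON and CSV using only the Python standard library. The sample rows and variable names are made up for this example, and Apache Arrow is left out because it needs a third-party library.

```python
import csv
import io
import json

# Hypothetical sample data: two rows with columns "x" and "y".
rows = [{"x": 1, "y": "a"}, {"x": 2, "y": "b"}]

# JSON row-oriented: a list of objects, one per row.
json_rows = json.dumps(rows)

# JSON column-oriented: one object mapping each column name to its values.
columns = {name: [row[name] for row in rows] for name in rows[0]}
json_columns = json.dumps(columns)

# NDJSON: one JSON object per line, with no enclosing list.
ndjson = "\n".join(json.dumps(row) for row in rows)

# CSV: a header line followed by one line per row.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]), lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

print(json_rows)
print(json_columns)
print(ndjson)
print(csv_text)
```

Any of these strings could then be handed to Client.table. When the input is a string rather than a native Python object, the format argument described under Arguments is presumably what tells Perspective which encoding to expect.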

When instantiated with _data_, the schema is inferred from this data. While this is convenient, inference is sometimes imperfect, e.g. when the input is empty, null or ambiguous. For these cases, Client.table can first be instantiated with an explicit schema.

When instantiated with a _schema_, the resulting Table is empty but with known column names and column types. When subsequently populated with Table.update, these columns will be _coerced_ to the schema's type. This behavior can be useful when Client.table's column type inference doesn't work.

The resulting Table is _virtual_, and invoking its methods dispatches events to the perspective_server::Server this Client connects to, where the data is stored and all calculation occurs.

Arguments

  • input - Either _schema_ or initialization _data_.
  • options - Optional configuration which provides one of:
    • limit - The max number of rows the resulting Table can store.
    • index - The column name to use as an _index_ column. If this Table is being instantiated by _data_, this column name must be present in the data.
    • name - The name of the table. This will be generated if it is not provided.
    • format - The explicit format of the input data, can be one of "json", "columns", "csv" or "arrow". This overrides language-specific type dispatch behavior, which allows stringified and byte array alternative inputs.

Python Examples

Load a CSV from a str:

table = client.table("x,y\n1,2\n3,4")
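
The schema-first path described above can be sketched in a similar way. This is a minimal sketch that assumes a schema is passed as a dict mapping column names to the type-name strings listed under Table.schema; the helper name make_typed_table, the column names and the table name are hypothetical.

```python
def make_typed_table(client, rows):
    """Create an empty, typed Table from a schema, then populate it.

    `client` is assumed to be a perspective Client (or the perspective
    module itself, which exposes a module-level table() function).
    """
    # Assumed schema format: column name -> type name string.
    schema = {"id": "integer", "label": "string"}
    table = client.table(schema, index="id", name="typed_table")
    # Values in `rows` are coerced to the schema's types on update.
    table.update(rows)
    return table


# Usage (requires perspective-python to be installed):
#   import perspective
#   table = make_typed_table(perspective, [{"id": 1, "label": "a"}])
#   print(table.schema())
```

Declaring the schema up front avoids relying on inference when the first batch of data is empty or ambiguous.
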
def open_table(self, /, name):

Opens a Table that is hosted on the perspective_server::Server that is connected to this Client.

The name property of TableInitOptions is used to identify each Table. Table names can be looked up for each Client via Client.get_hosted_table_names.

Python Examples

table = client.open_table("table_one")

def get_hosted_table_names(self, /):

Retrieves the names of all tables that this client has access to.

name is a string identifier unique to the Table (per Client), which can be used in conjunction with Client.open_table to get a Table instance without the use of Client.table constructor directly (e.g., one created by another Client).

Python Examples

tables = client.get_hosted_table_names()
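
Client.get_hosted_table_names and Client.open_table can be combined into an "open it if it exists, otherwise create it" helper. The sketch below uses only the three Client methods documented here; the helper name open_or_create is hypothetical, and the check-then-create sequence is not atomic if several clients race to create the same name.

```python
def open_or_create(client, name, data):
    """Return the hosted Table called `name`, creating it from `data` if absent."""
    if name in client.get_hosted_table_names():
        # Another Client (or an earlier call) already created this Table.
        return client.open_table(name)
    # No Table with this name yet: create one and register it under `name`.
    return client.table(data, name=name)


# Usage (requires perspective-python to be installed):
#   import perspective
#   table = open_or_create(perspective, "table_one", "x,y\n1,2\n3,4")
```
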
def on_hosted_tables_update(self, /, callback):

Register a callback which is invoked whenever Client.table (on this Client) or Table.delete (on a Table belonging to this Client) is called.

def remove_hosted_tables_update(self, /, callback_id):

Remove a callback previously registered via Client.on_hosted_tables_update.

def terminate(self, /):

Terminates this Client, cleaning up any View handles the Client has open as well as its callbacks.

class Table:
def get_index(self, /):

Returns the name of the index column for the table.

Python Examples

table = perspective.table("x,y\n1,2\n3,4", index="x")
index = table.get_index()

def get_client(self, /):

Get a copy of the Client this Table came from.

def get_limit(self, /):

Returns the user-specified row limit for this table.

def get_name(self, /):

Returns the user-specified name for this table, or the auto-generated name if a name was not specified when the table was created.

def clear(self, /):

Removes all the rows in the Table, but preserves everything else including the schema, index, and any callbacks or registered View instances.

Calling Table.clear, like Table.update and Table.remove, will trigger an update event to any registered listeners via View.on_update.

def columns(self, /):

Returns the column names of this Table in "natural" order (the ordering implied by the input format).

Python Examples

columns = table.columns()
def delete(self, /, lazy=False):

Deletes this Table and cleans up associated resources.

Tables do not stop consuming resources or processing updates when they are garbage collected in their host language - you must call this method to reclaim these.

Arguments

  • options - An options dictionary.
    • lazy - Whether to delete this Table _lazily_. When false (the default), the delete will occur immediately, assuming it has no View instances registered to it (which must be deleted first, otherwise this method will throw an error). When true, the Table will only be marked for deletion once its View dependency count reaches 0.

Python Examples

table = client.table("x,y\n1,2\n3,4")

# ...

table.delete(lazy=True)
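
The ordering implied by the lazy option can be sketched as a small helper. It is a sketch under the semantics described above, using only Table.delete and View.delete; the helper name retire_table is hypothetical.

```python
def retire_table(table, views):
    """Mark `table` for deletion, then release the Views that keep it alive."""
    # With lazy=True the call is not expected to fail while Views still exist.
    table.delete(lazy=True)
    for view in views:
        # Each delete lowers the Table's View dependency count; per the
        # description above, the Table goes away once the count reaches 0.
        view.delete()


# Usage (requires a Table and its Views from perspective-python):
#   retire_table(table, [view_a, view_b])
```

With the default lazy=False the order would have to be reversed: delete every View first, then the Table.
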
def make_port(self, /):

Create a unique channel ID on this Table, which allows View.on_update callback calls to be associated with the Table.update which caused them.

def on_delete(self, /, callback):

Register a callback which is called exactly once, when this Table is deleted with the Table.delete method.

Table.on_delete resolves when the subscription message is sent, not when the _delete_ event occurs.

def remove(self, /, input, format=None):
def remove_delete(self, /, callback_id):

Removes a listener with a given ID, as returned by a previous call to Table.on_delete.

def schema(self, /):

Returns a table's Schema, a mapping of column names to column types.

The mapping of a Table's column names to data types is referred to as a Schema. Each column has a unique name and a data type, one of:

  • "boolean" - A boolean type
  • "date" - A timezone-agnostic date type (month/day/year)
  • "datetime" - A millisecond-precision datetime type in the UTC timezone
  • "float" - A 64 bit float
  • "integer" - A signed 32 bit integer (the integer type supported by JavaScript)
  • "string" - A String data type (encoded internally as a _dictionary_)

Note that all Table columns are _nullable_, regardless of the data type.
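
A schema can be sanity-checked against this list before it is sent to the server. The sketch below is plain Python and does not call Perspective; the helper names check_schema and fits_integer are hypothetical, and the type names and 32-bit range are taken from the list above.

```python
# The six column type names listed above.
COLUMN_TYPES = {"boolean", "date", "datetime", "float", "integer", "string"}

# Range of a signed 32-bit integer, matching the "integer" type.
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def check_schema(schema):
    """Return the column names whose declared type is not a known type name."""
    return sorted(name for name, kind in schema.items() if kind not in COLUMN_TYPES)


def fits_integer(value):
    """Report whether `value` fits the signed 32-bit "integer" type."""
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return INT32_MIN <= value <= INT32_MAX


print(check_schema({"id": "integer", "price": "float", "note": "text"}))  # → ['note']
print(fits_integer(2**31 - 1), fits_integer(2**31))  # → True False
```

Values that fail fits_integer may be better declared as "float" columns; how Perspective itself treats out-of-range integers is not covered here.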

def validate_expressions(self, /, expression):

Validates the given expressions.

def view(self, /, **config):

Create a new View from this table with a specified ViewConfigUpdate.

See View struct.

Examples

view = table.view(
    columns=["Sales"],
    aggregates={"Sales": "sum"},
    group_by=["Region", "State"],
)

def size(self, /):

Returns the number of rows in a Table.

def replace(self, /, input, format=None):

Replaces all the rows in the Table with the input data, but preserves everything else including the schema, index, and any callbacks or registered View instances.

Calling Table.replace, like Table.update and Table.remove, will trigger an update event to any registered listeners via View.on_update.

def update(self, /, input, port_id=None, format=None):

Updates the rows of this table and any derived View instances.

Calling Table.update will trigger the View.on_update callbacks registered to derived Views, and the call itself will not resolve until _all_ derived Views are notified.

When updating a Table with an index, Table.update supports partial updates, by omitting columns from the update data.

Arguments

  • input - The input data for this Table. The schema of a Table is immutable after creation, so this method cannot be called with a schema.
  • options - Options for this update step - see perspective_client.UpdateOptions.
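
The partial-update behavior for indexed Tables can be modelled in a few lines of plain Python. This is a toy model of the merge semantics described above, not Perspective's implementation; the function name apply_partial_update and the sample rows are hypothetical.

```python
def apply_partial_update(rows_by_key, updates, index):
    """Merge `updates` into `rows_by_key`, a dict keyed by the index column."""
    for update in updates:
        key = update[index]
        # Columns omitted from `update` keep their existing values;
        # an unseen key starts a new row.
        rows_by_key.setdefault(key, {}).update(update)
    return rows_by_key


state = {1: {"id": 1, "name": "a", "qty": 10}}
apply_partial_update(state, [{"id": 1, "qty": 11}, {"id": 2, "name": "b"}], index="id")
print(state)
# → {1: {'id': 1, 'name': 'a', 'qty': 11}, 2: {'id': 2, 'name': 'b'}}
```

Against a real Table created with index="id", the equivalent call would be table.update([{"id": 1, "qty": 11}]), which changes qty for the row whose id is 1 and leaves its other columns alone.
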
class View:

The View struct is Perspective's query and serialization interface. It represents a query on the Table's dataset and is always created from an existing Table instance via the Table.view method.

Views are immutable with respect to the arguments provided to the Table.view method; to change these parameters, you must create a new View on the same Table. However, each View is _live_ with respect to the Table's data, and will (within a conflation window) update with the latest state as its parent Table updates, including incrementally recalculating all aggregates, pivots, filters, etc. View query parameters are composable, in that each parameter works independently _and_ in conjunction with each other, and there is no limit to the number of pivots, filters, etc. which can be applied.

To construct a View, call the Table.view factory method. A Table can have as many Views associated with it as you need - Perspective conserves memory by relying on a single Table to power multiple Views concurrently.

def column_paths(self, /):

Returns an array of strings containing the column paths of the View without any of the source columns.

A column path shows the columns that a given cell belongs to after pivots are applied.

def to_columns_string(self, /, **window):

Renders this View as a column-oriented JSON string. Useful if you want to save additional round trip serialize/deserialize cycles.

def to_json_string(self, /, **window):

Renders this View as a row-oriented JSON string.

def to_ndjson(self, /, **window):

Renders this View as an NDJSON formatted String.

def to_records(self, /, **window):

Renders this View as a row-oriented Python list.

def to_json(self, /, **window):

Renders this View as a row-oriented Python list.

def to_columns(self, /, **window):

Renders this View as a column-oriented Python dict.

def to_csv(self, /, **window):

Renders this View as a CSV String in a standard format.

def to_dataframe(self, /, **window):

Renders this View as a pandas.DataFrame.

def to_pandas(self, /, **window):

Renders this View as a pandas.DataFrame.

def to_polars(self, /, **window):

Renders this View as a polars.DataFrame.

def to_arrow(self, /, **window):

Renders this View as the Apache Arrow data format.

Arguments

def delete(self, /):

Delete this View and clean up all resources associated with it. View objects do not stop consuming resources or processing updates when they are garbage collected - you must call this method to reclaim these.
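
Because a View must be deleted explicitly, one option is to wrap its lifetime in a context manager. The sketch below uses only Table.view and View.delete as documented here; the helper name open_view is hypothetical.

```python
from contextlib import contextmanager


@contextmanager
def open_view(table, **config):
    """Yield a View on `table` and delete it when the block exits."""
    view = table.view(**config)
    try:
        yield view
    finally:
        # Garbage collection does not release a View; delete() does.
        view.delete()


# Usage (requires a Table, e.g. from perspective.table(...)):
#   with open_view(table, group_by=["Region"], columns=["Sales"]) as view:
#       records = view.to_records()
```

The finally clause means the View is released even if serialization inside the block raises.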

def expand(self, /, index):
def collapse(self, /, index):
def dimensions(self, /):

Returns this View's _dimensions_, row and column count, as well as those of the Table from which it was derived.

  • num_table_rows - The number of rows in the underlying Table.
  • num_table_columns - The number of columns in the underlying Table (including the index column if this Table was constructed with one).
  • num_view_rows - The number of rows in this View. If this View has a group_by clause, num_view_rows will also include aggregated rows.
  • num_view_columns - The number of columns in this View. If this View has a split_by clause, num_view_columns will include all _column paths_, i.e. the number of columns in the columns clause times the number of split_by groups.

def expression_schema(self, /):

The expression schema of this View, which contains only the expressions created on this View. See View.schema for details.

def get_config(self, /):

A copy of the ViewConfig object passed to the Table.view method which created this View.

def get_min_max(self, /, column_name):

Calculates the [min, max] of the leaf nodes of a column column_name.

Returns

A tuple of [min, max], whose types are column and aggregate dependent.

def num_rows(self, /):

The number of aggregated rows in this View. This is affected by the "group_by" configuration parameter supplied to this view's constructor.

Returns

The number of aggregated rows.

def schema(self, /):

The schema of this View.

The View schema differs from the schema returned by Table.schema; it may have different column names due to expressions or columns configs, or it may have _different column types_ due to the application of group_by and aggregates config. You can think of Table.schema as the _input_ schema and View.schema as the _output_ schema of a Perspective pipeline.

def on_delete(self, /, callback):

Register a callback with this View. Whenever the View is deleted, this callback will be invoked.

def remove_delete(self, /, callback_id):

Unregister a previously registered View.on_delete callback.

def on_update(self, /, callback, mode=None):

Register a callback with this View. Whenever the view's underlying table emits an update, this callback will be invoked with an object containing port_id, indicating which port the update fired on, and optionally delta, which is the new data that was updated for each cell or each row.

Arguments

  • callback - A callback function invoked on update, which receives an object with two keys: port_id, indicating which port the update was triggered on, and delta, whose value is dependent on the mode parameter.
  • mode - If this is provided as "row", then delta is an Arrow of the updated rows. Otherwise delta will be None.

def remove_update(self, /, callback_id):

Unregister a previously registered update callback with this View.

Arguments

  • callback_id - A callback id as returned by a reciprocal call to View.on_update.

Examples

def callback(*args):
    print("Updated!")

cid = view.on_update(callback)
view.remove_update(cid)
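
For scripts and tests it can be handy to collect updates into a list rather than print them. The sketch below relies only on the View.on_update signature shown above; the helper name collect_updates is hypothetical, and because the exact arguments handed to the callback are an assumption here, the handler records whatever it receives.

```python
def collect_updates(view, mode="row"):
    """Register a callback on `view` that appends every update to a list."""
    received = []

    def on_update(*args, **kwargs):
        # Assumption: the callback's exact arguments are not pinned down
        # here, so everything passed in is recorded as-is.
        received.append((args, kwargs))

    callback_id = view.on_update(on_update, mode=mode)
    return received, callback_id


# Usage (requires a View from perspective-python):
#   received, callback_id = collect_updates(view)
#   table.update([{"x": 5, "y": 6}])
#   view.remove_update(callback_id)  # stop collecting
```
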
PerspectiveError = <class 'perspective.PyPerspectiveError'>
class ProxySession:
def handle_request(self, /, data):
def handle_request_async(self, /, data):
def close(self, /):
def num_cpus():

Returns the number of threads the internal threadpool will use.

def set_num_cpus(num_cpus):

Set the number of threads the internal threadpool will use. Can also be set with the NUM_OMP_THREADS environment variable.
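
One way to choose a value for set_num_cpus is to derive it from the machine's core count. The helper below is a sketch using only the standard library; the name pick_thread_count and the policy of leaving one core free are assumptions for illustration, not a recommendation from the Perspective authors.

```python
import os


def pick_thread_count(requested=None, reserve=1):
    """Choose a thread count, leaving `reserve` cores free when possible."""
    available = os.cpu_count() or 1
    if requested is None:
        requested = available - reserve
    # Clamp to the range [1, available].
    return max(1, min(requested, available))


# Usage (requires perspective-python to be installed):
#   import perspective
#   perspective.set_num_cpus(pick_thread_count())
#   print(perspective.num_cpus())
```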