perspective
# Copyright (c) 2017, the Perspective Authors.
#
# This file is part of the Perspective library, distributed under the terms
# of the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).

__version__ = "3.7.2"
__all__ = [
    "_jupyter_labextension_paths",
    "Server",
    "Client",
    "Table",
    "View",
    "PerspectiveError",
    "ProxySession",
    "AsyncClient",
    "AsyncServer",
    "num_cpus",
    "set_num_cpus",
]

import functools

from .perspective import (
    Client,
    PerspectiveError,
    ProxySession,
    Server,
    AsyncServer,
    AsyncClient,
    # NOTE: these are classes without constructors,
    # so we import them just for type hinting
    Table,  # noqa: F401
    View,  # noqa: F401
    num_cpus,
    set_num_cpus,
)


# Module-level default server and a local client connected to it.
GLOBAL_SERVER = Server()
GLOBAL_CLIENT = GLOBAL_SERVER.new_local_client()


@functools.wraps(Client.table)
def table(*args, **kwargs):
    return GLOBAL_CLIENT.table(*args, **kwargs)


@functools.wraps(Client.open_table)
def open_table(*args, **kwargs):
    # Open an existing hosted table by name.
    return GLOBAL_CLIENT.open_table(*args, **kwargs)


@functools.wraps(Client.get_hosted_table_names)
def get_hosted_table_names(*args, **kwargs):
    return GLOBAL_CLIENT.get_hosted_table_names(*args, **kwargs)


def _jupyter_labextension_paths():
    """
    Read by `jupyter labextension develop`
    @private
    """
    return [{"src": "labextension", "dest": "@finos/perspective-jupyterlab"}]
An instance of a Perspective server. Each Server instance is separate, and does not share Table (or other) data with other Servers.
Create a new Session bound to this Server.
Server.new_session only needs to be called if you've implemented a custom Perspective Client/Server transport.
Flush pending updates to this Server, including notifications to View.on_update callbacks.
Server.poll only needs to be called if you've implemented a custom Perspective Client/Server transport and provided the on_poll_request constructor keyword argument.
An instance of a Client is a connection to a single Server, whether locally in-memory or remote over some transport like a WebSocket.
Client and Perspective objects derived from it have _synchronous_ APIs, suitable for use in a REPL or script context where this is the _only_ Client connected to its Server. If you want to integrate with a Web framework or otherwise connect multiple clients, use AsyncClient.
Handle a message from the external message queue.
Client.handle_response is part of the low-level message-handling API necessary to implement new transports for a Client connection to a local-or-remote Server, and doesn't generally need to be called directly by "users" of a Client once connected.
Creates a new Table from either a _schema_ or _data_.
The Client.table factory function can be initialized with either a _schema_ (see Table.schema), or data in one of these formats:
- Apache Arrow
- CSV
- JSON row-oriented
- JSON column-oriented
- NDJSON
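To make the text-based formats in this list concrete, the sketch below serializes the same two rows as CSV, row-oriented JSON, column-oriented JSON and NDJSON using only the standard library. The sample rows and variable names are illustrative assumptions, and Apache Arrow is left out because it needs a third-party library.

```python
import csv
import io
import json

# Two sample rows (made up for illustration).
rows = [{"x": 1, "y": 2}, {"x": 3, "y": 4}]

# CSV: a header line followed by one line per row.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["x", "y"], lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
csv_data = buffer.getvalue()

# JSON row-oriented: a list of {column: value} objects.
json_rows = json.dumps(rows)

# JSON column-oriented: a single {column: [values, ...]} object.
columns = {name: [row[name] for row in rows] for name in rows[0]}
json_columns = json.dumps(columns)

# NDJSON: one JSON object per line.
ndjson = "\n".join(json.dumps(row) for row in rows)
```

Each of these strings has the shape of one of the listed formats; whether a given string needs an explicit format hint when passed to Client.table depends on the options described for that method.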
When instantiated with _data_, the schema is inferred from this data. While this is convenient, inference is sometimes imperfect, e.g. when the input is empty, null or ambiguous. For these cases, Client.table can first be instantiated with an explicit schema.
When instantiated with a _schema_, the resulting Table is empty but with known column names and column types. When subsequently populated with Table.update, these columns will be _coerced_ to the schema's type. This behavior can be useful when Client.table's column type inference doesn't work.
The resulting Table is _virtual_, and invoking its methods dispatches events to the Server this Client connects to, where the data is stored and all calculation occurs.
Arguments
- arg - Either _schema_ or initialization _data_.
- options - Optional configuration which provides one of:
  - limit - The max number of rows the resulting Table can store.
  - index - The column name to use as an _index_ column. If this Table is being instantiated by _data_, this column name must be present in the data.
  - name - The name of the table. This will be generated if it is not provided.
  - format - The explicit format of the input data, can be one of "json", "columns", "csv" or "arrow". This overrides language-specific type dispatch behavior, which allows stringified and byte array alternative inputs.
Python Examples
Load a CSV from a str:
table = client.table("x,y\n1,2\n3,4")
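The schema-first path described for Client.table might look like the sketch below. It assumes the `perspective-python` package is installed, that the module-level `perspective.table` wrapper forwards to Client.table, and that a schema can be passed as a dict of column names to type-name strings. The function name `make_typed_table` and the column names are hypothetical.

```python
def make_typed_table():
    """Create an empty, typed Table and then populate it (sketch only)."""
    # Assumption: the perspective-python package is installed.
    import perspective

    # Start from an explicit schema so that no type inference is needed.
    # Assumption: a dict of column name -> type name is accepted as a schema.
    table = perspective.table({"x": "integer", "y": "string"})

    # Later updates are coerced to the schema's column types.
    table.update("x,y\n1,2\n3,4")
    return table
```

Starting from a schema is mainly useful when the first batch of data may be empty or ambiguous, as noted above.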
Opens a Table that is hosted on the Server that is connected to this Client.
The name property of TableInitOptions is used to identify each Table. Table names can be looked up for each Client via Client.get_hosted_table_names.
Python Examples
table = client.open_table("table_one")
Retrieves the names of all tables that this client has access to.
name is a string identifier unique to the Table (per Client), which can be used in conjunction with Client.open_table to get a Table instance without the use of the Client.table constructor directly (e.g., one created by another Client).
Python Examples
tables = client.get_hosted_table_names()
Register a callback which is invoked whenever Client.table (on this Client) or Table.delete (on a Table belonging to this Client) is called.
Remove a callback previously registered via Client.on_hosted_tables_update.
Returns the name of the index column for the table.
Python Examples
table = perspective.table("x,y\n1,2\n3,4", index="x")
index = table.get_index()
Returns the user-specified name for this table, or the auto-generated name if a name was not specified when the table was created.
Removes all the rows in the Table, but preserves everything else including the schema, index, and any callbacks or registered View instances.
Calling Table.clear, like Table.update and Table.remove, will trigger an update event to any registered listeners via View.on_update.
Returns the column names of this Table in "natural" order (the ordering implied by the input format).
Python Examples
columns = table.columns()
Deletes this Table and cleans up associated resources.
Tables do not stop consuming resources or processing updates when they are garbage collected in their host language - you must call this method to reclaim these.
Arguments
- options - An options dictionary.
  - lazy - Whether to delete this Table _lazily_. When false (the default), the delete will occur immediately, assuming it has no View instances registered to it (which must be deleted first, otherwise this method will throw an error). When true, the Table will only be marked for deletion once its View dependency count reaches 0.
Python Examples
table = client.table("x,y\n1,2\n3,4")
# ...
table.delete(lazy=True)
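The difference between eager and lazy deletion can be modeled with a small reference-counting toy. This is one reading of the behavior described above, not Perspective's implementation. `ToyTable` and all of its methods are hypothetical names invented for this sketch.

```python
class ToyTable:
    """Toy stand-in for a Table; NOT Perspective's implementation."""

    def __init__(self):
        self.view_count = 0  # number of live dependent views
        self.deleted = False
        self.marked_for_deletion = False

    def add_view(self):
        self.view_count += 1

    def remove_view(self):
        self.view_count -= 1
        # A lazily deleted table is released once its last view is gone.
        if self.marked_for_deletion and self.view_count == 0:
            self.deleted = True

    def delete(self, lazy=False):
        if self.view_count == 0:
            self.deleted = True
        elif lazy:
            self.marked_for_deletion = True
        else:
            raise RuntimeError("delete dependent views first")


table = ToyTable()
table.add_view()

try:
    table.delete()  # eager delete fails while a view still exists
    eager_failed = False
except RuntimeError:
    eager_failed = True

table.delete(lazy=True)  # only marks the table
still_alive = not table.deleted
table.remove_view()  # last view removed, so the table is released
```

In the toy, the eager call raises while a view exists, whereas the lazy call defers the release until the view count drops to zero.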
Create a unique channel ID on this Table, which allows View.on_update callback calls to be associated with the Table.update which caused them.
Register a callback which is called exactly once, when this Table is deleted with the Table.delete method.
Table.on_delete resolves when the subscription message is sent, not when the _delete_ event occurs.
Removes a listener with a given ID, as returned by a previous call to Table.on_delete.
Returns a table's Schema, a mapping of column names to column types.
The mapping of a Table's column names to data types is referred to as a Schema. Each column has a unique name and a data type, one of:
- "boolean" - A boolean type
- "date" - A timezone-agnostic date type (month/day/year)
- "datetime" - A millisecond-precision datetime type in the UTC timezone
- "float" - A 64 bit float
- "integer" - A signed 32 bit integer (the integer type supported by JavaScript)
- "string" - A String data type (encoded internally as a _dictionary_)
Note that all Table columns are _nullable_, regardless of the data type.
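As a small illustration of the type names above, the sketch below writes a schema as a plain dict and checks it against the six documented types. The column names and the helpers `invalid_columns` and `fits_integer` are hypothetical. The range check simply reflects the "signed 32 bit integer" description.

```python
# The six type names documented for a Schema.
VALID_TYPES = {"boolean", "date", "datetime", "float", "integer", "string"}

# A hypothetical schema: column name -> type name.
schema = {
    "active": "boolean",
    "birthday": "date",
    "updated_at": "datetime",
    "price": "float",
    "quantity": "integer",
    "name": "string",
}


def invalid_columns(candidate):
    """Return the columns whose type name is not one of the documented types."""
    return sorted(
        name for name, type_name in candidate.items() if type_name not in VALID_TYPES
    )


# Bounds of a signed 32 bit integer.
INT32_MIN, INT32_MAX = -(2**31), 2**31 - 1


def fits_integer(value):
    """Check whether a Python int fits in a signed 32 bit integer column."""
    return INT32_MIN <= value <= INT32_MAX
```

Values outside the 32 bit range are candidates for a "float" column instead; how Perspective itself handles such values is not covered by this sketch.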
Create a new View from this table with a specified ViewConfigUpdate.
See View struct.
Examples
view = table.view(
    columns=["Sales"],
    aggregates={"Sales": "sum"},
    group_by=["Region", "State"],
)
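To show what a two-level group_by with a "sum" aggregate computes, here is a plain-Python rollup over made-up rows. It is a toy model, not Perspective's engine. It assumes the result contains a grand total, one row per Region and one row per Region/State pair.

```python
from collections import defaultdict

# Sample data, invented for illustration.
rows = [
    {"Region": "East", "State": "NY", "Sales": 10},
    {"Region": "East", "State": "NJ", "Sales": 5},
    {"Region": "West", "State": "CA", "Sales": 7},
    {"Region": "West", "State": "CA", "Sales": 3},
]

group_by = ["Region", "State"]
totals = defaultdict(int)

for row in rows:
    path = tuple(row[name] for name in group_by)
    # Add the row's Sales to every prefix of its group path:
    # () is the grand total, ("East",) a region, ("East", "NY") a state.
    for depth in range(len(path) + 1):
        totals[path[:depth]] += row["Sales"]
```

Here `totals` ends up with six entries: one total, two regions and three states.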
Updates the rows of this table and any derived View instances.
Calling Table.update will trigger the View.on_update callbacks registered to derived Views, and the call itself will not resolve until _all_ derived Views are notified.
When updating a Table with an index, Table.update supports partial updates, by omitting columns from the update data.
Arguments
- input - The input data for this Table. The schema of a Table is immutable after creation, so this method cannot be called with a schema.
- options - Options for this update step - see perspective_client.UpdateOptions.
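The partial-update behavior for indexed tables can be pictured with a dict keyed by the index column. This is a toy model under stated assumptions, not Perspective's implementation. `apply_update` is a hypothetical helper, and columns missing from a brand-new row are simply absent here rather than null.

```python
def apply_update(rows_by_index, index, update_rows):
    """Merge update rows into rows keyed by the index column (toy model)."""
    for row in update_rows:
        key = row[index]
        # Existing rows keep any column the update omits; new keys are added.
        rows_by_index.setdefault(key, {}).update(row)
    return rows_by_index


# Current table contents, keyed by the index column "x".
rows = {
    1: {"x": 1, "y": "a", "z": 10},
    2: {"x": 2, "y": "b", "z": 20},
}

# The first update row omits "y", the second introduces a new index value.
apply_update(rows, "x", [{"x": 1, "z": 99}, {"x": 3, "y": "c"}])
```

After the call, row 1 keeps its original "y" while "z" is overwritten, row 2 is untouched, and row 3 is new.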
The View struct is Perspective's query and serialization interface. It represents a query on the Table's dataset and is always created from an existing Table instance via the Table.view method.
Views are immutable with respect to the arguments provided to the Table.view method; to change these parameters, you must create a new View on the same Table. However, each View is _live_ with respect to the Table's data, and will (within a conflation window) update with the latest state as its parent Table updates, including incrementally recalculating all aggregates, pivots, filters, etc. View query parameters are composable, in that each parameter works independently _and_ in conjunction with each other, and there is no limit to the number of pivots, filters, etc. which can be applied.
To construct a View, call the Table.view factory method. A Table can have as many Views associated with it as you need - Perspective conserves memory by relying on a single Table to power multiple Views concurrently.
Returns an array of strings containing the column paths of the View without any of the source columns.
A column path shows the columns that a given cell belongs to after pivots are applied.
Renders this View as a column-oriented JSON string. Useful if you want to save additional round trip serialize/deserialize cycles.
Returns this View's _dimensions_, row and column count, as well as those of the Table from which it was derived.
- num_table_rows - The number of rows in the underlying Table.
- num_table_columns - The number of columns in the underlying Table (including the index column if this Table was constructed with one).
- num_view_rows - The number of rows in this View. If this View has a group_by clause, num_view_rows will also include aggregated rows.
- num_view_columns - The number of columns in this View. If this View has a split_by clause, num_view_columns will include all _column paths_, e.g. the number of columns in the columns clause times the number of split_by groups.
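The "columns times split_by groups" rule for num_view_columns can be checked with a quick enumeration. The column names, the split_by values and the "|" separator used to join a path are assumptions made for this sketch.

```python
from itertools import product

columns = ["Sales", "Profit"]       # hypothetical `columns` clause
split_by_groups = ["East", "West"]  # hypothetical distinct split_by values

# One column path per (split_by group, column) pair.
column_paths = [
    f"{group}|{column}" for group, column in product(split_by_groups, columns)
]
num_view_columns = len(column_paths)
```

Two columns and two split_by groups yield four column paths in this toy enumeration.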
The expression schema of this View, which contains only the expressions created on this View. See View.schema for details.
A copy of the ViewConfig object passed to the Table.view method which created this View.
Calculates the [min, max] of the leaf nodes of a column column_name.
Returns
A tuple of [min, max], whose types are column and aggregate dependent.
The number of aggregated rows in this View. This is affected by the "group_by" configuration parameter supplied to this view's constructor.
Returns
The number of aggregated rows.
The schema of this View.
The View schema differs from the schema returned by Table.schema; it may have different column names due to expressions or columns configs, or it may have _different column types_ due to the application of group_by and aggregates config. You can think of Table.schema as the _input_ schema and View.schema as the _output_ schema of a Perspective pipeline.
Unregister a previously registered View.on_delete callback.
Register a callback with this View. Whenever the view's underlying table emits an update, this callback will be invoked with an object containing port_id, indicating which port the update fired on, and optionally delta, which is the new data that was updated for each cell or each row.
Arguments
- on_update - A callback function invoked on update, which receives an object with two keys: port_id, indicating which port the update was triggered on, and delta, whose value is dependent on the mode parameter.
- options - If this is provided as OnUpdateOptions { mode: Some(OnUpdateMode::Row) }, then delta is an Arrow of the updated rows. Otherwise delta will be Option.None.
Unregister a previously registered update callback with this View.
Arguments
- id - A callback id as returned by a reciprocal call to View.on_update.
Examples
let callback = |_| async { print!("Updated!") };
let cid = view.on_update(callback, OnUpdateOptions::default()).await?;
view.remove_update(cid).await?;
Returns the number of threads the internal threadpool will use.
Set the number of threads the internal threadpool will use. Can also be set with the NUM_OMP_THREADS environment variable.