Table

The Table is Perspective's columnar data frame, analogous to a pandas DataFrame or an Apache Arrow table. A Table supports appending data, in-place updates, removal by index, and notifications on update.

A Table contains columns, each of which has a unique name, is strongly and consistently typed, and contains rows of data conforming to the column's type. Every column in a Table must have the same number of rows, though not every row must contain data; null values indicate missing values in the dataset.

The columns of a Table are immutable after creation: their names and data types cannot be changed, and columns cannot be added or deleted. A View can, however, be used to select an arbitrary set of columns from the Table.

const data = [
    { x: 1, y: "a", z: true },
    { x: 2, y: "b", z: false },
    { x: 3, y: "c", z: true },
    { x: 4, y: "d", z: false },
];

// `worker` is assumed to be an existing Perspective worker instance.
const table1 = await worker.table(data);

Schema and Types

The mapping of a Table's column names to data types is referred to as a schema. Each column has a unique name and a single data type, and data types are expressed with a common vocabulary across all supported host languages. In Python, you may alternatively use native types in place of their string counterparts. A schema in JavaScript:

const schema = {
    x: "integer",
    y: "string",
    z: "boolean",
};

const table2 = await worker.table(schema);

When passing data directly to the table() constructor, the type of each column is inferred automatically. In some cases, the inference algorithm may not return exactly what you'd like. For example, a column may be interpreted as a datetime when you intended it to be a string, or a column may have no values at all (yet), as it will be updated with values from a real-time data source later on. In these cases, create a table() with a schema.
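
To make the idea of inference concrete, here is a minimal sketch in plain JavaScript that guesses a schema from row-oriented data. It is not Perspective's inference algorithm: `inferSchema` and its rules are hypothetical, and it ignores dates and datetimes entirely. It only illustrates why a column with no values yet cannot be typed from the data alone, which is the case where an explicit schema is needed.

```javascript
// Hypothetical helper, not part of Perspective: guess a type name per column
// from the first non-null value seen in row-oriented data.
function inferSchema(rows) {
    const schema = {};
    for (const row of rows) {
        for (const [name, value] of Object.entries(row)) {
            if (!(name in schema)) schema[name] = null; // seen, but no type yet
            if (schema[name] !== null || value === null || value === undefined) continue;
            if (typeof value === "boolean") schema[name] = "boolean";
            else if (typeof value === "number")
                schema[name] = Number.isInteger(value) ? "integer" : "float";
            else schema[name] = "string";
        }
    }
    return schema;
}

const inferred = inferSchema([
    { x: 1, y: "a", z: true, w: null },
    { x: 2, y: "b", z: false, w: null },
]);
// `w` has no values, so this sketch cannot assign it a type.
console.log(inferred);
```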

Once the Table has been created with a schema, further update() calls convert incoming data to conform to the schema. In Python, for example, a column typed as datetime can be updated with date objects, datetime objects, pandas.Timestamp, numpy.datetime64, and even valid epoch timestamps in milliseconds or seconds. Similarly, updating a string column with integer data casts the values to string, updating a float column with integers casts them to float, and so on. Type conversion can also use Python's conversion methods, such as __int__ and __float__.

Index and Limit

Initializing a Table with an index tells Perspective to treat a column as the primary key, allowing in-place updates of rows. Only a single column (of any type) can be used as an index. Indexed Table instances allow:

  • In-place updates whenever a new row shares an index value with an existing row
  • Partial updates when such a row leaves some column values undefined
  • Removal of rows by index

To create an indexed Table, provide the index property with a string column name to be used as an index:

// `data` is the array defined earlier; its `x` column holds unique values.
const indexed_table = await perspective.table(data, { index: "x" });
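
The three behaviours listed above can be modelled in a few lines of plain JavaScript. This is an illustrative sketch, not Perspective's implementation: `IndexedRows` is a hypothetical class that keeps rows in a `Map` keyed by the index column, and it says nothing about the order in which Perspective itself stores or returns rows.

```javascript
// Hypothetical model of an indexed table; not Perspective's implementation.
class IndexedRows {
    constructor(index) {
        this.index = index; // name of the primary-key column
        this.rows = new Map(); // index value -> row object
    }

    update(newRows) {
        for (const row of newRows) {
            const key = row[this.index];
            const existing = this.rows.get(key);
            // Same key: merge into the existing row (in place, possibly partial).
            // New key: the row is added.
            this.rows.set(key, { ...existing, ...row });
        }
    }

    remove(keys) {
        for (const key of keys) this.rows.delete(key);
    }

    toArray() {
        return [...this.rows.values()];
    }
}

const model = new IndexedRows("x");
model.update([
    { x: 1, y: "a", z: true },
    { x: 2, y: "b", z: false },
]);
model.update([{ x: 1, y: "changed" }]); // partial update: `z` is left untouched
model.update([{ x: 3, y: "c", z: true }]); // new key: row is added
model.remove([2]); // delete by index value
console.log(model.toArray());
```

After these calls the model holds two rows: the row with `x: 1` (with `y` replaced and `z` preserved) and the row with `x: 3`.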

Initializing a Table with a limit sets the maximum number of rows the Table may hold. When an update would push the Table past its limit, the excess rows overwrite the oldest rows in the Table. To create a Table with a limit, provide the limit property with an integer giving the maximum number of rows:

const limit_table = await perspective.table(data, { limit: 1000 });

limit cannot be used in conjunction with index.
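
The overwrite-oldest behaviour resembles a circular buffer. The sketch below models it in plain JavaScript; `LimitedRows` is a hypothetical class written for illustration and is not how Perspective implements limit internally.

```javascript
// Hypothetical model of a row limit; not Perspective's implementation.
class LimitedRows {
    constructor(limit) {
        this.limit = limit;
        this.rows = [];
        this.next = 0; // position the next row will be written to
    }

    update(newRows) {
        for (const row of newRows) {
            // Once the buffer is full, `next` has wrapped around and points at
            // the oldest row, which is overwritten.
            this.rows[this.next] = row;
            this.next = (this.next + 1) % this.limit;
        }
    }
}

const limited = new LimitedRows(3);
limited.update([{ x: 1 }, { x: 2 }, { x: 3 }]);
limited.update([{ x: 4 }]); // exceeds the limit: overwrites { x: 1 }
console.log(limited.rows.map((r) => r.x)); // [ 4, 2, 3 ]
```

The model never holds more than three rows; the fourth row takes the slot of the oldest one.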

Update and Remove

Once a Table has been created, it can be updated with new data conforming to the Table's schema. The dataset passed to update() must be in one of the formats supported by Perspective.

const schema = {
    a: "integer",
    b: "float",
};

const table = await perspective.table(schema);

// `new_data` stands for any dataset matching the schema,
// for example [{ a: 1, b: 1.5 }].
table.update(new_data);

Without an index set, calls to update() append new data to the end of the Table. Otherwise, Perspective allows partial updates (in-place) using the index to determine which rows to update:

// Updates `y` in the rows whose index column `x` is 1 and 4.
indexed_table.update({ x: [1, 4], y: ["x", "y"] });
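
The update above passes data in column-oriented form (one array per column) rather than as an array of row objects. As a rough illustration of how the two shapes relate, the sketch below pivots rows into columns in plain JavaScript; `rowsToColumns` is a hypothetical helper, not a Perspective function.

```javascript
// Hypothetical helper, not part of Perspective: pivot row-oriented data
// (array of objects) into column-oriented data (object of arrays).
function rowsToColumns(rows) {
    const columns = {};
    rows.forEach((row, i) => {
        for (const [name, value] of Object.entries(row)) {
            // Pre-fill with null so rows that lack a key still line up.
            if (!(name in columns)) columns[name] = new Array(rows.length).fill(null);
            columns[name][i] = value;
        }
    });
    return columns;
}

const columnar = rowsToColumns([
    { x: 1, y: "x" },
    { x: 4, y: "y" },
]);
console.log(columnar); // { x: [ 1, 4 ], y: [ 'x', 'y' ] }
```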

Any value in a table() can be unset using null (JavaScript) or None (Python). Values may be unset on construction, since any null in the dataset is treated as an unset value, and can be explicitly unset via update() on a table() with index applied. update() calls do not need values for every column in the table() schema: missing keys (or keys set to undefined in JavaScript) are omitted for a table() with index set, leaving the stored values unchanged, and become null otherwise:

table.update([{ x: 3, y: null }]); // `z` missing
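
The difference between an explicit null and a missing key can be sketched in plain JavaScript. The helper below is hypothetical and only models the rule for a table with index set: null unsets a value, while a missing or undefined key leaves the stored value alone. It is not Perspective's own code.

```javascript
// Hypothetical helper, not part of Perspective: apply one partial row to a
// stored row, following the null / missing-key rule for indexed tables.
function applyPartial(stored, patch) {
    const result = { ...stored };
    for (const [name, value] of Object.entries(patch)) {
        if (value === undefined) continue; // treated like a missing key
        result[name] = value; // includes null, which unsets the value
    }
    return result;
}

const storedRow = { x: 3, y: "c", z: true };
const patchedRow = applyPartial(storedRow, { x: 3, y: null, z: undefined });
console.log(patchedRow); // y is unset (null), z keeps its previous value
```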

Rows can also be removed from an indexed Table, with an array of primary keys:

indexed_table.remove([1, 4]);

Calling clear() will remove all data from the underlying Table. Calling replace(data) with new data will clear the Table, and update it with a new dataset that conforms to Perspective's data types and the existing schema on the Table.

table.clear();
table.replace(json);