Loading data
A Table
may also be created-or-updated by data in CSV,
Apache Arrow, JSON row-oriented or JSON
column-oriented formats. In addition to these, perspective-python
additionally
supports pyarrow.Table
, polars.DataFrame
and pandas.DataFrame
objects
directly. These formats are otherwise identical to the built-in formats and
don't exhibit any additional support or type-awareness; e.g., pandas.DataFrame
support is just pyarrow.Table.from_pandas
piped into Perspective's Arrow
reader.
Client::table
and Table::update
perform coercion on their input for all
input formats except Arrow (which comes with its own schema and has no need
for coercion). "date"
and "datetime"
column types do not have native JSON
representations, so these column types cannot be inferred from JSON input.
Instead, for columns of these types for JSON input, a Table
must first be
constructed with a schema. Next, call Table::update
with the JSON input -
Perspective's JSON reader may coerce a date
or datetime
from these native
JSON types:
integer
as milliseconds-since-epoch.string
as a any of Perspective's built-in date format formats.- JavaScript
Date
and Pythondatetime.date
anddatetime.datetime
are not supported directly. However, in JavaScriptDate
types are automatically coerced to correctinteger
timestamps by default when converted to JSON.
Apache Arrow
The most efficient way to load data into Perspective, encoded as Apache Arrow IPC format. In JavaScript:
const resp = await fetch(
"https://cdn.jsdelivr.net/npm/superstore-arrow/superstore.lz4.arrow"
);
const arrow = await resp.arrayBuffer();
Apache Arrow input do not support type coercion, preferring Arrow's internal self-describing schema.
CSV
Perspective relies on Apache Arrow's CSV parser, and as such uses mostly the same column-type inference logic as Arrow itself would use for parsing CSV.
Row Oriented JSON
Row-oriented JSON is in the form of a list of objects. Each object in the list corresponds to a row in the table. For example:
[
{ "a": 86, "b": false, "c": "words" },
{ "a": 0, "b": true, "c": "" },
{ "a": 12345, "b": false, "c": "here" }
]
Column Oriented JSON
Column-Oriented JSON comes in the form of an object of lists. Each key of the object is a column name, and each element of the list is the corresponding value in the row.
{
"a": [86, 0, 12345],
"b": [false, true, false],
"c": ["words", "", "here"]
}
NDJSON
NDJSON is a format.
{ "a": 86, "b": false, "c": "words" }
{ "a": 0, "b": true, "c": "" }
{ "a": 12345, "b": false, "c": "here" }