Schema and column types

The mapping of a Table's column names to data types is referred to as a schema. Each column has a unique name and exactly one data type, which is one of:

  • float
  • integer
  • boolean
  • date
  • datetime
  • string

A Table's schema is fixed at construction. It is set either by explicitly passing a schema dictionary to the Client::table method, or by passing data to this method, in which case the schema is inferred (for CSV or JSON data) or inherited (for Arrow data).
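To make the idea of a schema dictionary concrete, here is a minimal sketch in plain Python. It does not call Perspective; the column names and the `validate_schema` helper are hypothetical, and writing the types as strings is an assumption based on the six type names listed above.

```python
# Illustrative only: a schema as a plain mapping of column name -> type name.
# Column names are made up; the type names are the six listed in this section.
ALLOWED_TYPES = {"float", "integer", "boolean", "date", "datetime", "string"}


def validate_schema(schema):
    """Return the schema unchanged if every column maps to one allowed type."""
    for name, type_name in schema.items():
        if type_name not in ALLOWED_TYPES:
            raise ValueError(f"column {name!r} has unknown type {type_name!r}")
    return schema


schema = validate_schema({
    "symbol": "string",
    "price": "float",
    "volume": "integer",
    "traded_at": "datetime",
})
```

Because dictionary keys are unique, this shape also captures the rule that each column name appears once and carries a single type.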

Type inference

When CSV or JSON data is passed to the Client::table method, the type of each column is inferred automatically. In some cases, the inference algorithm may not return exactly what you'd like. For example, a column may be interpreted as a datetime when you intended it to be a string, or a column may have no values at all yet because it will be filled later from a real-time data source. In these cases, construct the Table from an explicit schema rather than from data.
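The following toy function sketches why inference can surprise you. It is not Perspective's inference algorithm; the `infer_type` name, the order in which types are tried, and the fallback to "string" for an empty column are all assumptions made for illustration.

```python
def _parses_as(converter, text):
    """True if converter(text) succeeds, False if it raises ValueError."""
    try:
        converter(text)
        return True
    except ValueError:
        return False


def infer_type(values):
    """Guess one type name for a column of string cells (toy heuristic)."""
    cells = [v for v in values if v not in ("", None)]  # skip empty cells
    if not cells:
        # Assumption: with no values there is nothing to infer from.
        return "string"
    if all(c.lower() in ("true", "false") for c in cells):
        return "boolean"
    if all(_parses_as(int, c) for c in cells):
        return "integer"
    if all(_parses_as(float, c) for c in cells):
        return "float"
    return "string"


zip_codes = infer_type(["02134", "10001"])  # looks numeric, meant as text
empty_col = infer_type(["", ""])            # no data yet
```

Under this heuristic a column of ZIP codes comes back as "integer" and an empty column gets an arbitrary default, which are the kinds of cases where an explicit schema is the safer choice.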

Once the Table has been created, further Table::update calls perform limited type coercion based on the schema. Coercion works similarly to inference, in that input data may be parsed according to the expected column type, but Table::update never changes a column's type. For example, the number literal 1234 would be inferred as an "integer", but in a Table::update call on a known "string" column it is parsed as the string "1234".
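The difference between inference and coercion can be sketched in a few lines of plain Python. This is an illustration of the idea only, not Perspective's implementation; `coerce` and `coerce_row` are hypothetical helpers, and only three of the column types are handled.

```python
def coerce(value, type_name):
    """Convert one value to the column's fixed type; the type never changes."""
    if type_name == "string":
        return str(value)
    if type_name == "integer":
        return int(value)
    if type_name == "float":
        return float(value)
    raise ValueError(f"type {type_name!r} is not handled in this sketch")


def coerce_row(row, schema):
    """Coerce every value in a row according to the schema's column types."""
    return {name: coerce(value, schema[name]) for name, value in row.items()}


schema = {"id": "string", "qty": "integer"}
row = coerce_row({"id": 1234, "qty": "7"}, schema)
# row == {"id": "1234", "qty": 7}
```

The schema is read but never modified, which mirrors the rule that an update adapts the data to the column rather than the column to the data.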

date and datetime inference

Various string representations of date and datetime values can be inferred, as well as coerced from strings, if they match one of Perspective's internal datetime parsing formats. One such format is ISO 8601, which is also the format Perspective uses when writing these types to CSV.
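As a rough illustration of format-based date parsing, the sketch below uses Python's standard library to classify ISO 8601 strings. It does not reproduce Perspective's list of known formats; `parse_temporal` is a hypothetical helper, and `fromisoformat` accepts only a subset of ISO 8601 on older Python versions.

```python
from datetime import date, datetime


def parse_temporal(text):
    """Return (type_name, value) for ISO 8601 text, else ("string", text)."""
    try:
        # A bare calendar date such as 2024-01-15.
        return "date", date.fromisoformat(text)
    except ValueError:
        pass
    try:
        # A date with a time component such as 2024-01-15T10:30:00.
        return "datetime", datetime.fromisoformat(text)
    except ValueError:
        # No known format matched, so the text stays a string.
        return "string", text


kind_a, value_a = parse_temporal("2024-01-15")
kind_b, value_b = parse_temporal("2024-01-15T10:30:00")
kind_c, value_c = parse_temporal("next Tuesday")
```

Text that matches no known format simply stays a string, which is why a column of free-form dates may not be typed as date or datetime.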