pyarrow.Schema
-
class pyarrow.Schema
Bases: pyarrow.lib._Weakrefable
-
__init__(*args, **kwargs)
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(*args, **kwargs) – Initialize self.
add_metadata(self, metadata)
append(self, Field field) – Append a field at the end of the schema.
empty_table(self) – Provide an empty table according to the schema.
equals(self, Schema other, …) – Test if this schema is equal to the other
field(self, i) – Select a field by its column name or numeric index.
field_by_name(self, name) – Access a field by its name rather than the column index.
from_pandas(type cls, df[, preserve_index]) – Returns implied schema from dataframe
get_all_field_indices(self, name) – Return sorted list of indices for fields with the given name
get_field_index(self, name) – Return index of field with given unique name.
insert(self, int i, Field field) – Add a field at position i to the schema.
remove(self, int i) – Remove the field at index i from the schema.
remove_metadata(self) – Create new schema without metadata, if any
serialize(self[, memory_pool]) – Write Schema to Buffer as encapsulated IPC message
set(self, int i, Field field) – Replace a field at position i in the schema.
to_string(self[, truncate_metadata, …]) – Return human-readable representation of Schema
with_metadata(self, metadata) – Add metadata as dict of string keys and values to Schema
Attributes
metadata
names – The schema’s field names.
pandas_metadata – Return deserialized-from-JSON pandas metadata field (if it exists)
types – The schema’s field types.
-
add_metadata(self, metadata)
-
append(self, Field field)
Append a field at the end of the schema.
In contrast to Python’s list.append(), it returns a new object, leaving the original Schema unmodified.
- Parameters
field (Field) –
- Returns
schema (Schema) – New object with appended field.
-
empty_table(self)
Provide an empty table according to the schema.
- Returns
table (pyarrow.Table)
-
equals(self, Schema other, bool check_metadata=False)
Test if this schema is equal to the other.
- Parameters
other (pyarrow.Schema) –
check_metadata (bool, default False) – Key/value metadata must be equal too
- Returns
is_equal (bool)
-
field(self, i)
Select a field by its column name or numeric index.
- Parameters
i (int or string) –
- Returns
pyarrow.Field
-
field_by_name(self, name)
Access a field by its name rather than the column index.
- Parameters
name (str) –
- Returns
field (pyarrow.Field)
-
from_pandas(type cls, df, preserve_index=None)
Returns implied schema from dataframe.
- Parameters
df (pandas.DataFrame) –
preserve_index (bool, default None) – Whether to store the index as an additional column (or columns, for MultiIndex) in the resulting Table. The default of None will store the index as a column, except for RangeIndex, which is stored as metadata only. Use preserve_index=True to force it to be stored as a column.
- Returns
pyarrow.Schema
Examples
>>> import pandas as pd
>>> import pyarrow as pa
>>> df = pd.DataFrame({
...     'int': [1, 2],
...     'str': ['a', 'b']
... })
>>> pa.Schema.from_pandas(df)
int: int64
str: string
__index_level_0__: int64
-
get_all_field_indices(self, name)
Return sorted list of indices for fields with the given name.
-
get_field_index(self, name)
Return the index of the field with the given unique name. Returns -1 if the name is not found or if more than one field has that name.
-
insert(self, int i, Field field)
Add a field at position i to the schema.
- Parameters
i (int) –
field (Field) –
- Returns
schema (Schema)
-
metadata
-
names
The schema’s field names.
- Returns
list of str
-
pandas_metadata
Return deserialized-from-JSON pandas metadata field (if it exists).
-
remove(self, int i)
Remove the field at index i from the schema.
- Parameters
i (int) –
- Returns
schema (Schema)
-
remove_metadata(self)
Create new schema without metadata, if any.
- Returns
schema (pyarrow.Schema)
-
serialize(self, memory_pool=None)
Write Schema to Buffer as encapsulated IPC message.
- Parameters
memory_pool (MemoryPool, default None) – Uses default memory pool if not specified
- Returns
serialized (Buffer)
-
set(self, int i, Field field)
Replace a field at position i in the schema.
- Parameters
i (int) –
field (Field) –
- Returns
schema (Schema)
-
to_string(self, truncate_metadata=True, show_field_metadata=True, show_schema_metadata=True)
Return human-readable representation of Schema.
- Parameters
truncate_metadata (boolean, default True) – Limit metadata key/value display to a single line of ~80 characters or less
show_field_metadata (boolean, default True) – Display Field-level KeyValueMetadata
show_schema_metadata (boolean, default True) – Display Schema-level KeyValueMetadata
- Returns
str (the formatted output)
-
types
The schema’s field types.
- Returns
list of DataType
-
with_metadata(self, metadata)
Add metadata as dict of string keys and values to Schema.
- Parameters
metadata (dict) – Keys and values must be string-like / coercible to bytes
- Returns
schema (pyarrow.Schema)