Data Types
- 
enum arrow::Type::type¶
- Main data type enumeration. - This enumeration provides a quick way to interrogate the category of a DataType instance. - Values: - 
enumerator NA¶
- A NULL type having no physical storage. 
 - 
enumerator BOOL¶
- Boolean as 1 bit, LSB bit-packed ordering. 
 - 
enumerator UINT8¶
- Unsigned 8-bit little-endian integer. 
 - 
enumerator INT8¶
- Signed 8-bit little-endian integer. 
 - 
enumerator UINT16¶
- Unsigned 16-bit little-endian integer. 
 - 
enumerator INT16¶
- Signed 16-bit little-endian integer. 
 - 
enumerator UINT32¶
- Unsigned 32-bit little-endian integer. 
 - 
enumerator INT32¶
- Signed 32-bit little-endian integer. 
 - 
enumerator UINT64¶
- Unsigned 64-bit little-endian integer. 
 - 
enumerator INT64¶
- Signed 64-bit little-endian integer. 
 - 
enumerator HALF_FLOAT¶
- 2-byte floating point value 
 - 
enumerator FLOAT¶
- 4-byte floating point value 
 - 
enumerator DOUBLE¶
- 8-byte floating point value 
 - 
enumerator STRING¶
- UTF8 variable-length string as List<Char> 
 - 
enumerator BINARY¶
- Variable-length bytes (no guarantee of UTF8-ness) 
 - 
enumerator FIXED_SIZE_BINARY¶
- Fixed-size binary. Each value occupies the same number of bytes. 
 - 
enumerator DATE32¶
- int32_t days since the UNIX epoch 
 - 
enumerator DATE64¶
- int64_t milliseconds since the UNIX epoch 
 - 
enumerator TIMESTAMP¶
- Exact timestamp encoded as an int64 since the UNIX epoch. The default unit is millisecond. 
 - 
enumerator TIME32¶
- Time as signed 32-bit integer, representing either seconds or milliseconds since midnight. 
 - 
enumerator TIME64¶
- Time as signed 64-bit integer, representing either microseconds or nanoseconds since midnight. 
 - 
enumerator INTERVAL_MONTHS¶
- YEAR_MONTH interval in SQL style. 
 - 
enumerator INTERVAL_DAY_TIME¶
- DAY_TIME interval in SQL style. 
 - 
enumerator DECIMAL128¶
- Precision- and scale-based decimal type with 128 bits. 
 - 
enumerator DECIMAL¶
- Defined for backward-compatibility. 
 - 
enumerator DECIMAL256¶
- Precision- and scale-based decimal type with 256 bits. 
 - 
enumerator LIST¶
- A list of some logical data type. 
 - 
enumerator STRUCT¶
- Struct of logical types. 
 - 
enumerator SPARSE_UNION¶
- Sparse unions of logical types. 
 - 
enumerator DENSE_UNION¶
- Dense unions of logical types. 
 - 
enumerator DICTIONARY¶
- Dictionary-encoded type, also called “categorical” or “factor” in other programming languages. - Holds the dictionary value type but not the dictionary itself, which is part of the ArrayData struct. 
 - 
enumerator MAP¶
- Map, a repeated struct logical type. 
 - 
enumerator EXTENSION¶
- Custom data type, implemented by user. 
 - 
enumerator FIXED_SIZE_LIST¶
- Fixed size list of some logical type. 
 - 
enumerator DURATION¶
- Measure of elapsed time in either seconds, milliseconds, microseconds or nanoseconds. 
 - 
enumerator LARGE_STRING¶
- Like STRING, but with 64-bit offsets. 
 - 
enumerator LARGE_BINARY¶
- Like BINARY, but with 64-bit offsets. 
 - 
enumerator LARGE_LIST¶
- Like LIST, but with 64-bit offsets. 
 - 
enumerator MAX_ID¶
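
The BOOL entry above describes values stored as 1 bit each in LSB bit-packed order. As a rough illustration of what that ordering means, here is a minimal standalone sketch; `PackBits` and `GetBit` are hypothetical helper names invented for this example and are not part of the Arrow API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack booleans into bytes, least-significant bit first:
// value i lives in byte i / 8, at bit position i % 8.
// (Hypothetical helper for illustration only.)
std::vector<uint8_t> PackBits(const std::vector<bool>& values) {
  std::vector<uint8_t> packed((values.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (values[i]) {
      packed[i / 8] |= static_cast<uint8_t>(1u << (i % 8));
    }
  }
  return packed;
}

// Read value i back from a packed buffer.
bool GetBit(const std::vector<uint8_t>& packed, std::size_t i) {
  assert(i / 8 < packed.size());
  return ((packed[i / 8] >> (i % 8)) & 1u) != 0;
}
```

With this ordering, nine values occupy two bytes, and the ninth value lands in the lowest bit of the second byte.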
 
- 
class arrow::DataType: public arrow::detail::Fingerprintable¶
- Base class for all data types. - Data types in this library are all logical. They can be expressed as either a primitive physical type (bytes or bits of some fixed size), a nested type consisting of other data types, or another data type (e.g. a timestamp encoded as an int64). - Simple datatypes may be entirely described by their Type::type id, but complex datatypes are usually parametric. - Subclassed by arrow::BaseBinaryType, arrow::ExtensionType, arrow::FixedWidthType, arrow::NestedType, arrow::NullType - Public Functions - 
bool Equals(const DataType &other, bool check_metadata = false) const¶
- Return whether the types are equal. - Types that are logically convertible from one to another (e.g. List<UInt8> and Binary) are NOT equal. 
 - Return whether the types are equal. 
 - 
const std::vector<std::shared_ptr<Field>> &fields() const¶
- Returns the children fields associated with this type. 
 - 
int num_fields() const¶
- Returns the number of children fields associated with this type. 
 - 
std::string ToString() const = 0¶
- A string representation of the type, including any children. 
 - 
size_t Hash() const¶
- Return hash value (excluding metadata in child fields) 
 - 
std::string name() const = 0¶
- A string name of the type, omitting any child fields. - Note
- Experimental API 
- Since
- 0.7.0 
 
 - 
DataTypeLayout layout() const = 0¶
- Return the data type layout. - Children are not included. - Note
- Experimental API 
 
 
Factory functions
These functions are recommended for creating data types. They may return new objects or existing singletons, depending on the type requested.
- 
std::shared_ptr<DataType> boolean()¶
- Return a BooleanType instance. 
- 
std::shared_ptr<DataType> uint16()¶
- Return a UInt16Type instance. 
- 
std::shared_ptr<DataType> uint32()¶
- Return a UInt32Type instance. 
- 
std::shared_ptr<DataType> uint64()¶
- Return a UInt64Type instance. 
- 
std::shared_ptr<DataType> float16()¶
- Return a HalfFloatType instance. 
- 
std::shared_ptr<DataType> float64()¶
- Return a DoubleType instance. 
- 
std::shared_ptr<DataType> utf8()¶
- Return a StringType instance. 
- 
std::shared_ptr<DataType> large_utf8()¶
- Return a LargeStringType instance. 
- 
std::shared_ptr<DataType> binary()¶
- Return a BinaryType instance. 
- 
std::shared_ptr<DataType> large_binary()¶
- Return a LargeBinaryType instance. 
- 
std::shared_ptr<DataType> date32()¶
- Return a Date32Type instance. 
- 
std::shared_ptr<DataType> date64()¶
- Return a Date64Type instance. 
- 
std::shared_ptr<DataType> fixed_size_binary(int32_t byte_width)¶
- Create a FixedSizeBinaryType instance. 
- 
std::shared_ptr<DataType> decimal(int32_t precision, int32_t scale)¶
- Create a Decimal128Type or Decimal256Type instance depending on the precision. 
- 
std::shared_ptr<DataType> decimal128(int32_t precision, int32_t scale)¶
- Create a Decimal128Type instance. 
- 
std::shared_ptr<DataType> decimal256(int32_t precision, int32_t scale)¶
- Create a Decimal256Type instance. 
- Create a LargeListType instance from its child Field type. 
- Create a LargeListType instance from its child DataType. 
- Create a MapType instance from its key and value DataTypes. 
- Create a MapType instance from its key DataType and value field. - The field override is provided to communicate nullability of the value. 
- Create a FixedSizeListType instance from its child Field type. 
- Create a FixedSizeListType instance from its child DataType. 
- 
std::shared_ptr<DataType> duration(TimeUnit::type unit)¶
- Return a DurationType instance (the naming uses _type to avoid a namespace conflict with built-in time classes). 
- 
std::shared_ptr<DataType> day_time_interval()¶
- Return a DayTimeIntervalType instance. 
- 
std::shared_ptr<DataType> month_interval()¶
- Return a MonthIntervalType instance. 
- 
std::shared_ptr<DataType> timestamp(TimeUnit::type unit)¶
- Create a TimestampType instance from its unit. 
- 
std::shared_ptr<DataType> timestamp(TimeUnit::type unit, const std::string &timezone)¶
- Create a TimestampType instance from its unit and timezone. 
- 
std::shared_ptr<DataType> time32(TimeUnit::type unit)¶
- Create a 32-bit time type instance. - Unit can be either SECOND or MILLI 
- 
std::shared_ptr<DataType> time64(TimeUnit::type unit)¶
- Create a 64-bit time type instance. - Unit can be either MICRO or NANO 
- Create a StructType instance. 
- 
std::shared_ptr<DataType> sparse_union(FieldVector child_fields, std::vector<int8_t> type_codes = {})¶
- Create a SparseUnionType instance. 
- 
std::shared_ptr<DataType> dense_union(FieldVector child_fields, std::vector<int8_t> type_codes = {})¶
- Create a DenseUnionType instance. 
- 
std::shared_ptr<DataType> sparse_union(const ArrayVector &children, std::vector<std::string> field_names = {}, std::vector<int8_t> type_codes = {})¶
- Create a SparseUnionType instance. 
- 
std::shared_ptr<DataType> dense_union(const ArrayVector &children, std::vector<std::string> field_names = {}, std::vector<int8_t> type_codes = {})¶
- Create a DenseUnionType instance. 
- Create a UnionType instance. 
- Create a UnionType instance. 
- Create a UnionType instance. 
- Create a UnionType instance. 
- Create a UnionType instance. 
- Create a DictionaryType instance. - Parameters
- [in] index_type: the type of the dictionary indices (must be a signed integer)
- [in] dict_type: the type of the values in the variable dictionary
- [in] ordered: true if the order of the dictionary values has semantic meaning and should be preserved where possible
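
The decimal() factory above is described as choosing between Decimal128Type and Decimal256Type depending on the precision. The sketch below illustrates that kind of selection without using Arrow itself. It assumes the usual limits of 38 decimal digits for 128-bit and 76 digits for 256-bit decimals; `DecimalBitWidth` is a hypothetical name, and returning 0 for an out-of-range precision is a convention of this example, not the library's error handling.

```cpp
#include <cassert>
#include <cstdint>

// Assumed maximum number of decimal digits at each width.
constexpr int32_t kMaxPrecision128 = 38;
constexpr int32_t kMaxPrecision256 = 76;

// Return the narrowest bit width (128 or 256) able to hold `precision`
// digits, or 0 if the precision is out of range.
// (Hypothetical helper for illustration only.)
int DecimalBitWidth(int32_t precision) {
  if (precision < 1 || precision > kMaxPrecision256) {
    return 0;
  }
  return precision <= kMaxPrecision128 ? 128 : 256;
}
```

When the width matters to the caller, the explicit decimal128() and decimal256() factories listed above avoid depending on this selection.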
 
 
Concrete type subclasses
Primitive
- 
class arrow::NullType: public arrow::DataType¶
- Concrete type class for always-null data. - Public Functions - 
std::string ToString() const override¶
- A string representation of the type, including any children. 
 - 
DataTypeLayout layout() const override¶
- Return the data type layout. - Children are not included. - Note
- Experimental API 
 
 - 
std::string name() const override¶
- A string name of the type, omitting any child fields. - Note
- Experimental API 
- Since
- 0.7.0 
 
 
- 
class arrow::BooleanType: public arrow::detail::CTypeImpl<BooleanType, PrimitiveCType, Type::BOOL, bool>¶
- Concrete type class for boolean data. - Public Functions - 
DataTypeLayout layout() const override¶
- Return the data type layout. - Children are not included. - Note
- Experimental API 
 
 
- 
class Int8Type: public arrow::detail::IntegerTypeImpl<Int8Type, Type::INT8, int8_t>¶
- Concrete type class for signed 8-bit integer data. 
- 
class Int16Type: public arrow::detail::IntegerTypeImpl<Int16Type, Type::INT16, int16_t>¶
- Concrete type class for signed 16-bit integer data. 
- 
class Int32Type: public arrow::detail::IntegerTypeImpl<Int32Type, Type::INT32, int32_t>¶
- Concrete type class for signed 32-bit integer data. 
- 
class Int64Type: public arrow::detail::IntegerTypeImpl<Int64Type, Type::INT64, int64_t>¶
- Concrete type class for signed 64-bit integer data. 
- 
class UInt8Type: public arrow::detail::IntegerTypeImpl<UInt8Type, Type::UINT8, uint8_t>¶
- Concrete type class for unsigned 8-bit integer data. 
- 
class UInt16Type: public arrow::detail::IntegerTypeImpl<UInt16Type, Type::UINT16, uint16_t>¶
- Concrete type class for unsigned 16-bit integer data. 
- 
class UInt32Type: public arrow::detail::IntegerTypeImpl<UInt32Type, Type::UINT32, uint32_t>¶
- Concrete type class for unsigned 32-bit integer data. 
- 
class UInt64Type: public arrow::detail::IntegerTypeImpl<UInt64Type, Type::UINT64, uint64_t>¶
- Concrete type class for unsigned 64-bit integer data. 
- 
class HalfFloatType: public arrow::detail::CTypeImpl<HalfFloatType, FloatingPointType, Type::HALF_FLOAT, uint16_t>¶
- Concrete type class for 16-bit floating-point data. 
- 
class FloatType: public arrow::detail::CTypeImpl<FloatType, FloatingPointType, Type::FLOAT, float>¶
- Concrete type class for 32-bit floating-point data (C “float”) 
- 
class DoubleType: public arrow::detail::CTypeImpl<DoubleType, FloatingPointType, Type::DOUBLE, double>¶
- Concrete type class for 64-bit floating-point data (C “double”) 
Binary-like
- 
class arrow::BinaryType: public arrow::BaseBinaryType¶
- Concrete type class for variable-size binary data. - Subclassed by arrow::StringType - Public Functions - 
DataTypeLayout layout() const override¶
- Return the data type layout. - Children are not included. - Note
- Experimental API 
 
 - 
std::string ToString() const override¶
- A string representation of the type, including any children. 
 - 
std::string name() const override¶
- A string name of the type, omitting any child fields. - Note
- Experimental API 
- Since
- 0.7.0 
 
 
- 
class arrow::StringType: public arrow::BinaryType¶
- Concrete type class for variable-size string data, utf8-encoded. 
- 
class arrow::FixedSizeBinaryType: public arrow::FixedWidthType, public arrow::ParametricType¶
- Concrete type class for fixed-size binary data. - Subclassed by arrow::DecimalType - Public Functions - 
std::string ToString() const override¶
- A string representation of the type, including any children. 
 - 
std::string name() const override¶
- A string name of the type, omitting any child fields. - Note
- Experimental API 
- Since
- 0.7.0 
 
 - 
DataTypeLayout layout() const override¶
- Return the data type layout. - Children are not included. - Note
- Experimental API 
 
 
- 
class arrow::Decimal128Type: public arrow::DecimalType¶
- Concrete type class for 128-bit decimal data. - Public Functions - 
Decimal128Type(int32_t precision, int32_t scale)¶
- Decimal128Type constructor that aborts on invalid input. 
 - 
std::string ToString() const override¶
- A string representation of the type, including any children. 
 - 
std::string name() const override¶
- A string name of the type, omitting any child fields. - Note
- Experimental API 
- Since
- 0.7.0 
 
 - Public Static Functions - 
Result<std::shared_ptr<DataType>> Make(int32_t precision, int32_t scale)¶
- Decimal128Type constructor that returns an error on invalid input. 
 
Nested
- 
class arrow::ListType: public arrow::BaseListType¶
- Concrete type class for list data. - List data is nested data where each value is a variable number of child items. Lists can be recursively nested, for example list(list(int32)). - Subclassed by arrow::MapType - Public Functions - 
DataTypeLayout layout() const override¶
- Return the data type layout. - Children are not included. - Note
- Experimental API 
 
 - 
std::string ToString() const override¶
- A string representation of the type, including any children. 
 - 
std::string name() const override¶
- A string name of the type, omitting any child fields. - Note
- Experimental API 
- Since
- 0.7.0 
 
 
- 
class arrow::MapType: public arrow::ListType¶
- Concrete type class for map data. - Map data is nested data where each value is a variable number of key-item pairs. Maps can be recursively nested, for example map(utf8, map(utf8, int32)). 
- 
class arrow::StructType: public arrow::NestedType¶
- Concrete type class for struct data. - Public Functions - 
DataTypeLayout layout() const override¶
- Return the data type layout. - Children are not included. - Note
- Experimental API 
 
 - 
std::string ToString() const override¶
- A string representation of the type, including any children. 
 - 
std::string name() const override¶
- A string name of the type, omitting any child fields. - Note
- Experimental API 
- Since
- 0.7.0 
 
 - 
std::shared_ptr<Field> GetFieldByName(const std::string &name) const¶
- Returns null if name not found. 
 - 
std::vector<std::shared_ptr<Field>> GetAllFieldsByName(const std::string &name) const¶
- Return all fields having this name. 
 - 
int GetFieldIndex(const std::string &name) const¶
- Returns -1 if name not found or if there are multiple fields having the same name. 
 - 
std::vector<int> GetAllFieldIndices(const std::string &name) const¶
- Return the indices of all fields having this name in sorted order. 
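
The name-lookup functions above differ in how they treat duplicate field names: GetFieldIndex returns -1 both when the name is missing and when it is ambiguous, while GetAllFieldIndices returns every match in sorted order. The following standalone sketch mimics those documented semantics over a plain list of names. `AllFieldIndices` and `FieldIndex` are hypothetical stand-ins, not Arrow functions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Indices of every entry equal to `name`, in ascending order.
// (Hypothetical stand-in for the documented GetAllFieldIndices behavior.)
std::vector<int> AllFieldIndices(const std::vector<std::string>& names,
                                 const std::string& name) {
  std::vector<int> out;
  for (int i = 0; i < static_cast<int>(names.size()); ++i) {
    if (names[i] == name) {
      out.push_back(i);
    }
  }
  return out;
}

// The unique index of `name`, or -1 when it is absent or ambiguous.
// (Hypothetical stand-in for the documented GetFieldIndex behavior.)
int FieldIndex(const std::vector<std::string>& names, const std::string& name) {
  const std::vector<int> all = AllFieldIndices(names, name);
  return all.size() == 1 ? all[0] : -1;
}
```

Because -1 covers two distinct cases, callers that need to tell "missing" from "duplicated" would check the size of the all-indices result instead.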
 
- 
class arrow::UnionType: public arrow::NestedType¶
- Concrete type class for union data. - Subclassed by arrow::DenseUnionType, arrow::SparseUnionType - Public Functions - 
DataTypeLayout layout() const override¶
- Return the data type layout. - Children are not included. - Note
- Experimental API 
 
 - 
std::string ToString() const override¶
- A string representation of the type, including any children. 
 - 
const std::vector<int8_t> &type_codes() const¶
- The array of logical type ids. - For example, the first type in the union might be denoted by the id 5 (instead of 0). 
 - 
const std::vector<int> &child_ids() const¶
- An array mapping logical type ids to physical child ids. 
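
To make the relationship between type_codes() and child_ids() more concrete, the sketch below builds a lookup table from logical type ids to physical child positions. It is an illustration under stated assumptions, not the library's implementation: the table size of 128 entries and the -1 sentinel for unused codes are assumptions of this example, and `BuildChildIds` is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumptions for this sketch: type codes are non-negative int8_t values,
// and unused codes map to -1.
constexpr int kMaxTypeCode = 127;
constexpr int kInvalidChildId = -1;

// Build a table so that table[type_code] gives the position of the
// corresponding child, or kInvalidChildId if the code is unused.
// (Hypothetical helper for illustration only.)
std::vector<int> BuildChildIds(const std::vector<int8_t>& type_codes) {
  std::vector<int> child_ids(kMaxTypeCode + 1, kInvalidChildId);
  for (std::size_t child = 0; child < type_codes.size(); ++child) {
    assert(type_codes[child] >= 0);
    child_ids[static_cast<std::size_t>(type_codes[child])] =
        static_cast<int>(child);
  }
  return child_ids;
}
```

For a union whose type codes are {5, 9}, the first child is reached through logical id 5 rather than 0, which matches the example given for type_codes() above.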
 
Dictionary-encoded
- 
class arrow::DictionaryType: public arrow::FixedWidthType¶
- Dictionary-encoded value type with data-dependent dictionary. - Indices may be represented by any integer type. - Public Functions - 
std::string ToString() const override¶
- A string representation of the type, including any children. 
 - 
std::string name() const override¶
- A string name of the type, omitting any child fields. - Note
- Experimental API 
- Since
- 0.7.0 
 
 - 
DataTypeLayout layout() const override¶
- Return the data type layout. - Children are not included. - Note
- Experimental API 
 
 
Fields and Schemas
- Create a Field instance. - Parameters
- name: the field name
- type: the field value type
- nullable: whether the values are nullable, default true
- metadata: any custom key-value metadata, default null
 
 
- Create a Schema instance. - Return
- schema shared_ptr to Schema 
- Parameters
- fields: the schema’s fields
- metadata: any custom key-value metadata, default null
 
 
- 
class arrow::Field: public arrow::detail::Fingerprintable¶
- The combination of a field name and data type, with optional metadata. - Fields are used to describe the individual constituents of a nested DataType or a Schema. - A field’s metadata is represented by a KeyValueMetadata instance, which holds arbitrary key-value pairs. - Public Functions - 
std::shared_ptr<const KeyValueMetadata> metadata() const¶
- Return the field’s attached metadata. 
 - 
bool HasMetadata() const¶
- Return whether the field has non-empty metadata. 
 - Return a copy of this field with the given metadata attached to it. 
 - EXPERIMENTAL: Return a copy of this field with the given metadata merged with existing metadata (any colliding keys will be overridden by the passed metadata) 
 - 
std::shared_ptr<Field> RemoveMetadata() const¶
- Return a copy of this field without any metadata attached to it. 
 - Return a copy of this field with the replaced type. 
 - 
std::shared_ptr<Field> WithName(const std::string &name) const¶
- Return a copy of this field with the replaced name. 
 - 
std::shared_ptr<Field> WithNullable(bool nullable) const¶
- Return a copy of this field with the replaced nullability. 
 - 
Result<std::shared_ptr<Field>> MergeWith(const Field &other, MergeOptions options = MergeOptions::Defaults()) const¶
- Merge the current field with a field of the same name. - The two fields must be compatible, i.e. they must: 
- have the same name 
- have the same type, or compatible types according to options. 
 - The metadata of the current field is preserved; the metadata of the other field is discarded. 
 - 
bool Equals(const Field &other, bool check_metadata = false) const¶
- Indicate whether the fields are equal. - Return
- true if fields are equal, false otherwise. 
- Parameters
- [in] other: field to check equality with.
- [in] check_metadata: controls if it should check for metadata equality.
 
 
 - 
bool IsCompatibleWith(const Field &other) const¶
- Indicate whether the fields are compatible. - See the criteria of MergeWith. - Return
- true if fields are compatible, false otherwise. 
 
 - 
std::string ToString(bool show_metadata = false) const¶
- Return a string representation of the field. - Parameters
- [in] show_metadata: when true, if KeyValueMetadata is non-empty, print keys and values in the output
 
 
 - 
const std::string &name() const¶
- Return the field name. 
 - 
bool nullable() const¶
- Return whether the field is nullable. 
 - 
struct MergeOptions¶
- Options that control the behavior of MergeWith. - Options are to be added to allow type conversions, including integer widening, promotion from integer to float, or conversion to or from boolean. 
 
- 
class arrow::Schema: public arrow::detail::Fingerprintable, public arrow::util::EqualityComparable<Schema>, public arrow::util::ToStringOstreamable<Schema>¶
- Sequence of arrow::Field objects describing the columns of a record batch or table data structure. - Public Functions - 
bool Equals(const Schema &other, bool check_metadata = false) const¶
- Returns true if all of the schema fields are equal. 
 - 
int num_fields() const¶
- Return the number of fields (columns) in the schema. 
 - 
const std::shared_ptr<Field> &field(int i) const¶
- Return the ith schema element. Does not perform bounds checking. 
 - 
std::shared_ptr<Field> GetFieldByName(const std::string &name) const¶
- Returns null if name not found. 
 - 
std::vector<std::shared_ptr<Field>> GetAllFieldsByName(const std::string &name) const¶
- Return all fields having this name. 
 - 
int GetFieldIndex(const std::string &name) const¶
- Returns -1 if name not found. 
 - 
std::vector<int> GetAllFieldIndices(const std::string &name) const¶
- Return the indices of all fields having this name. 
 - 
Status CanReferenceFieldsByNames(const std::vector<std::string> &names) const¶
- Indicate whether the fields named in names can be found unambiguously in the schema.
 - 
const std::shared_ptr<const KeyValueMetadata> &metadata() const¶
- The custom key-value metadata, if any. - Return
- metadata may be null 
 
 - 
std::string ToString(bool show_metadata = false) const¶
- Render a string representation of the schema suitable for debugging. - Parameters
- [in] show_metadata: when true, if KeyValueMetadata is non-empty, print keys and values in the output
 
 
 - Replace key-value metadata with new metadata. - Return
- new Schema 
- Parameters
- [in] metadata: new KeyValueMetadata
 
 
 
