
API Reference

  • Data Types and Schemas
    • Factory Functions
    • Type Classes
    • Type Checking
  • Arrays and Scalars
    • Factory Functions
    • Array Types
    • Scalars
  • Buffers and Memory
    • In-Memory Buffers
    • Memory Pools
  • Compute Functions
    • Aggregations
    • Arithmetic Functions
    • Comparisons
    • Logical Functions
    • String Predicates
    • String Transforms
    • Containment tests
    • Conversions
    • Selections
    • Associative transforms
    • Sorts and partitions
    • Structural Transforms
  • Streams and File Access
    • Factory Functions
    • Stream Classes
    • File Systems
  • Tables and Tensors
    • Factory Functions
    • Classes
    • Tensors
  • Serialization and IPC
    • Inter-Process Communication
    • Serialization
  • Arrow Flight
    • Common Types
    • Flight Client
    • Flight Server
    • Authentication
    • Middleware
  • Tabular File Formats
    • CSV Files
    • Feather Files
    • JSON Files
    • Parquet Files
    • ORC Files
  • Filesystems
    • Interface
    • Concrete Subclasses
  • Dataset
    • Factory functions
    • Classes
  • Plasma In-Memory Object Store
    • Classes
  • CUDA Integration
    • CUDA Contexts
    • CUDA Buffers
    • Serialization and IPC
  • Miscellaneous
    • Multi-Threading
    • Using with C extensions

© Copyright 2016-2019 Apache Software Foundation.