Streams and File Access

Factory Functions

These factory functions are the recommended way to create an Arrow stream. They accept various kinds of sources, such as in-memory buffers or on-disk files.

input_stream(source[, compression, buffer_size])

Create an Arrow input stream.

output_stream(source[, compression, buffer_size])

Create an Arrow output stream.

memory_map(path[, mode])

Open a memory map of the file at the given path.

create_memory_map(path, size)

Create a file of the given size and memory-map it.

Stream Classes

NativeFile

The base class for all Arrow streams.

OSFile

A stream backed by a regular file descriptor.

PythonFile

A stream backed by a Python file object.

BufferReader

A zero-copy reader for objects convertible to an Arrow buffer.

BufferOutputStream

An output stream that writes to a resizable buffer.

FixedSizeBufferWriter

A stream writing to an Arrow buffer.

MemoryMappedFile

A stream that represents a memory-mapped file.

CompressedInputStream(NativeFile stream, …)

An input stream wrapper which decompresses data on the fly.

CompressedOutputStream(NativeFile stream, …)

An output stream wrapper which compresses data on the fly.

File Systems

hdfs.connect([host, port, user, …])

Connect to an HDFS cluster.

LocalFileSystem()

class pyarrow.HadoopFileSystem