pyarrow.ipc.RecordBatchStreamWriter
-
class pyarrow.ipc.RecordBatchStreamWriter(sink, schema, *, use_legacy_format=None, options=None)
Bases: pyarrow.lib._RecordBatchStreamWriter
Writer for the Arrow streaming binary format.
- Parameters
sink (str, pyarrow.NativeFile, or file-like Python object) – Either a file path, or a writable file object.
schema (pyarrow.Schema) – The Arrow schema for data to be written to the file.
options (pyarrow.ipc.IpcWriteOptions) – Options for IPC serialization. If None, default values are used: the legacy format is not used unless the environment variable ARROW_PRE_0_15_IPC_FORMAT=1 is set, and the V5 metadata version is used unless the environment variable ARROW_PRE_1_0_METADATA_VERSION=1 is set.
use_legacy_format (bool, default None) – Deprecated in favor of setting options. Cannot be provided together with options. If None, False is used unless this default is overridden by setting the environment variable ARROW_PRE_0_15_IPC_FORMAT=1.
-
__init__(sink, schema, *, use_legacy_format=None, options=None)
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(sink, schema, *[, …]) – Initialize self.
close(self) – Close stream and write end-of-stream 0 marker.
write(self, table_or_batch) – Write RecordBatch or Table to stream.
write_batch(self, RecordBatch batch) – Write RecordBatch to stream.
write_table(self, Table table[, max_chunksize]) – Write Table to stream in (contiguous) RecordBatch objects.
Attributes
stats – Current IPC write statistics.
-
close(self)
Close stream and write end-of-stream 0 marker.
-
stats
Current IPC write statistics.
-
write(self, table_or_batch)
Write RecordBatch or Table to stream.
- Parameters
table_or_batch ({RecordBatch, Table}) – The record batch or table to write.
-
write_batch(self, RecordBatch batch)
Write RecordBatch to stream.
- Parameters
batch (RecordBatch) – The record batch to write.
-
write_table(self, Table table, max_chunksize=None, **kwargs)
Write Table to stream in (contiguous) RecordBatch objects.
- Parameters
table (Table) – The table to write.
max_chunksize (int, default None) – Maximum number of rows per RecordBatch chunk. Individual chunks may be smaller depending on the chunk layout of individual columns.