pyarrow.csv.ReadOptions

class pyarrow.csv.ReadOptions(use_threads=None, *, block_size=None, skip_rows=None, column_names=None, autogenerate_column_names=None, encoding='utf8')

Bases: pyarrow.lib._Weakrefable

Options for reading CSV files.

Parameters

use_threads (bool, optional (default True)) – Whether to use multiple threads to accelerate reading.
block_size (int, optional) – How many bytes to process at a time from the input stream. This determines the multi-threading granularity as well as the size of individual chunks in the Table.
skip_rows (int, optional (default 0)) – The number of rows to skip before the column names (if any) and the CSV data.
column_names (list, optional) – The column names of the target table. If empty, fall back on autogenerate_column_names.
autogenerate_column_names (bool, optional (default False)) – Whether to autogenerate column names if column_names is empty. If true, column names will be of the form “f0”, “f1”… If false, column names will be read from the first CSV row after skip_rows.
encoding (str, optional (default 'utf8')) – The character encoding of the CSV data. Columns that cannot be decoded using this encoding can still be read as Binary.

__init__(*args, **kwargs): Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(*args, **kwargs)

Initialize self.

Attributes

`autogenerate_column_names`	Whether to autogenerate column names if column_names is empty.
`block_size`	How many bytes to process at a time from the input stream.
`column_names`	The column names of the target table.
`encoding`	The character encoding of the CSV data.
`skip_rows`	The number of rows to skip before the column names (if any) and the CSV data.
`use_threads`	Whether to use multiple threads to accelerate reading.

autogenerate_column_names: Whether to autogenerate column names if column_names is empty. If true, column names will be of the form “f0”, “f1”… If false, column names will be read from the first CSV row after skip_rows.

block_size: How many bytes to process at a time from the input stream. This determines the multi-threading granularity as well as the size of individual chunks in the Table.

column_names: The column names of the target table. If empty, fall back on autogenerate_column_names.

encoding: The character encoding of the CSV data. Columns that cannot be decoded using this encoding can still be read as Binary.

skip_rows: The number of rows to skip before the column names (if any) and the CSV data.

use_threads: Whether to use multiple threads to accelerate reading.
