pyarrow.dataset.FileSystemFactoryOptions
class pyarrow.dataset.FileSystemFactoryOptions
Bases: pyarrow.lib._Weakrefable
Influences the discovery of filesystem paths.
Parameters
partition_base_dir (str, optional) – For the purposes of applying the partitioning, the partition_base_dir prefix is stripped from each path. Files whose paths do not start with partition_base_dir are skipped during partitioning discovery. These files are still part of the Dataset, but carry no partition information.
partitioning (Partitioning/PartitioningFactory, optional) – Apply the Partitioning to every discovered Fragment. See the Partitioning or PartitioningFactory documentation.
exclude_invalid_files (bool, optional (default True)) – If True, invalid files are excluded (the check is file format specific). This incurs IO for each file in a serial, single-threaded fashion. Disabling this feature skips the IO, but unsupported files may then be present in the Dataset (resulting in an error at scan time).
selector_ignore_prefixes (list, optional) – When discovering from a Selector (and not from an explicit file list), ignore files and directories matching any of these prefixes. By default this is ['.', '_'].
__init__(*args, **kwargs)
Initialize self. See help(type(self)) for accurate signature.
Methods
__init__(*args, **kwargs) – Initialize self.
Attributes
exclude_invalid_files – Whether to exclude invalid files.
partition_base_dir – Base directory to strip paths before applying the partitioning.
partitioning – Partitioning to apply to discovered files.
partitioning_factory – PartitioningFactory to apply to discovered files and discover a Partitioning.
selector_ignore_prefixes – List of prefixes.
exclude_invalid_files
Whether to exclude invalid files.
partition_base_dir
Base directory to strip paths before applying the partitioning.
partitioning
Partitioning to apply to discovered files.
NOTE: setting this property will overwrite partitioning_factory.
partitioning_factory
PartitioningFactory to apply to discovered files and discover a Partitioning.
NOTE: setting this property will overwrite partitioning.
selector_ignore_prefixes
List of prefixes. Files matching one of these prefixes are ignored by the discovery process.