mabel.data.writers.batch_writer

CLASS: BatchWriter ()

The batch data writer writes data records into blobs. Batches are written into timestamped folders called Partitions.
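
As a rough illustration of what a timestamped partition could look like, the sketch below maps a dataset name and a date onto a folder path. The helper name `partition_path` and the `year_/month_/day_` layout are assumptions made for illustration only, not the scheme BatchWriter is confirmed to use; the code remains the authority on the real layout.

```python
import datetime
from typing import Optional, Union


def partition_path(
    dataset: str,
    date: Optional[Union[datetime.date, str]] = None,
    raw_path: bool = False,
) -> str:
    """Hypothetical helper: derive a dated folder path for a dataset."""
    if raw_path:
        # raw_path: use the dataset name as given, with no date parts added
        return dataset
    if date is None:
        # mirror the documented default of today's date
        date = datetime.date.today()
    elif isinstance(date, str):
        # accept an ISO 8601 string such as "2021-03-04"
        date = datetime.date.fromisoformat(date)
    # assumed folder layout, for illustration only
    return f"{dataset}/year_{date:%Y}/month_{date:%m}/day_{date:%d}"


print(partition_path("logs", "2021-03-04"))
print(partition_path("logs", raw_path=True))
```

Passing `raw_path=True` in this sketch returns the dataset name untouched, which matches the documented intent of not adding date parts automatically.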

dataset - string (optional)
The name of the dataset - this is used to map to a path
schema - mabel.validator.Schema (optional)
Schema used to test records for conformity, default is no schema and therefore no validation
format - string (optional)
- text: raw text lines
- jsonl: raw json lines
- flat: flattened json records in json lines
- lzma: lzma compressed json lines
- zstd: zstandard compressed json lines (default)
- parquet: Apache Parquet
date - date or string (optional)
A date, or a string representation of a date, to use for creating the dataset. The default is today's date
blob_size - integer (optional)
The maximum size of blobs, the default is 64MB
inner_writer - BaseWriter (optional)
The component used to commit data, the default writer is the NullWriter
frame_id - string (optional)

raw_path - boolean (optional)
Don't automatically add any date parts to dataset names

index_on - collection (optional)
Index on these columns, the default is to not index
metadata - dict (optional)
Data to write into the frame.complete file
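
To make the `blob_size` and `format` parameters more concrete, here is a minimal sketch of size-based rollover: records are serialized as JSON lines, and a new blob is started whenever adding a record would push the current one past the limit. `split_into_blobs` is a hypothetical helper and not part of mabel; the real writer may measure sizes or commit blobs differently.

```python
import json
import lzma
from typing import Dict, Iterable, List


def split_into_blobs(
    records: Iterable[Dict],
    blob_size: int = 64 * 1024 * 1024,  # assumed to mean 64MB in bytes
) -> List[bytes]:
    """Hypothetical helper: pack JSON lines into blobs capped at blob_size bytes.

    A single record larger than blob_size still gets a blob to itself.
    """
    blobs: List[bytes] = []
    current = bytearray()
    for record in records:
        line = json.dumps(record).encode("utf-8") + b"\n"
        # start a new blob if this record would push the current one over the limit
        if current and len(current) + len(line) > blob_size:
            blobs.append(bytes(current))
            current = bytearray()
        current.extend(line)
    if current:
        # flush whatever is left as the final blob
        blobs.append(bytes(current))
    return blobs


# a tiny limit so the rollover is visible with only ten records
blobs = split_into_blobs(({"id": i} for i in range(10)), blob_size=30)

# an lzma-style format would then compress each blob of JSON lines
compressed = [lzma.compress(blob) for blob in blobs]
print(len(blobs), [len(b) for b in blobs])
```

In this sketch the size check happens before compression, so the limit applies to the uncompressed JSON lines; whether BatchWriter measures before or after compression is not stated here.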

This file has been automatically generated; it is not the truth. If in doubt, the code will tell you unambiguously what it does.