T - the type of the data stored by the columns.
interface BatchedColumnReader<T> extends ColumnReader<T>
Processors that collect the values parsed from each column in a row and store the values of columns in batches.
Use implementations of this interface in favor of ColumnReader when processing large inputs to avoid running out of memory.
During the execution of the process, the batchProcessed(int) method will be invoked after a given number of rows has been processed.
The user can access the lists with values parsed for all columns using the methods ColumnReader.getColumnValuesAsList(), ColumnReader.getColumnValuesAsMapOfIndexes() and ColumnReader.getColumnValuesAsMapOfNames().
After batchProcessed(int) is invoked, all values will be discarded and the next batch of column values will be accumulated.
This process will repeat until there are no more rows in the input.
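The batching cycle described above can be illustrated with a small, self-contained sketch. It does not use the library: `BatchedColumnsSketch`, `rowProcessed`, `processEnded` and `BatchDemo` are hypothetical names introduced only for this illustration, and flushing a final partial batch at the end of the input is an assumption rather than documented behavior. Only `batchProcessed(int)`, `getRowsPerBatch()`, `getBatchesProcessed()` and `getColumnValuesAsList()` mirror method names from this page.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchDemo {

    /** Hypothetical stand-in for a batched column reader; not the library's implementation. */
    abstract static class BatchedColumnsSketch {
        private final int rowsPerBatch;
        private final List<List<String>> columns = new ArrayList<>();
        private int rowsInCurrentBatch = 0;
        private int batchesProcessed = 0;

        BatchedColumnsSketch(int rowsPerBatch) {
            if (rowsPerBatch <= 0) {
                throw new IllegalArgumentException("rowsPerBatch must be positive");
            }
            this.rowsPerBatch = rowsPerBatch;
        }

        int getRowsPerBatch() { return rowsPerBatch; }

        int getBatchesProcessed() { return batchesProcessed; }

        /** Live view of the current batch; copy it if the values must outlive the callback. */
        List<List<String>> getColumnValuesAsList() { return columns; }

        /** Hypothetical hook: called once per parsed row. */
        void rowProcessed(String[] row) {
            for (int i = 0; i < row.length; i++) {
                while (columns.size() <= i) {
                    columns.add(new ArrayList<>());
                }
                columns.get(i).add(row[i]);
            }
            rowsInCurrentBatch++;
            if (rowsInCurrentBatch == rowsPerBatch) {
                flush();
            }
        }

        /** Hypothetical hook: called at end of input. Flushing leftovers is an assumption. */
        void processEnded() {
            if (rowsInCurrentBatch > 0) {
                flush();
            }
        }

        private void flush() {
            batchProcessed(rowsInCurrentBatch);
            columns.clear();          // discard all values before the next batch accumulates
            rowsInCurrentBatch = 0;
            batchesProcessed++;
        }

        /** Callback to the user for each completed batch. */
        abstract void batchProcessed(int rowsInThisBatch);
    }

    /** Feeds five two-column rows through a batch size of 2 and records each batch's size. */
    static List<Integer> run() {
        List<Integer> batchSizes = new ArrayList<>();
        BatchedColumnsSketch reader = new BatchedColumnsSketch(2) {
            @Override
            void batchProcessed(int rowsInThisBatch) {
                // Every column list holds rowsInThisBatch values at this point.
                batchSizes.add(getColumnValuesAsList().get(0).size());
            }
        };
        String[][] rows = {{"1", "a"}, {"2", "b"}, {"3", "c"}, {"4", "d"}, {"5", "e"}};
        for (String[] row : rows) {
            reader.rowProcessed(row);
        }
        reader.processEnded();
        return batchSizes;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [2, 2, 1]
    }
}
```

Because the column lists are cleared after every callback, memory use in this sketch is bounded by the batch size rather than by the size of the input, which is the motivation given above for preferring a batched reader on large inputs.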
Modifier and Type | Method and Description |
---|---|
void | batchProcessed(int rowsInThisBatch): Callback to the user, where the lists with values parsed for all columns can be accessed using the methods ColumnReader.getColumnValuesAsList(), ColumnReader.getColumnValuesAsMapOfIndexes() and ColumnReader.getColumnValuesAsMapOfNames(). |
int | getBatchesProcessed(): Returns the number of batches already processed. |
int | getRowsPerBatch(): Returns the number of rows processed in each batch. |
Methods inherited from interface ColumnReader: getColumn, getColumn, getColumnValuesAsList, getColumnValuesAsMapOfIndexes, getColumnValuesAsMapOfNames, getHeaders, putColumnValuesInMapOfIndexes, putColumnValuesInMapOfNames
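int getRowsPerBatch()
Returns the number of rows processed in each batch.
int getBatchesProcessed()
Returns the number of batches already processed.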
void batchProcessed(int rowsInThisBatch)
Callback to the user, where the lists with values parsed for all columns can be accessed using the methods ColumnReader.getColumnValuesAsList(), ColumnReader.getColumnValuesAsMapOfIndexes() and ColumnReader.getColumnValuesAsMapOfNames().
rowsInThisBatch - the number of rows processed in the current batch. This corresponds to the number of elements in each column's list.