public class RecordReaderFileBatchLoader extends Object implements DataSetLoader
A DataSetLoader for FileBatch objects. The constructor arguments mirror those of RecordReaderDataSetIterator,
which it uses internally.
Can be used in the context of Spark; see the SparkDataUtils methods for this purpose.

| Constructor | Description |
|---|---|
| RecordReaderFileBatchLoader(RecordReader recordReader, int batchSize, int labelIndex, int numClasses) | Main constructor for classification. |
| RecordReaderFileBatchLoader(RecordReader recordReader, int batchSize, int labelIndexFrom, int labelIndexTo, boolean regression) | Main constructor for multi-label regression (i.e., regression with multiple outputs). |
| RecordReaderFileBatchLoader(RecordReader recordReader, int batchSize, int labelIndexFrom, int labelIndexTo, int numPossibleLabels, boolean regression, DataSetPreProcessor preProcessor) | Main constructor. |
public RecordReaderFileBatchLoader(RecordReader recordReader, int batchSize, int labelIndex, int numClasses)
recordReader - RecordReader: provides the source of the data
batchSize - Batch size (number of examples) for the output DataSet objects
labelIndex - Index of the label Writable (usually an IntWritable), as obtained by recordReader.next()
numClasses - Number of classes (possible labels) for classification

public RecordReaderFileBatchLoader(RecordReader recordReader, int batchSize, int labelIndexFrom, int labelIndexTo, boolean regression)
recordReader - RecordReader to get data from
labelIndexFrom - Index of the first regression target
labelIndexTo - Index of the last regression target, inclusive
batchSize - Minibatch size
regression - Must be true. Included mainly to avoid a signature clash with previously defined constructors.

public RecordReaderFileBatchLoader(RecordReader recordReader, int batchSize, int labelIndexFrom, int labelIndexTo, int numPossibleLabels, boolean regression, DataSetPreProcessor preProcessor)
recordReader - The RecordReader to use
batchSize - Minibatch size: the number of examples returned for each call of .next()
labelIndexFrom - The index of the label (for classification), or the first index of the labels for multi-output regression
labelIndexTo - Only used if regression == true. The last index (inclusive) of the labels for multi-output regression
numPossibleLabels - The number of possible labels for classification. Not used if regression == true
regression - If true: regression. If false: classification (labelIndexFrom is assumed to hold the class index)
preProcessor - Optional DataSetPreProcessor. May be null.

public DataSet load(Source source) throws IOException
Specified by: load in interface Loader<DataSet>
Throws: IOException
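
The label parameters described for the constructors above (labelIndex and numClasses for classification, labelIndexFrom and labelIndexTo for regression) can be illustrated with a small plain-Java sketch. This is not the library's implementation and uses no DL4J classes: the class LabelSplitSketch and its methods are hypothetical names introduced only for illustration, and a record is assumed to be a flat array of numeric columns. It shows the documented semantics: the label columns are removed from the features, a classification label becomes a one-hot vector of length numClasses, and a regression label range is inclusive at both ends.

```java
import java.util.Arrays;

// Hypothetical illustration only; not part of DL4J.
public class LabelSplitSketch {

    /** Features: every column outside [labelFrom, labelTo], both ends inclusive. */
    public static double[] features(double[] record, int labelFrom, int labelTo) {
        double[] out = new double[record.length - (labelTo - labelFrom + 1)];
        int j = 0;
        for (int i = 0; i < record.length; i++) {
            if (i < labelFrom || i > labelTo) {
                out[j++] = record[i];
            }
        }
        return out;
    }

    /** Regression labels: columns labelFrom..labelTo, inclusive. */
    public static double[] regressionLabels(double[] record, int labelFrom, int labelTo) {
        return Arrays.copyOfRange(record, labelFrom, labelTo + 1);
    }

    /** Classification label: the class index at labelIndex as a one-hot vector. */
    public static double[] oneHotLabel(double[] record, int labelIndex, int numClasses) {
        int cls = (int) record[labelIndex];
        if (cls < 0 || cls >= numClasses) {
            throw new IllegalArgumentException(
                    "Class index " + cls + " is outside [0, " + numClasses + ")");
        }
        double[] out = new double[numClasses];
        out[cls] = 1.0;
        return out;
    }

    public static void main(String[] args) {
        // Four feature columns followed by a class index in column 4.
        double[] record = {5.1, 3.5, 1.4, 0.2, 2};

        // Classification with labelIndex = 4 and numClasses = 3.
        System.out.println(Arrays.toString(features(record, 4, 4)));    // [5.1, 3.5, 1.4, 0.2]
        System.out.println(Arrays.toString(oneHotLabel(record, 4, 3))); // [0.0, 0.0, 1.0]

        // Regression with labelIndexFrom = 3 and labelIndexTo = 4 (inclusive).
        System.out.println(Arrays.toString(features(record, 3, 4)));         // [5.1, 3.5, 1.4]
        System.out.println(Arrays.toString(regressionLabels(record, 3, 4))); // [0.2, 2.0]
    }
}
```

In the classification case a single index serves as both ends of the label range, which matches the description of labelIndexFrom as "the index of the label (for classification)" in the full constructor.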
Copyright © 2020. All rights reserved.