Modifier and Type | Method and Description |
---|---|
RecordReader |
BaseInputFormat.createReader(InputSplit split) |
RecordReader |
InputFormat.createReader(InputSplit split)
Creates a reader from an input split
|
RecordReader |
InputFormat.createReader(InputSplit split,
Configuration conf)
Creates a reader from an input split
|
Modifier and Type | Method and Description |
---|---|
RecordReader |
CSVInputFormat.createReader(InputSplit split) |
RecordReader |
LineInputFormat.createReader(InputSplit split) |
RecordReader |
ListStringInputFormat.createReader(InputSplit split)
Creates a reader from an input split
|
RecordReader |
MatlabInputFormat.createReader(InputSplit split) |
RecordReader |
SVMLightInputFormat.createReader(InputSplit split) |
RecordReader |
CSVInputFormat.createReader(InputSplit split,
Configuration conf) |
RecordReader |
LibSvmInputFormat.createReader(InputSplit split,
Configuration conf) |
RecordReader |
LineInputFormat.createReader(InputSplit split,
Configuration conf) |
RecordReader |
ListStringInputFormat.createReader(InputSplit split,
Configuration conf)
Creates a reader from an input split
|
RecordReader |
MatlabInputFormat.createReader(InputSplit split,
Configuration conf) |
RecordReader |
SVMLightInputFormat.createReader(InputSplit split,
Configuration conf) |
Modifier and Type | Method and Description |
---|---|
static void |
RecordReaderConverter.convert(RecordReader reader,
RecordWriter writer)
Write all values from the specified record reader to the specified record writer.
|
static void |
RecordReaderConverter.convert(RecordReader reader,
RecordWriter writer,
boolean closeOnCompletion)
Write all values from the specified record reader to the specified record writer.
|
Modifier and Type | Method and Description |
---|---|
void |
RecordListener.recordRead(RecordReader reader,
Object record)
Event listener for each record to be read.
|
Modifier and Type | Method and Description |
---|---|
void |
LogRecordListener.recordRead(RecordReader reader,
Object record) |
Modifier and Type | Interface and Description |
---|---|
interface |
SequenceRecordReader
A sequence of records.
|
Modifier and Type | Class and Description |
---|---|
class |
BaseRecordReader
Manages record listeners.
|
Modifier and Type | Method and Description |
---|---|
RecordReader |
RecordReaderFactory.create(URI uri)
Creates an instance of RecordReader
|
Modifier and Type | Class and Description |
---|---|
class |
ComposableRecordReader
RecordReader for each pipeline.
|
class |
ConcatenatingRecordReader
Combine multiple readers into a single reader.
|
class |
FileRecordReader
File reader/writer
|
class |
LineRecordReader
Reads files line by line
|
Constructor and Description |
---|
ComposableRecordReader(RecordReader... readers) |
ConcatenatingRecordReader(RecordReader... readers) |
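
The idea behind combining multiple readers into a single reader can be illustrated with plain Java iterators. The sketch below is not the DataVec implementation: `ConcatenatingIteratorSketch` is a hypothetical class that stands in for a reader, and it only shows one plausible way to drain several sources in order.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical illustration only: chains several iterators end to end,
// the way a concatenating reader would expose several readers as one.
final class ConcatenatingIteratorSketch<T> implements Iterator<T> {
    private final Iterator<Iterator<T>> sources;
    private Iterator<T> current = Collections.emptyIterator();

    @SafeVarargs
    ConcatenatingIteratorSketch(Iterator<T>... iterators) {
        this.sources = Arrays.asList(iterators).iterator();
    }

    @Override
    public boolean hasNext() {
        // Skip over exhausted (or empty) sources until one has data.
        while (!current.hasNext() && sources.hasNext()) {
            current = sources.next();
        }
        return current.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }

    public static void main(String[] args) {
        Iterator<String> all = new ConcatenatingIteratorSketch<String>(
                Arrays.asList("a", "b").iterator(),
                Arrays.asList("c").iterator());
        while (all.hasNext()) {
            System.out.println(all.next());
        }
    }
}
```

Sources are consumed lazily, so an empty source in the middle of the argument list is skipped rather than ending iteration early.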
Modifier and Type | Class and Description |
---|---|
class |
CollectionRecordReader
Collection record reader.
|
class |
CollectionSequenceRecordReader
Collection record reader for sequences.
|
class |
ListStringRecordReader
Iterates through a list of strings, returning a record for each.
|
Modifier and Type | Class and Description |
---|---|
class |
CSVLineSequenceRecordReader
CSVLineSequenceRecordReader: Used for loading univariate (single-valued) sequences from a CSV,
where each line in a CSV represents an independent sequence, and each sequence has exactly 1 value
per time step.
|
class |
CSVMultiSequenceRecordReader
CSVMultiSequenceRecordReader: Used to read CSV-format time series (sequence) data where there are multiple
independent sequences in each file.
|
class |
CSVNLinesSequenceRecordReader
A CSV Sequence record reader where:
(a) all time series are in a single file; (b) each time series is of the same length (specified in the constructor); and (c) no delimiter is used between time series. For example, with nLinesPerSequence=10, lines 0 to 9 are the first time series, 10 to 19 are the second, and so on. |
class |
CSVRecordReader
Simple csv record reader.
|
class |
CSVRegexRecordReader
A CSVRecordReader that can split
each column into additional columns using regexes.
|
class |
CSVSequenceRecordReader
CSV sequence record reader.
This reader is intended to read sequences of data in CSV format, where
each sequence is defined in its own file (and there are multiple files).
Each line in the file represents one time step.
|
class |
CSVVariableSlidingWindowRecordReader
A sliding window of variable size across an entire CSV.
|
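
The fixed-length grouping described for CSVNLinesSequenceRecordReader (with nLinesPerSequence=10, lines 0 to 9 form the first series, 10 to 19 the second) can be sketched in plain Java. This is only an illustration under stated assumptions, not the library's code: `NLinesGroupingSketch` and `group` are hypothetical names, and rejecting a trailing partial sequence is a choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: splits the lines of one file into
// consecutive sequences of a fixed length, with no delimiter between them.
final class NLinesGroupingSketch {

    static List<List<String>> group(List<String> lines, int nLinesPerSequence) {
        if (nLinesPerSequence <= 0) {
            throw new IllegalArgumentException("nLinesPerSequence must be positive");
        }
        // Assumption of this sketch: an incomplete final sequence is an error.
        if (lines.size() % nLinesPerSequence != 0) {
            throw new IllegalArgumentException("Line count is not a multiple of " + nLinesPerSequence);
        }
        List<List<String>> sequences = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += nLinesPerSequence) {
            // Copy the sub-list so each sequence is independent of the input list.
            sequences.add(new ArrayList<>(lines.subList(start, start + nLinesPerSequence)));
        }
        return sequences;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1,2", "3,4", "5,6", "7,8");
        System.out.println(group(lines, 2));
    }
}
```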
Modifier and Type | Class and Description |
---|---|
class |
FileBatchRecordReader
FileBatchRecordReader reads the files contained in a
FileBatch using the specified RecordReader. |
class |
FileBatchSequenceRecordReader
FileBatchSequenceRecordReader reads the files contained in a
FileBatch using the specified SequenceRecordReader. |
Constructor and Description |
---|
FileBatchRecordReader(RecordReader rr,
FileBatch fileBatch) |
Modifier and Type | Class and Description |
---|---|
class |
InMemoryRecordReader
This is a RecordReader primarily meant for unit tests. |
class |
InMemorySequenceRecordReader
This is a SequenceRecordReader primarily meant for unit tests. |
Modifier and Type | Class and Description |
---|---|
class |
JacksonLineRecordReader
JacksonLineRecordReader will read a single file line-by-line when .next() is
called. |
class |
JacksonLineSequenceRecordReader
The sequence record reader version of
JacksonLineRecordReader . |
class |
JacksonRecordReader
RecordReader using Jackson.
|
Modifier and Type | Class and Description |
---|---|
class |
LibSvmRecordReader
Record reader for libsvm format, which is closely
related to SVMLight format.
|
class |
MatlabRecordReader
Matlab record reader
|
class |
SVMLightRecordReader
Record reader for SVMLight format, which can generally
be described as
LABEL INDEX:VALUE INDEX:VALUE ...
|
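
To make the `LABEL INDEX:VALUE INDEX:VALUE ...` layout concrete, here is a minimal parsing sketch in plain Java. It is not SVMLightRecordReader's implementation: `SvmLightLineSketch` is a hypothetical class, and the sketch assumes a numeric label and integer feature indices, and ignores any other token types the format may allow.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only: parses one "LABEL INDEX:VALUE ..." line.
final class SvmLightLineSketch {
    final double label;
    final Map<Integer, Double> features;

    private SvmLightLineSketch(double label, Map<Integer, Double> features) {
        this.label = label;
        this.features = features;
    }

    static SvmLightLineSketch parse(String line) {
        String body = line.trim();
        if (body.isEmpty()) {
            throw new IllegalArgumentException("Empty line");
        }
        String[] tokens = body.split("\\s+");
        // First token is the label; the rest are sparse INDEX:VALUE pairs.
        double label = Double.parseDouble(tokens[0]);
        Map<Integer, Double> features = new LinkedHashMap<>();
        for (int i = 1; i < tokens.length; i++) {
            int colon = tokens[i].indexOf(':');
            if (colon <= 0) {
                throw new IllegalArgumentException("Expected INDEX:VALUE but got: " + tokens[i]);
            }
            int index = Integer.parseInt(tokens[i].substring(0, colon));
            double value = Double.parseDouble(tokens[i].substring(colon + 1));
            features.put(index, value);
        }
        return new SvmLightLineSketch(label, features);
    }

    public static void main(String[] args) {
        SvmLightLineSketch parsed = parse("1 1:0.5 3:2.0");
        System.out.println(parsed.label + " " + parsed.features);
    }
}
```

Only the indices that appear on the line are stored, which mirrors the sparse nature of the format; converting to a dense vector would additionally require knowing the total number of features.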
Modifier and Type | Class and Description |
---|---|
class |
RegexLineRecordReader
RegexLineRecordReader: Read a file, one line at a time, and split it into fields using a regex.
|
class |
RegexSequenceRecordReader
RegexSequenceRecordReader: Read an entire file (as a sequence), one line at a time and
split each line into fields using a regex.
|
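
Splitting a line into fields using a regex can be sketched with `java.util.regex`. The example below is an illustration, not the RegexLineRecordReader source: `RegexLineSplitSketch` is a hypothetical class, and treating each capture group as one field is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only: one capture group per output field.
final class RegexLineSplitSketch {

    static List<String> split(Pattern pattern, String line) {
        Matcher matcher = pattern.matcher(line);
        // The whole line must match; otherwise the field layout is undefined.
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Line does not match pattern: " + line);
        }
        List<String> fields = new ArrayList<>(matcher.groupCount());
        for (int g = 1; g <= matcher.groupCount(); g++) {
            fields.add(matcher.group(g));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Example log layout (made up for this sketch): date, level, message.
        Pattern pattern = Pattern.compile("(\\d{4}-\\d{2}-\\d{2}) (\\w+) (.*)");
        System.out.println(split(pattern, "2020-01-15 WARN disk almost full"));
    }
}
```

A sequence variant would apply the same per-line split to every line of a file and return the results as one ordered list of time steps.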
Modifier and Type | Class and Description |
---|---|
class |
TransformProcessRecordReader
This wraps a RecordReader with a TransformProcess and allows every Record
that is returned by the RecordReader
to have a transform process applied before being returned. |
class |
TransformProcessSequenceRecordReader
This wraps a SequenceRecordReader with a TransformProcess,
which allows every Record returned from the SequenceRecordReader
to be transformed before being returned. |
Modifier and Type | Field and Description |
---|---|
protected RecordReader |
TransformProcessRecordReader.recordReader |
Constructor and Description |
---|
TransformProcessRecordReader(RecordReader recordReader,
TransformProcess transformProcess) |
Modifier and Type | Method and Description |
---|---|
static List<String> |
TransformProcess.inferCategories(RecordReader recordReader,
int columnIndex)
Infer the categories for the given record reader for a particular column.
Note that each "column index" is a column in the context of:
List
|
static Map<Integer,List<String>> |
TransformProcess.inferCategories(RecordReader recordReader,
int[] columnIndices)
Infer the categories for the given record reader for
a particular set of columns. This is more efficient than
TransformProcess.inferCategories(RecordReader, int)
if you plan on inferring categories for more than one column.
Note that each "column index" is a column in the context of:
List |
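
The reason the multi-column overload is described as more efficient can be seen in a small plain-Java sketch: all requested columns are collected in a single pass over the records instead of one pass per column. This is not TransformProcess code. `InferCategoriesSketch` is hypothetical, records are modeled as `List<String>` rows rather than DataVec types, and returning categories in sorted order is an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration only: distinct values per column, one pass over the data.
final class InferCategoriesSketch {

    static Map<Integer, List<String>> inferCategories(List<List<String>> records, int[] columnIndices) {
        Map<Integer, Set<String>> seen = new LinkedHashMap<>();
        for (int column : columnIndices) {
            seen.put(column, new TreeSet<>());
        }
        // Single pass: every record contributes to every requested column.
        for (List<String> record : records) {
            for (int column : columnIndices) {
                seen.get(column).add(record.get(column));
            }
        }
        Map<Integer, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<Integer, Set<String>> entry : seen.entrySet()) {
            result.put(entry.getKey(), new ArrayList<>(entry.getValue()));
        }
        return result;
    }

    static List<String> inferCategories(List<List<String>> records, int columnIndex) {
        return inferCategories(records, new int[] {columnIndex}).get(columnIndex);
    }

    public static void main(String[] args) {
        List<List<String>> records = Arrays.asList(
                Arrays.asList("red", "small"),
                Arrays.asList("blue", "large"),
                Arrays.asList("red", "large"));
        System.out.println(inferCategories(records, new int[] {0, 1}));
    }
}
```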
Modifier and Type | Method and Description |
---|---|
void |
Vectorizer.fit(RecordReader reader)
Fit based on a record reader
|
void |
Vectorizer.fit(RecordReader reader,
Vectorizer.RecordCallBack callBack)
Fit based on a record reader
|
VECTOR_TYPE |
Vectorizer.fitTransform(RecordReader reader)
Fit based on a record reader
|
VECTOR_TYPE |
Vectorizer.fitTransform(RecordReader reader,
Vectorizer.RecordCallBack callBack)
Fit based on a record reader
|
Modifier and Type | Class and Description |
---|---|
class |
ArrowRecordReader
Implements a record reader using arrow.
|
Modifier and Type | Method and Description |
---|---|
RecordReader |
WavInputFormat.createReader(InputSplit split) |
RecordReader |
WavInputFormat.createReader(InputSplit split,
Configuration conf) |
Modifier and Type | Class and Description |
---|---|
class |
BaseAudioRecordReader
Base audio file loader
|
class |
NativeAudioRecordReader
Native audio file loader using FFmpeg.
|
class |
WavFileRecordReader
Wav file loader
|
Modifier and Type | Method and Description |
---|---|
RecordReader |
CodecInputFormat.createReader(InputSplit split,
Configuration conf) |
Modifier and Type | Class and Description |
---|---|
class |
BaseCodecRecordReader
Codec record reader for parsing videos
|
class |
CodecRecordReader
Codec record reader for parsing:
H.264 (AVC) Main profile decoder
MP3 decoder/encoder
Apple ProRes decoder and encoder
AAC encoder
H264 Baseline profile encoder
Matroska (MKV) demuxer and muxer
MP4 (ISO BMF, QuickTime) demuxer/muxer and tools
MPEG 1/2 decoder (supports interlace)
MPEG PS/TS demuxer
Java player applet
VP8 encoder
MXF demuxer
Credit to jcodec for the underlying parser.
|
class |
NativeCodecRecordReader
An implementation of the CodecRecordReader that uses JavaCV and FFmpeg.
|
Modifier and Type | Class and Description |
---|---|
class |
MapFileRecordReader
A RecordReader implementation for reading from a Hadoop MapFile.
A typical use case is with TransformProcess executed on Spark (perhaps Spark
local), followed by non-distributed training on a single machine. |
class |
MapFileSequenceRecordReader
A SequenceRecordReader implementation for reading from a Hadoop MapFile.
A typical use case is with TransformProcess executed on Spark (perhaps Spark
local), followed by non-distributed training on a single machine. |
Modifier and Type | Method and Description |
---|---|
RecordReader |
ImageInputFormat.createReader(InputSplit split) |
RecordReader |
ImageInputFormat.createReader(InputSplit split,
Configuration conf) |
Modifier and Type | Method and Description |
---|---|
RecordReader |
LFWLoader.getRecordReader(long numExamples) |
RecordReader |
LFWLoader.getRecordReader(long batchSize,
long numExamples,
boolean train,
double splitTrainTest) |
RecordReader |
LFWLoader.getRecordReader(long batchSize,
long numExamples,
int[] imgDim,
boolean train,
double splitTrainTest,
Random rng) |
RecordReader |
LFWLoader.getRecordReader(long batchSize,
long numExamples,
int[] imgDim,
long numLabels,
PathLabelGenerator labelGenerator,
boolean train,
double splitTrainTest,
Random rng) |
RecordReader |
LFWLoader.getRecordReader(long batchSize,
long numExamples,
int[] imgDim,
PathLabelGenerator labelGenerator,
boolean train,
double splitTrainTest,
Random rng) |
RecordReader |
LFWLoader.getRecordReader(long batchSize,
long numExamples,
long[] imgDim,
boolean train,
double splitTrainTest,
Random rng) |
RecordReader |
LFWLoader.getRecordReader(long batchSize,
long numExamples,
long[] imgDim,
long numLabels,
PathLabelGenerator labelGenerator,
boolean train,
double splitTrainTest,
Random rng) |
RecordReader |
LFWLoader.getRecordReader(long batchSize,
long numExamples,
long[] imgDim,
PathLabelGenerator labelGenerator,
boolean train,
double splitTrainTest,
Random rng) |
RecordReader |
LFWLoader.getRecordReader(long batchSize,
long numExamples,
long numLabels,
Random rng) |
RecordReader |
LFWLoader.getRecordReader(long batchSize,
long numExamples,
PathLabelGenerator labelGenerator,
boolean train,
double splitTrainTest,
Random rng) |
Modifier and Type | Class and Description |
---|---|
class |
BaseImageRecordReader
Base class for the image record reader
|
class |
ImageRecordReader
Image record reader.
|
Modifier and Type | Class and Description |
---|---|
class |
ObjectDetectionRecordReader
An image record reader for object detection.
|
Modifier and Type | Class and Description |
---|---|
class |
JDBCRecordReader
Iterates over rows from a JDBC data source and returns the corresponding records
|
Modifier and Type | Class and Description |
---|---|
class |
LocalTransformProcessRecordReader
A wrapper around the TransformProcessRecordReader
that uses the LocalTransformExecutor
instead of the TransformProcess methods. |
class |
LocalTransformProcessSequenceRecordReader |
Modifier and Type | Method and Description |
---|---|
static DataAnalysis |
AnalyzeLocal.analyze(Schema schema,
RecordReader rr)
Analyse the specified data - returns a DataAnalysis object with summary information about each column
|
static DataAnalysis |
AnalyzeLocal.analyze(Schema schema,
RecordReader rr,
int maxHistogramBuckets)
Analyse the specified data - returns a DataAnalysis object with summary information about each column
|
static DataQualityAnalysis |
AnalyzeLocal.analyzeQuality(Schema schema,
RecordReader data)
Analyze the data quality of data - provides a report on missing values, values that don't comply with schema, etc
|
static Map<String,Set<Writable>> |
AnalyzeLocal.getUnique(List<String> columnNames,
Schema schema,
RecordReader data)
Get a list of unique values from the specified columns.
|
static Set<Writable> |
AnalyzeLocal.getUnique(String columnName,
Schema schema,
RecordReader data)
Get the set of unique values from the specified column.
|
Constructor and Description |
---|
LocalTransformProcessRecordReader(RecordReader recordReader,
TransformProcess transformProcess)
Initialize with the internal record reader
and the transform process.
|
Modifier and Type | Field and Description |
---|---|
protected RecordReader |
RecordReaderFunction.recordReader |
Constructor and Description |
---|
LineRecordReaderFunction(RecordReader recordReader) |
RecordReaderFunction(RecordReader recordReader) |
Constructor and Description |
---|
RecordReaderBytesFunction(RecordReader recordReader) |
Modifier and Type | Method and Description |
---|---|
RecordReader |
TextInputFormat.createReader(InputSplit split,
Configuration conf) |
Modifier and Type | Class and Description |
---|---|
class |
TfidfRecordReader
TFIDF record reader (wraps a tfidf vectorizer
for delivering labels and conforming to the record reader interface)
|
Modifier and Type | Method and Description |
---|---|
void |
TextVectorizer.fit(RecordReader reader) |
void |
TextVectorizer.fit(RecordReader reader,
Vectorizer.RecordCallBack callBack) |
abstract VECTOR_TYPE |
AbstractTfidfVectorizer.fitTransform(RecordReader reader) |
INDArray |
TfidfVectorizer.fitTransform(RecordReader reader) |
INDArray |
TfidfVectorizer.fitTransform(RecordReader reader,
Vectorizer.RecordCallBack callBack) |
Modifier and Type | Class and Description |
---|---|
class |
ExcelRecordReader
Excel record reader for loading rows of an Excel spreadsheet
from multiple spreadsheets, very similar to the
CSVRecordReader.
Note that when you have multiple sheets, the same number of
lines must be skipped at the top of each. |
Modifier and Type | Field and Description |
---|---|
protected RecordReader |
RecordReaderFunction.recordReader |
Constructor and Description |
---|
LineRecordReaderFunction(RecordReader recordReader) |
RecordReaderFunction(RecordReader recordReader) |
Constructor and Description |
---|
RecordReaderBytesFunction(RecordReader recordReader) |
Constructor and Description |
---|
RecordReaderFileBatchLoader(RecordReader recordReader,
int batchSize,
int labelIndex,
int numClasses)
Main constructor for classification.
|
RecordReaderFileBatchLoader(RecordReader recordReader,
int batchSize,
int labelIndexFrom,
int labelIndexTo,
boolean regression)
Main constructor for multi-label regression (i.e., regression with multiple outputs).
|
RecordReaderFileBatchLoader(RecordReader recordReader,
int batchSize,
int labelIndexFrom,
int labelIndexTo,
int numPossibleLabels,
boolean regression,
DataSetPreProcessor preProcessor)
Main constructor
|
Modifier and Type | Field and Description |
---|---|
protected RecordReader |
RecordReaderDataSetIterator.recordReader |
protected RecordReader |
RecordReaderDataSetIterator.Builder.recordReader |
Modifier and Type | Method and Description |
---|---|
RecordReaderMultiDataSetIterator.Builder |
RecordReaderMultiDataSetIterator.Builder.addReader(String readerName,
RecordReader recordReader)
Add a RecordReader for use in .addInput(...) or .addOutput(...)
|
Constructor and Description |
---|
Builder(@NonNull RecordReader rr,
int batchSize) |
RecordReaderDataSetIterator(RecordReader recordReader,
int batchSize)
Constructor for classification, where:
(a) the label index is assumed to be the very last Writable/column, and (b) the number of classes is inferred from RecordReader.getLabels(). Note that if RecordReader.getLabels() returns null, no output labels will be produced. |
RecordReaderDataSetIterator(RecordReader recordReader,
int batchSize,
int labelIndex,
int numPossibleLabels)
Main constructor for classification.
|
RecordReaderDataSetIterator(RecordReader recordReader,
int batchSize,
int labelIndexFrom,
int labelIndexTo,
boolean regression)
Main constructor for multi-label regression (i.e., regression with multiple outputs).
|
RecordReaderDataSetIterator(RecordReader recordReader,
int batchSize,
int labelIndex,
int numPossibleLabels,
int maxNumBatches)
Constructor for classification, where the maximum number of returned batches is limited to the specified value
|
RecordReaderDataSetIterator(RecordReader recordReader,
WritableConverter converter,
int batchSize,
int labelIndexFrom,
int labelIndexTo,
int numPossibleLabels,
int maxNumBatches,
boolean regression)
Main constructor
|
Modifier and Type | Method and Description |
---|---|
RecordReader |
Cifar10Fetcher.getRecordReader(long rngSeed,
int[] imgDim,
DataSetType set,
ImageTransform imageTransform) |
RecordReader |
SvhnDataFetcher.getRecordReader(long rngSeed,
int[] imgDim,
DataSetType set,
ImageTransform imageTransform) |
RecordReader |
TinyImageNetFetcher.getRecordReader(long rngSeed,
int[] imgDim,
DataSetType set,
ImageTransform imageTransform) |
Constructor and Description |
---|
RecordReaderFunction(RecordReader recordReader,
int labelIndex,
int numPossibleLabels) |
RecordReaderFunction(RecordReader recordReader,
int labelIndex,
int numPossibleLabels,
WritableConverter converter) |
Constructor and Description |
---|
StringToDataSetExportFunction(URI outputDir,
RecordReader recordReader,
int batchSize,
boolean regression,
int labelIndex,
int numPossibleLabels) |
StringToDataSetExportFunction(URI outputDir,
RecordReader recordReader,
int batchSize,
boolean regression,
int labelIndex,
int numPossibleLabels,
org.apache.spark.broadcast.Broadcast<SerializableHadoopConfig> configuration) |
Modifier and Type | Class and Description |
---|---|
class |
SparkSourceDummyReader
Dummy reader for use in
IteratorUtils |
class |
SparkSourceDummySeqReader |
Modifier and Type | Method and Description |
---|---|
static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> |
MLLibUtil.fromBinary(org.apache.spark.api.java.JavaPairRDD<String,org.apache.spark.input.PortableDataStream> binaryFiles,
RecordReader reader)
Convert a traditional sc.binaryFiles
into something usable for machine learning
|
static org.apache.spark.api.java.JavaRDD<org.apache.spark.mllib.regression.LabeledPoint> |
MLLibUtil.fromBinary(org.apache.spark.api.java.JavaRDD<scala.Tuple2<String,org.apache.spark.input.PortableDataStream>> binaryFiles,
RecordReader reader)
Convert a traditional sc.binaryFiles
into something usable for machine learning
|
Copyright © 2020. All rights reserved.