public class SparkSourceDummySeqReader extends SparkSourceDummyReader implements SequenceRecordReader
Fields inherited from interface RecordReader: APPEND_LABEL, LABELS, NAME_SPACE
Constructor and Description |
---|
SparkSourceDummySeqReader(int readerIdx) |
Modifier and Type | Method and Description |
---|---|
java.util.List<SequenceRecord> | loadSequenceFromMetaData(java.util.List<RecordMetaData> list) Load multiple sequence records from the given list of RecordMetaData instances. |
SequenceRecord | loadSequenceFromMetaData(RecordMetaData recordMetaData) Load a single sequence record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using SequenceRecordReader.loadSequenceFromMetaData(List). |
SequenceRecord | nextSequence() Similar to SequenceRecordReader.sequenceRecord(), but returns a SequenceRecord object that may include metadata such as the source of the data. |
java.util.List<java.util.List<Writable>> | sequenceRecord() Returns a sequence record. |
java.util.List<java.util.List<Writable>> | sequenceRecord(java.net.URI uri, java.io.DataInputStream dataInputStream) Load a sequence record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream. |
Methods inherited from class SparkSourceDummyReader: batchesSupported, close, getConf, getLabels, getListeners, hasNext, initialize, initialize, loadFromMetaData, loadFromMetaData, next, next, nextRecord, record, reset, resetSupported, setConf, setListeners, setListeners
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface RecordReader: batchesSupported, getLabels, getListeners, hasNext, initialize, initialize, loadFromMetaData, loadFromMetaData, next, next, nextRecord, record, reset, resetSupported, setListeners, setListeners
Methods inherited from interface Configurable: getConf, setConf
public SparkSourceDummySeqReader(int readerIdx)
Parameters:
readerIdx - Index of the reader, in terms of the sequence RDD that it should use. For a single sequence RDD as input, this is always 0; for two sequence RDDs used as input, this is 0 or 1, depending on whether the reader should pull values from the first or second sequence RDD. Note that the indexing for sequence RDDs doesn't depend on the presence of non-sequence RDDs: they are indexed separately.

public java.util.List<java.util.List<Writable>> sequenceRecord()
Description copied from interface: SequenceRecordReader
Returns a sequence record.
Specified by:
sequenceRecord in interface SequenceRecordReader
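
As a rough illustration of the readerIdx rule and of the nested-list shape returned by sequenceRecord(), the sketch below uses plain Java collections rather than the DataVec and Spark types. The class IndexedSourcesSketch, its methods, and the use of Double in place of Writable are hypothetical and are not part of the library; the sketch only models the idea that sequence inputs and non-sequence inputs are counted separately.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in, not a DataVec or Spark class. */
final class IndexedSourcesSketch {
    // One entry per non-sequence input: a flat list of values.
    private final List<List<Double>> nonSequenceInputs;
    // One entry per sequence input: a list of time steps, each a list of values.
    private final List<List<List<Double>>> sequenceInputs;

    IndexedSourcesSketch(List<List<Double>> nonSequenceInputs,
                         List<List<List<Double>>> sequenceInputs) {
        this.nonSequenceInputs = nonSequenceInputs;
        this.sequenceInputs = sequenceInputs;
    }

    /** Index into the non-sequence inputs only. */
    List<Double> recordFor(int readerIdx) {
        return nonSequenceInputs.get(readerIdx);
    }

    /** Index into the sequence inputs only; non-sequence inputs do not shift this index. */
    List<List<Double>> sequenceFor(int readerIdx) {
        return sequenceInputs.get(readerIdx);
    }

    /** Builds one non-sequence input and two sequence inputs for demonstration. */
    static IndexedSourcesSketch example() {
        List<Double> flat = Arrays.asList(0.5, 0.25);
        List<List<Double>> firstSeq =
                Arrays.asList(Arrays.asList(1.0, 2.0), Arrays.asList(3.0, 4.0));
        List<List<Double>> secondSeq =
                Arrays.asList(Arrays.asList(9.0), Arrays.asList(8.0), Arrays.asList(7.0));
        return new IndexedSourcesSketch(Arrays.asList(flat), Arrays.asList(firstSeq, secondSeq));
    }

    public static void main(String[] args) {
        IndexedSourcesSketch sources = example();
        // readerIdx 0 selects the first sequence input even though a non-sequence input exists.
        System.out.println(sources.sequenceFor(0)); // [[1.0, 2.0], [3.0, 4.0]]
        System.out.println(sources.sequenceFor(1).size() + " time steps"); // 3 time steps
    }
}
```

The outer list holds time steps and each inner list holds the values for one step, which mirrors the List<List<Writable>> return type.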
public java.util.List<java.util.List<Writable>> sequenceRecord(java.net.URI uri, java.io.DataInputStream dataInputStream) throws java.io.IOException
Description copied from interface: SequenceRecordReader
Load a sequence record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
Specified by:
sequenceRecord in interface SequenceRecordReader
Throws:
java.io.IOException - if an error occurs while reading from the input stream

public SequenceRecord nextSequence()
Description copied from interface: SequenceRecordReader
Similar to SequenceRecordReader.sequenceRecord(), but returns a SequenceRecord object that may include metadata such as the source of the data.
Specified by:
nextSequence in interface SequenceRecordReader
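
The contract stated for sequenceRecord(URI, DataInputStream), namely reading from a caller-supplied stream without closing it and without changing reader state, can be sketched with the standard java.io classes alone. Everything here is an assumption made for illustration: the class StreamSequenceSketch, the binary layout (step count, values per step, then doubles), and the use of Double in place of Writable. It is not how any DataVec reader encodes its data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in, not a DataVec class. */
final class StreamSequenceSketch {

    /**
     * Reads one sequence from the stream. The method is static, so it cannot touch
     * reader state, and it never closes the stream: the caller owns it.
     */
    static List<List<Double>> readSequence(DataInputStream in) throws IOException {
        int steps = in.readInt();
        int valuesPerStep = in.readInt();
        List<List<Double>> sequence = new ArrayList<>(steps);
        for (int t = 0; t < steps; t++) {
            List<Double> step = new ArrayList<>(valuesPerStep);
            for (int i = 0; i < valuesPerStep; i++) {
                step.add(in.readDouble());
            }
            sequence.add(step);
        }
        return sequence;
    }

    /** Encodes a rectangular sequence in the assumed layout, then reads it back. */
    static List<List<Double>> roundTrip(double[][] sequence) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(sequence.length);
            out.writeInt(sequence.length == 0 ? 0 : sequence[0].length);
            for (double[] step : sequence) {
                for (double value : step) {
                    out.writeDouble(value);
                }
            }
            out.flush();
            // The caller opens the stream and is the one that closes it.
            try (DataInputStream in =
                         new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return readSequence(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new double[][] {{1.0, 2.0}, {3.0, 4.0}}));
        // [[1.0, 2.0], [3.0, 4.0]]
    }
}
```

Leaving the stream open lets the caller decide its lifetime, for example when several records are read from the same underlying input.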
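public SequenceRecord loadSequenceFromMetaData(RecordMetaData recordMetaData) throws java.io.IOException
Description copied from interface: SequenceRecordReader
Load a single sequence record from the given RecordMetaData instance. Note that for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using SequenceRecordReader.loadSequenceFromMetaData(List).
Specified by:
loadSequenceFromMetaData in interface SequenceRecordReader
Parameters:
recordMetaData - Metadata for the sequence record that we want to load
Throws:
java.io.IOException - If an I/O error occurs during loading

public java.util.List<SequenceRecord> loadSequenceFromMetaData(java.util.List<RecordMetaData> list) throws java.io.IOException
Description copied from interface: SequenceRecordReader
Load multiple sequence records from the given list of RecordMetaData instances.
Specified by:
loadSequenceFromMetaData in interface SequenceRecordReader
Parameters:
list - Metadata for the records that we want to load
Throws:
java.io.IOException - If an I/O error occurs during loading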
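
The note about non-splittable data can be made concrete with a small model: if finding a record requires scanning the source from the start, loading records one at a time repeats that scan, while a batched load can collect everything in a single pass. The sketch below is illustrative only and does not describe how SparkSourceDummySeqReader or any DataVec reader implements loadSequenceFromMetaData. The class BatchLoadSketch, the use of a line index as the "metadata", and the scan counter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in, not a DataVec class. */
final class BatchLoadSketch {
    private final List<String> lines; // stand-in for a text source that must be scanned in order
    private int linesScanned = 0;     // counts how much scanning work was done

    BatchLoadSketch(List<String> lines) {
        this.lines = lines;
    }

    int linesScanned() {
        return linesScanned;
    }

    /** Loads one record by scanning from the start until the requested line is reached. */
    String loadOne(int lineIndex) {
        int current = 0;
        for (String line : lines) {
            linesScanned++;
            if (current == lineIndex) {
                return line;
            }
            current++;
        }
        throw new IndexOutOfBoundsException("No line " + lineIndex);
    }

    /** Loads several records in a single pass, returned in the order requested. */
    List<String> loadMany(List<Integer> lineIndices) {
        Set<Integer> wanted = new HashSet<>(lineIndices);
        Map<Integer, String> hits = new HashMap<>();
        int current = 0;
        for (String line : lines) {
            if (hits.size() == wanted.size()) {
                break; // everything requested has been found
            }
            linesScanned++;
            if (wanted.contains(current)) {
                hits.put(current, line);
            }
            current++;
        }
        List<String> result = new ArrayList<>(lineIndices.size());
        for (Integer index : lineIndices) {
            String line = hits.get(index);
            if (line == null) {
                throw new IndexOutOfBoundsException("No line " + index);
            }
            result.add(line);
        }
        return result;
    }
}
```

With five lines and requests for indices 3, 1, and 4, three loadOne calls scan 4 + 2 + 5 = 11 lines in this model, while one loadMany call scans 5.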