public class RegexLineRecordReader extends LineRecordReader
Reads lines and splits each one into fields with a regex, using Pattern and Matcher.
Modifier and Type | Field and Description |
---|---|
static String | SKIP_NUM_LINES |
charset, conf, initialized, lineIndex, locations, splitIndex
inputSplit, listeners, streamCreatorFn
APPEND_LABEL, LABELS, NAME_SPACE
Constructor and Description |
---|
RegexLineRecordReader(String regex, int skipNumLines) |
Modifier and Type | Method and Description |
---|---|
void | initialize(Configuration conf, InputSplit split) Called once at initialization. |
List<Record> | loadFromMetaData(List<RecordMetaData> recordMetaDatas) Load multiple records from the given list of RecordMetaData instances. |
Record | loadFromMetaData(RecordMetaData recordMetaData) Load a single record from the given RecordMetaData instance. Note: for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List). |
List<Writable> | next() Get the next record. |
Record | nextRecord() Similar to RecordReader.next(), but returns a Record object that may include metadata such as the source of the data. |
List<Writable> | record(URI uri, DataInputStream dataInputStream) Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream. |
void | reset() Reset the record reader iterator. |
close, closeIfRequired, getConf, getIterator, getLabels, hasNext, initialize, onLocationOpen, resetSupported, setConf
batchesSupported, getListeners, invokeListeners, next, setListeners, setListeners
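The note on loadFromMetaData(RecordMetaData) in the method summary says that batch loading is cheaper for text that has to be scanned. The sketch below illustrates why, using plain line indices in place of RecordMetaData. The class `BatchLineLoadSketch`, its methods, and the `linesRead` counter are hypothetical and are not part of this API; it is a minimal illustration under those assumptions, not the library's implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.Set;

// Hypothetical helper: compares per-record scans with a single batched scan.
class BatchLineLoadSketch {

    // Counts scanned lines so the cost of each strategy is visible.
    static int linesRead = 0;

    // Loads one line by index. Text cannot be seeked by line number,
    // so every call scans from the start.
    static String loadOne(String text, int lineIndex) {
        try (Scanner scanner = new Scanner(text)) {
            int i = 0;
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                linesRead++;
                if (i++ == lineIndex) {
                    return line;
                }
            }
        }
        throw new IllegalArgumentException("No such line: " + lineIndex);
    }

    // Loads several lines in one scan and returns them in request order.
    static List<String> loadMany(String text, List<Integer> lineIndices) {
        Set<Integer> wanted = new HashSet<>(lineIndices);
        Map<Integer, String> found = new HashMap<>();
        try (Scanner scanner = new Scanner(text)) {
            int i = 0;
            // Stop as soon as every requested line has been collected.
            while (found.size() < wanted.size() && scanner.hasNextLine()) {
                String line = scanner.nextLine();
                linesRead++;
                if (wanted.contains(i)) {
                    found.put(i, line);
                }
                i++;
            }
        }
        List<String> out = new ArrayList<>();
        for (int idx : lineIndices) {
            String line = found.get(idx);
            if (line == null) {
                throw new IllegalArgumentException("No such line: " + idx);
            }
            out.add(line);
        }
        return out;
    }
}
```

In this sketch, fetching lines 3 and 1 of a four-line text with `loadMany` scans each line once, while two `loadOne` calls rescan the beginning of the text for every request.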
public static final String SKIP_NUM_LINES
public RegexLineRecordReader(String regex, int skipNumLines)
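To illustrate what the two constructor arguments imply, here is a minimal stand-alone sketch of regex-based line splitting with java.util.regex. The class `RegexLineSketch` and its `parse` method are hypothetical and not part of this API. The sketch assumes that each capturing group becomes one field, that the first skipNumLines lines are dropped, and that a non-matching line is rejected with an exception; the reader's actual behaviour may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone helper; it does not use the documented classes.
class RegexLineSketch {

    // Skips the first skipNumLines lines, then turns every remaining line
    // into the list of its capturing-group values.
    static List<List<String>> parse(List<String> lines, String regex, int skipNumLines) {
        Pattern pattern = Pattern.compile(regex);
        List<List<String>> records = new ArrayList<>();
        for (int i = Math.max(0, skipNumLines); i < lines.size(); i++) {
            String line = lines.get(i);
            Matcher matcher = pattern.matcher(line);
            if (!matcher.matches()) {
                // Assumption: this sketch rejects lines the regex does not match.
                throw new IllegalArgumentException("Line does not match regex: " + line);
            }
            List<String> fields = new ArrayList<>();
            // Group 0 is the whole match, so fields start at group 1.
            for (int g = 1; g <= matcher.groupCount(); g++) {
                fields.add(matcher.group(g));
            }
            records.add(fields);
        }
        return records;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "date level message",
                "2016-01-01 INFO started",
                "2016-01-02 WARN disk almost full");
        // One header line is skipped; each remaining line yields three fields.
        for (List<String> record : parse(lines, "(\\S+) (\\S+) (.*)", 1)) {
            System.out.println(record);
        }
    }
}
```

The sketch uses `matches()` rather than `find()`, so the regex has to describe the whole line and a partially matching line cannot slip through.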
public void initialize(Configuration conf, InputSplit split) throws IOException, InterruptedException
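Description copied from interface: RecordReader
Called once at initialization.
Specified by: initialize in interface RecordReader
Overrides: initialize in class LineRecordReader
Parameters:
conf - a configuration for initialization
split - the split that defines the range of records to read
Throws:
IOException
InterruptedException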
public List<Writable> next()
Description copied from interface: RecordReader
Get the next record.
Specified by: next in interface RecordReader
Overrides: next in class LineRecordReader
public List<Writable> record(URI uri, DataInputStream dataInputStream) throws IOException
Description copied from interface: RecordReader
Load the record from the given DataInputStream. Unlike RecordReader.next(), the internal state of the RecordReader is not modified. Implementations of this method should not close the DataInputStream.
Specified by: record in interface RecordReader
Overrides: record in class LineRecordReader
Throws:
IOException - if an error occurs while reading from the input stream
public void reset()
Description copied from interface: RecordReader
Reset the record reader iterator.
Specified by: reset in interface RecordReader
Overrides: reset in class LineRecordReader
public Record nextRecord()
Description copied from interface: RecordReader
Similar to RecordReader.next(), but returns a Record object that may include metadata such as the source of the data.
Specified by: nextRecord in interface RecordReader
Overrides: nextRecord in class LineRecordReader
public Record loadFromMetaData(RecordMetaData recordMetaData) throws IOException
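Description copied from interface: RecordReader
Load a single record from the given RecordMetaData instance. Note: for data that isn't splittable (i.e., text data that needs to be scanned/split), it is more efficient to load multiple records at once using RecordReader.loadFromMetaData(List).
Specified by: loadFromMetaData in interface RecordReader
Overrides: loadFromMetaData in class LineRecordReader
Parameters:
recordMetaData - metadata for the record to load
Throws:
IOException - if an I/O error occurs during loading
public List<Record> loadFromMetaData(List<RecordMetaData> recordMetaDatas) throws IOException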
Description copied from interface: RecordReader
Load multiple records from the given list of RecordMetaData instances.
Specified by: loadFromMetaData in interface RecordReader
Overrides: loadFromMetaData in class LineRecordReader
Parameters:
recordMetaDatas - metadata for the records to load
Throws:
IOException - if an I/O error occurs during loading
Copyright © 2020. All rights reserved.