Package | Description
---|---
`org.datavec.spark.functions.pairdata` |
`org.datavec.spark.util` |
Modifier and Type | Method and Description
---|---
`scala.Tuple2<org.apache.hadoop.io.Text,BytesPairWritable>` | `MapToBytesPairWritableFunction.call(scala.Tuple2<String,Iterable<scala.Tuple3<String,Integer,org.apache.spark.input.PortableDataStream>>> in)`
Modifier and Type | Method and Description
---|---
`scala.Tuple2<List<List<Writable>>,List<List<Writable>>>` | `PairSequenceRecordReaderBytesFunction.call(scala.Tuple2<org.apache.hadoop.io.Text,BytesPairWritable> v1)`
Modifier and Type | Method and Description
---|---
`static org.apache.spark.api.java.JavaPairRDD<org.apache.hadoop.io.Text,BytesPairWritable>` | `DataVecSparkUtil.combineFilesForSequenceFile(org.apache.spark.api.java.JavaSparkContext sc, String path1, String path2, PathToKeyConverter converter)` — Same as `DataVecSparkUtil.combineFilesForSequenceFile(JavaSparkContext, String, String, PathToKeyConverter, PathToKeyConverter)`, but with the same `PathToKeyConverter` used for both file sources.
`static org.apache.spark.api.java.JavaPairRDD<org.apache.hadoop.io.Text,BytesPairWritable>` | `DataVecSparkUtil.combineFilesForSequenceFile(org.apache.spark.api.java.JavaSparkContext sc, String path1, String path2, PathToKeyConverter converter1, PathToKeyConverter converter2)` — This is a convenience method for combining data from two separate sets of files (typically to then write to a sequence file using `JavaPairRDD.saveAsNewAPIHadoopFile(String, Class, Class, Class)`). A typical use case is to combine input and label data from different files, for later parsing by a RecordReader or SequenceRecordReader.
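The use case above (pairing input and label files, then writing a sequence file) can be sketched as follows. This is a minimal illustration, not code from the library's documentation: the HDFS paths are placeholders, and the choice of `PathToKeyConverterFilename` (matching files by name) and `SequenceFileOutputFormat` are assumptions you may need to adapt to your data layout.

```java
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.datavec.spark.functions.pairdata.BytesPairWritable;
import org.datavec.spark.functions.pairdata.PathToKeyConverter;
import org.datavec.spark.functions.pairdata.PathToKeyConverterFilename;
import org.datavec.spark.util.DataVecSparkUtil;

public class CombineToSequenceFileExample {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext(
                new SparkConf().setMaster("local[*]").setAppName("combine-example"));

        // Placeholder paths: one directory of input files, one of label files
        String featuresPath = "hdfs:///data/features/";
        String labelsPath = "hdfs:///data/labels/";

        // Files from the two sources are paired by key; here the same
        // converter (match on filename) is assumed for both sources, which is
        // the single-converter overload described above.
        PathToKeyConverter converter = new PathToKeyConverterFilename();
        JavaPairRDD<Text, BytesPairWritable> combined =
                DataVecSparkUtil.combineFilesForSequenceFile(
                        sc, featuresPath, labelsPath, converter);

        // Persist the paired data as a Hadoop sequence file, as the
        // method description suggests
        combined.saveAsNewAPIHadoopFile("hdfs:///data/combined/",
                Text.class, BytesPairWritable.class,
                SequenceFileOutputFormat.class);

        sc.stop();
    }
}
```

The resulting sequence file can later be read back and parsed with a RecordReader or SequenceRecordReader, e.g. via `PairSequenceRecordReaderBytesFunction` from the table above.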
Copyright © 2020. All rights reserved.