Modifier and Type | Method and Description |
---|---|
Schema | TransformProcess.getFinalSchema() Get the Schema of the output data, after executing the process |
Schema | ColumnOp.getInputSchema() Getter for input schema |
Schema | DataAction.getSchema() |
Schema | TransformProcess.getSchemaAfterStep(int step) Return the schema after executing all steps up to and including the specified step. |

Modifier and Type | Method and Description |
---|---|
void | ColumnOp.setInputSchema(Schema inputSchema) Set the input schema. |

Constructor and Description |
---|
Builder(Schema initialSchema) |
TransformProcess(Schema initialSchema, List<DataAction> actionList) |

Constructor and Description |
---|
SequenceDataAnalysis(Schema schema, List<ColumnAnalysis> columnAnalysis, SequenceLengthAnalysis sequenceAnalysis) |
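The TransformProcess methods above can be sketched as follows. This is a minimal, illustrative example assuming the DataVec API; the column names are hypothetical:

```java
import org.datavec.api.transform.TransformProcess;
import org.datavec.api.transform.schema.Schema;

public class SchemaFlowExample {
    public static void main(String[] args) {
        // Input schema; column names are illustrative
        Schema inputSchema = new Schema.Builder()
                .addColumnString("name")
                .addColumnInteger("age")
                .build();

        // One-step process: drop the "name" column
        TransformProcess tp = new TransformProcess.Builder(inputSchema)
                .removeColumns("name")
                .build();

        System.out.println(tp.getSchemaAfterStep(0)); // schema after step 0
        System.out.println(tp.getFinalSchema());      // schema of the output data
    }
}
```

Each step in the process knows how to map its input schema to its output schema, which is what getSchemaAfterStep and getFinalSchema expose.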
Modifier and Type | Method and Description |
---|---|
Schema | BooleanCondition.getInputSchema() |
Schema | Condition.getInputSchema() Getter for the input schema |
Schema | BooleanCondition.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |

Modifier and Type | Method and Description |
---|---|
void | BooleanCondition.setInputSchema(Schema schema) |
void | Condition.setInputSchema(Schema schema) Setter for the input schema |
Schema | BooleanCondition.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |

Modifier and Type | Field and Description |
---|---|
protected Schema | BaseColumnCondition.schema |
Modifier and Type | Method and Description |
---|---|
Schema | BaseColumnCondition.getInputSchema() |
Schema | ColumnCondition.getInputSchema() |
Schema | BaseColumnCondition.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |
Schema | ColumnCondition.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |

Modifier and Type | Method and Description |
---|---|
void | BaseColumnCondition.setInputSchema(Schema schema) |
void | ColumnCondition.setInputSchema(Schema schema) |
Schema | BaseColumnCondition.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |
Schema | ColumnCondition.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |

Modifier and Type | Method and Description |
---|---|
Schema | SequenceLengthCondition.getInputSchema() |
Schema | SequenceLengthCondition.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
void | SequenceLengthCondition.setInputSchema(Schema schema) |
Schema | SequenceLengthCondition.transform(Schema inputSchema) |
Modifier and Type | Field and Description |
---|---|
protected Schema | BaseColumnFilter.schema |

Modifier and Type | Method and Description |
---|---|
Schema | ConditionFilter.getInputSchema() |
Schema | Filter.getInputSchema() |
Schema | FilterInvalidValues.getInputSchema() |
Schema | InvalidNumColumns.getInputSchema() |
Schema | ConditionFilter.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |
Schema | FilterInvalidValues.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |
Schema | InvalidNumColumns.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |

Modifier and Type | Method and Description |
---|---|
void | BaseColumnFilter.setInputSchema(Schema schema) |
void | ConditionFilter.setInputSchema(Schema schema) |
void | Filter.setInputSchema(Schema schema) |
void | FilterInvalidValues.setInputSchema(Schema schema) |
void | InvalidNumColumns.setInputSchema(Schema schema) |
Schema | ConditionFilter.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |
Schema | FilterInvalidValues.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |
Schema | InvalidNumColumns.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |
Modifier and Type | Method and Description |
---|---|
Schema | Join.getOutputSchema() |

Modifier and Type | Method and Description |
---|---|
Join.Builder | Join.Builder.setSchemas(Schema left, Schema right) |

Modifier and Type | Method and Description |
---|---|
Schema | NDArrayDistanceTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
protected ColumnMetaData | NDArrayColumnsMathOpTransform.derivedColumnMetaData(String newColumnName, Schema inputSchema) |
void | NDArrayDistanceTransform.setInputSchema(Schema inputSchema) |
Schema | NDArrayDistanceTransform.transform(Schema inputSchema) |
Modifier and Type | Method and Description |
---|---|
Schema | CalculateSortedRank.getInputSchema() |
Schema | CalculateSortedRank.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
void | CalculateSortedRank.setInputSchema(Schema schema) |
Schema | CalculateSortedRank.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
Schema | IAssociativeReducer.getInputSchema() |
Schema | Reducer.getInputSchema() |
Schema | IAssociativeReducer.transform(Schema schema) |
Schema | Reducer.transform(Schema schema) Get the output schema, given the input schema |

Modifier and Type | Method and Description |
---|---|
void | IAssociativeReducer.setInputSchema(Schema schema) |
void | Reducer.setInputSchema(Schema schema) |
Schema | IAssociativeReducer.transform(Schema schema) |
Schema | Reducer.transform(Schema schema) Get the output schema, given the input schema |

Modifier and Type | Method and Description |
---|---|
Schema | CoordinatesReduction.getInputSchema() |
Schema | CoordinatesReduction.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
void | CoordinatesReduction.setInputSchema(Schema inputSchema) |
Schema | CoordinatesReduction.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
Schema | GeographicMidpointReduction.getInputSchema() |
Schema | GeographicMidpointReduction.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
void | GeographicMidpointReduction.setInputSchema(Schema inputSchema) |
Schema | GeographicMidpointReduction.transform(Schema inputSchema) |
Modifier and Type | Class and Description |
---|---|
class | SequenceSchema A SequenceSchema is a Schema for sequential data. |

Modifier and Type | Method and Description |
---|---|
Schema | InferredSchema.build() |
Schema | Schema.Builder.build() Create the Schema |
static Schema | Schema.fromJson(String json) Create a schema from a given JSON string |
static Schema | Schema.fromYaml(String yaml) Create a schema from the given YAML string |
static Schema | Schema.infer(List<Writable> record) Infers a schema based on the record. |
static Schema | Schema.inferMultiple(List<List<Writable>> record) Infers a schema based on the records. |
Schema | Schema.newSchema(List<ColumnMetaData> columnMetaData) Create a new schema based on the new metadata |

Modifier and Type | Method and Description |
---|---|
List<ColumnMetaData> | Schema.differences(Schema schema) Compute the difference in ColumnMetaData between this schema and the passed-in schema. |
boolean | Schema.sameTypes(Schema schema) Returns true if the given schema has the same types at each index |
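The Schema.Builder and serialization methods above can be combined as in this minimal sketch, assuming the DataVec API (column names are illustrative):

```java
import org.datavec.api.transform.schema.Schema;

public class SchemaJsonExample {
    public static void main(String[] args) {
        Schema schema = new Schema.Builder()
                .addColumnString("id")
                .addColumnDouble("value")
                .build();

        // Round-trip through JSON; YAML works the same way via toYaml/fromYaml
        String json = schema.toJson();
        Schema restored = Schema.fromJson(json);

        // sameTypes(...) checks the column types match at each index
        System.out.println(schema.sameTypes(restored));
    }
}
```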
Modifier and Type | Method and Description |
---|---|
Schema | ReduceSequenceTransform.getInputSchema() |
Schema | SequenceSplit.getInputSchema() Getter for the input schema |
Schema | ReduceSequenceTransform.transform(Schema inputSchema) |
Schema | ConvertFromSequence.transform(SequenceSchema schema) |

Modifier and Type | Method and Description |
---|---|
void | ConvertToSequence.setInputSchema(Schema schema) |
void | ReduceSequenceTransform.setInputSchema(Schema inputSchema) |
void | SequenceSplit.setInputSchema(Schema inputSchema) Sets the input schema for this split |
void | SequenceComparator.setSchema(Schema sequenceSchema) |
SequenceSchema | ConvertToSequence.transform(Schema schema) |
Schema | ReduceSequenceTransform.transform(Schema inputSchema) |

Modifier and Type | Field and Description |
---|---|
protected Schema | BaseColumnComparator.schema |

Modifier and Type | Method and Description |
---|---|
Schema | BaseColumnComparator.getInputSchema() Getter for input schema |
Schema | BaseColumnComparator.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |

Modifier and Type | Method and Description |
---|---|
void | BaseColumnComparator.setInputSchema(Schema inputSchema) Set the input schema. |
void | BaseColumnComparator.setSchema(Schema sequenceSchema) |
void | NumericalColumnComparator.setSchema(Schema sequenceSchema) |
Schema | BaseColumnComparator.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |

Modifier and Type | Field and Description |
---|---|
protected Schema | BaseSequenceExpansionTransform.inputSchema |

Modifier and Type | Method and Description |
---|---|
Schema | BaseSequenceExpansionTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
Schema | BaseSequenceExpansionTransform.transform(Schema inputSchema) |
Modifier and Type | Method and Description |
---|---|
Schema | SequenceSplitTimeSeparation.getInputSchema() |
Schema | SplitMaxLengthSequence.getInputSchema() |

Modifier and Type | Method and Description |
---|---|
void | SequenceSplitTimeSeparation.setInputSchema(Schema inputSchema) |
void | SplitMaxLengthSequence.setInputSchema(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
Schema | SequenceTrimToLengthTransform.getInputSchema() |
Schema | SequenceTrimTransform.getInputSchema() |
Schema | SequenceTrimToLengthTransform.transform(Schema inputSchema) |
Schema | SequenceTrimTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
void | SequenceTrimToLengthTransform.setInputSchema(Schema inputSchema) |
void | SequenceTrimTransform.setInputSchema(Schema inputSchema) |
Schema | SequenceTrimToLengthTransform.transform(Schema inputSchema) |
Schema | SequenceTrimTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
Schema | OverlappingTimeWindowFunction.getInputSchema() |
Schema | ReduceSequenceByWindowTransform.getInputSchema() |
Schema | TimeWindowFunction.getInputSchema() |
Schema | WindowFunction.getInputSchema() |
Schema | OverlappingTimeWindowFunction.transform(Schema inputSchema) |
Schema | ReduceSequenceByWindowTransform.transform(Schema inputSchema) |
Schema | TimeWindowFunction.transform(Schema inputSchema) |
Schema | WindowFunction.transform(Schema inputSchema) Get the output schema, given the input schema. |

Modifier and Type | Method and Description |
---|---|
void | OverlappingTimeWindowFunction.setInputSchema(Schema schema) |
void | ReduceSequenceByWindowTransform.setInputSchema(Schema inputSchema) |
void | TimeWindowFunction.setInputSchema(Schema schema) |
void | WindowFunction.setInputSchema(Schema schema) |
Schema | OverlappingTimeWindowFunction.transform(Schema inputSchema) |
Schema | ReduceSequenceByWindowTransform.transform(Schema inputSchema) |
Schema | TimeWindowFunction.transform(Schema inputSchema) |
Schema | WindowFunction.transform(Schema inputSchema) Get the output schema, given the input schema. |

Modifier and Type | Method and Description |
---|---|
Schema | IStringReducer.getInputSchema() |
Schema | StringReducer.getInputSchema() |
Schema | IStringReducer.transform(Schema schema) |
Schema | StringReducer.transform(Schema schema) Get the output schema, given the input schema |

Modifier and Type | Method and Description |
---|---|
void | IStringReducer.setInputSchema(Schema schema) |
void | StringReducer.setInputSchema(Schema schema) |
Schema | IStringReducer.transform(Schema schema) |
Schema | StringReducer.transform(Schema schema) Get the output schema, given the input schema |
Modifier and Type | Field and Description |
---|---|
protected Schema | BaseTransform.inputSchema |

Modifier and Type | Method and Description |
---|---|
Schema | BaseColumnsMathOpTransform.getInputSchema() |
Schema | BaseTransform.getInputSchema() |
Schema | BaseColumnsMathOpTransform.transform(Schema inputSchema) |
Schema | BaseColumnTransform.transform(Schema schema) |

Modifier and Type | Method and Description |
---|---|
protected abstract ColumnMetaData | BaseColumnsMathOpTransform.derivedColumnMetaData(String newColumnName, Schema inputSchema) |
void | BaseColumnsMathOpTransform.setInputSchema(Schema inputSchema) |
void | BaseColumnTransform.setInputSchema(Schema inputSchema) |
void | BaseTransform.setInputSchema(Schema inputSchema) |
Schema | BaseColumnsMathOpTransform.transform(Schema inputSchema) |
Schema | BaseColumnTransform.transform(Schema schema) |

Modifier and Type | Method and Description |
---|---|
Schema | CategoricalToIntegerTransform.transform(Schema schema) |
Schema | CategoricalToOneHotTransform.transform(Schema schema) |
Schema | FirstDigitTransform.transform(Schema inputSchema) |
Schema | PivotTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
void | CategoricalToIntegerTransform.setInputSchema(Schema inputSchema) |
void | CategoricalToOneHotTransform.setInputSchema(Schema inputSchema) |
void | FirstDigitTransform.setInputSchema(Schema schema) |
Schema | CategoricalToIntegerTransform.transform(Schema schema) |
Schema | CategoricalToOneHotTransform.transform(Schema schema) |
Schema | FirstDigitTransform.transform(Schema inputSchema) |
Schema | PivotTransform.transform(Schema inputSchema) |
Modifier and Type | Method and Description |
---|---|
Schema | AddConstantColumnTransform.getInputSchema() |
Schema | DuplicateColumnsTransform.getInputSchema() |
Schema | RenameColumnsTransform.getInputSchema() |
Schema | ReorderColumnsTransform.getInputSchema() |
Schema | AddConstantColumnTransform.transform(Schema inputSchema) |
Schema | DuplicateColumnsTransform.transform(Schema inputSchema) |
Schema | RemoveAllColumnsExceptForTransform.transform(Schema schema) |
Schema | RemoveColumnsTransform.transform(Schema schema) |
Schema | RenameColumnsTransform.transform(Schema inputSchema) |
Schema | ReorderColumnsTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
void | AddConstantColumnTransform.setInputSchema(Schema inputSchema) |
void | DuplicateColumnsTransform.setInputSchema(Schema inputSchema) |
void | RemoveAllColumnsExceptForTransform.setInputSchema(Schema schema) |
void | RemoveColumnsTransform.setInputSchema(Schema schema) |
void | RenameColumnsTransform.setInputSchema(Schema inputSchema) |
void | ReorderColumnsTransform.setInputSchema(Schema inputSchema) |
Schema | AddConstantColumnTransform.transform(Schema inputSchema) |
Schema | DuplicateColumnsTransform.transform(Schema inputSchema) |
Schema | RemoveAllColumnsExceptForTransform.transform(Schema schema) |
Schema | RemoveColumnsTransform.transform(Schema schema) |
Schema | RenameColumnsTransform.transform(Schema inputSchema) |
Schema | ReorderColumnsTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
Schema | ConditionalCopyValueTransform.getInputSchema() |
Schema | ConditionalReplaceValueTransform.getInputSchema() |
Schema | ConditionalReplaceValueTransformWithDefault.getInputSchema() |
Schema | ConditionalCopyValueTransform.transform(Schema inputSchema) |
Schema | ConditionalReplaceValueTransform.transform(Schema inputSchema) |
Schema | ConditionalReplaceValueTransformWithDefault.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
void | ConditionalCopyValueTransform.setInputSchema(Schema inputSchema) |
void | ConditionalReplaceValueTransform.setInputSchema(Schema inputSchema) |
void | ConditionalReplaceValueTransformWithDefault.setInputSchema(Schema inputSchema) |
Schema | ConditionalCopyValueTransform.transform(Schema inputSchema) |
Schema | ConditionalReplaceValueTransform.transform(Schema inputSchema) |
Schema | ConditionalReplaceValueTransformWithDefault.transform(Schema inputSchema) |
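A conditional replace transform is typically built through the TransformProcess builder rather than constructed directly; here is a hedged sketch assuming the DataVec API (the column name and threshold are illustrative):

```java
import org.datavec.api.transform.TransformProcess;
import org.datavec.api.transform.condition.ConditionOp;
import org.datavec.api.transform.condition.column.DoubleColumnCondition;
import org.datavec.api.transform.schema.Schema;
import org.datavec.api.writable.DoubleWritable;

public class ConditionalReplaceExample {
    public static void main(String[] args) {
        Schema schema = new Schema.Builder()
                .addColumnDouble("value")
                .build();

        // Replace negative values with 0.0; the condition's setInputSchema(...)
        // is invoked internally when the process is built and executed
        TransformProcess tp = new TransformProcess.Builder(schema)
                .conditionalReplaceValueTransform("value",
                        new DoubleWritable(0.0),
                        new DoubleColumnCondition("value", ConditionOp.LessThan, 0.0))
                .build();

        // Replacing values does not change column types, so the schema is unchanged
        System.out.println(tp.getFinalSchema());
    }
}
```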
Modifier and Type | Method and Description |
---|---|
protected ColumnMetaData | DoubleColumnsMathOpTransform.derivedColumnMetaData(String newColumnName, Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
protected ColumnMetaData | FloatColumnsMathOpTransform.derivedColumnMetaData(String newColumnName, Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
protected ColumnMetaData | CoordinatesDistanceTransform.derivedColumnMetaData(String newColumnName, Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
Schema | IntegerToOneHotTransform.transform(Schema schema) |

Modifier and Type | Method and Description |
---|---|
protected ColumnMetaData | IntegerColumnsMathOpTransform.derivedColumnMetaData(String newColumnName, Schema inputSchema) |
void | IntegerToOneHotTransform.setInputSchema(Schema inputSchema) |
Schema | IntegerToOneHotTransform.transform(Schema schema) |

Modifier and Type | Method and Description |
---|---|
protected ColumnMetaData | LongColumnsMathOpTransform.derivedColumnMetaData(String newColumnName, Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
Schema | ParseDoubleTransform.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |

Modifier and Type | Method and Description |
---|---|
Schema | ParseDoubleTransform.transform(Schema inputSchema) Get the output schema for this transformation, given an input schema |
Modifier and Type | Method and Description |
---|---|
Schema | SequenceDifferenceTransform.getInputSchema() |
Schema | SequenceMovingWindowReduceTransform.getInputSchema() |
Schema | SequenceDifferenceTransform.transform(Schema inputSchema) |
Schema | SequenceMovingWindowReduceTransform.transform(Schema inputSchema) |
Schema | SequenceOffsetTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
void | SequenceDifferenceTransform.setInputSchema(Schema inputSchema) |
void | SequenceMovingWindowReduceTransform.setInputSchema(Schema inputSchema) |
void | SequenceOffsetTransform.setInputSchema(Schema inputSchema) |
Schema | SequenceDifferenceTransform.transform(Schema inputSchema) |
Schema | SequenceMovingWindowReduceTransform.transform(Schema inputSchema) |
Schema | SequenceOffsetTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
Schema | ConcatenateStringColumns.getInputSchema() |
Schema | ConcatenateStringColumns.transform(Schema inputSchema) |
Schema | StringListToCategoricalSetTransform.transform(Schema inputSchema) |
Schema | StringListToCountsNDArrayTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
void | ConcatenateStringColumns.setInputSchema(Schema inputSchema) |
void | StringListToCategoricalSetTransform.setInputSchema(Schema inputSchema) |
void | StringListToCountsNDArrayTransform.setInputSchema(Schema inputSchema) |
Schema | ConcatenateStringColumns.transform(Schema inputSchema) |
Schema | StringListToCategoricalSetTransform.transform(Schema inputSchema) |
Schema | StringListToCountsNDArrayTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
Schema | DeriveColumnsFromTimeTransform.getInputSchema() |
Schema | DeriveColumnsFromTimeTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
void | DeriveColumnsFromTimeTransform.setInputSchema(Schema inputSchema) |
Schema | DeriveColumnsFromTimeTransform.transform(Schema inputSchema) |
Modifier and Type | Method and Description |
---|---|
static void | HtmlSequencePlotting.createHtmlSequencePlotFile(String title, Schema schema, List<List<Writable>> sequence, File output) Create an HTML file with plots for the given sequence and write it to a file. |
static String | HtmlSequencePlotting.createHtmlSequencePlots(String title, Schema schema, List<List<Writable>> sequence) Create an HTML page with plots for the given sequence. |

Modifier and Type | Method and Description |
---|---|
static List<Writable> | RecordConverter.toRecord(Schema schema, List<Object> source) Convert a collection into a List<Writable> |
Modifier and Type | Method and Description |
---|---|
static Schema | ArrowConverter.toDatavecSchema(org.apache.arrow.vector.types.pojo.Schema schema) Convert an Arrow Schema to a DataVec Schema |

Modifier and Type | Method and Description |
---|---|
static Pair<Schema,ArrowWritableRecordBatch> | ArrowConverter.readFromBytes(byte[] input) Read a DataVec schema and record set from the given bytes (usually expected to be an Arrow-format file) |
static Pair<Schema,ArrowWritableRecordBatch> | ArrowConverter.readFromFile(File input) Read a DataVec schema and record set from the given Arrow file. |
static Pair<Schema,ArrowWritableRecordBatch> | ArrowConverter.readFromFile(FileInputStream input) Read a DataVec schema and record set from the given Arrow file. |

Modifier and Type | Method and Description |
---|---|
static List<org.apache.arrow.vector.FieldVector> | ArrowConverter.toArrowColumns(org.apache.arrow.memory.BufferAllocator bufferAllocator, Schema schema, List<List<Writable>> dataVecRecord) Given a buffer allocator and DataVec schema, convert the passed-in batch of records to a set of Arrow columns |
static List<org.apache.arrow.vector.FieldVector> | ArrowConverter.toArrowColumnsString(org.apache.arrow.memory.BufferAllocator bufferAllocator, Schema schema, List<List<String>> dataVecRecord) Convert a set of input strings to Arrow columns |
static List<org.apache.arrow.vector.FieldVector> | ArrowConverter.toArrowColumnsStringSingle(org.apache.arrow.memory.BufferAllocator bufferAllocator, Schema schema, List<String> dataVecRecord) Convert a set of input strings to Arrow columns |
static List<org.apache.arrow.vector.FieldVector> | ArrowConverter.toArrowColumnsStringTimeSeries(org.apache.arrow.memory.BufferAllocator bufferAllocator, Schema schema, List<List<List<String>>> dataVecRecord) Convert a set of input strings to Arrow columns for a time series. |
static List<org.apache.arrow.vector.FieldVector> | ArrowConverter.toArrowColumnsTimeSeries(org.apache.arrow.memory.BufferAllocator bufferAllocator, Schema schema, List<List<List<Writable>>> dataVecRecord) Convert a set of input writables to Arrow columns for a time series. |
static <T> List<org.apache.arrow.vector.FieldVector> | ArrowConverter.toArrowColumnsTimeSeriesHelper(org.apache.arrow.memory.BufferAllocator bufferAllocator, Schema schema, List<List<List<T>>> dataVecRecord) Convert a set of input values to Arrow columns for a time series. |
static org.apache.arrow.vector.types.pojo.Schema | ArrowConverter.toArrowSchema(Schema schema) Convert a DataVec Schema to an Arrow Schema |
static ArrowWritableRecordBatch | ArrowConverter.toArrowWritables(List<org.apache.arrow.vector.FieldVector> fieldVectors, Schema schema) Convert the input field vectors (the input data) and the given schema to a proper list of writables. |
static List<Writable> | ArrowConverter.toArrowWritablesSingle(List<org.apache.arrow.vector.FieldVector> fieldVectors, Schema schema) Return a singular record based on the converted writables result. |
static List<List<List<Writable>>> | ArrowConverter.toArrowWritablesTimeSeries(List<org.apache.arrow.vector.FieldVector> fieldVectors, Schema schema, int timeSeriesLength) Convert the input field vectors (the input data) and the given schema to a proper list of writables. |
static void | ArrowConverter.writeRecordBatchTo(org.apache.arrow.memory.BufferAllocator bufferAllocator, List<List<Writable>> recordBatch, Schema inputSchema, OutputStream outputStream) Write the records to the given output stream |
static void | ArrowConverter.writeRecordBatchTo(List<List<Writable>> recordBatch, Schema inputSchema, OutputStream outputStream) Write the records to the given output stream |

Constructor and Description |
---|
ArrowRecordWriter(Schema schema) |
ArrowWritableRecordBatch(List<org.apache.arrow.vector.FieldVector> list, Schema schema) An index into an individual ArrowRecordBatch |
ArrowWritableRecordBatch(List<org.apache.arrow.vector.FieldVector> list, Schema schema, int offset, int rows) |
ArrowWritableRecordTimeSeriesBatch(List<org.apache.arrow.vector.FieldVector> list, Schema schema, int timeSeriesStride) An index into an individual ArrowRecordBatch |
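The writeRecordBatchTo/readFromBytes pair above can round-trip a record batch through the Arrow format. A minimal sketch, assuming the DataVec Arrow module (exact package paths, e.g. for Pair, may vary between versions):

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.List;

import org.datavec.api.transform.schema.Schema;
import org.datavec.api.writable.IntWritable;
import org.datavec.api.writable.Writable;
import org.datavec.arrow.ArrowConverter;
import org.datavec.arrow.recordreader.ArrowWritableRecordBatch;
import org.nd4j.linalg.primitives.Pair;

public class ArrowRoundTripExample {
    public static void main(String[] args) throws Exception {
        Schema schema = new Schema.Builder().addColumnInteger("x").build();
        List<List<Writable>> batch = Arrays.asList(
                Arrays.<Writable>asList(new IntWritable(1)),
                Arrays.<Writable>asList(new IntWritable(2)));

        // Write the batch in Arrow format, then read the schema + records back
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        ArrowConverter.writeRecordBatchTo(batch, schema, baos);
        Pair<Schema, ArrowWritableRecordBatch> read =
                ArrowConverter.readFromBytes(baos.toByteArray());

        // The recovered schema should have the same column types as the original
        System.out.println(read.getFirst().sameTypes(schema));
    }
}
```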
Modifier and Type | Method and Description |
---|---|
static DataAnalysis | AnalyzeLocal.analyze(Schema schema, RecordReader rr) Analyze the specified data - returns a DataAnalysis object with summary information about each column |
static DataAnalysis | AnalyzeLocal.analyze(Schema schema, RecordReader rr, int maxHistogramBuckets) Analyze the specified data - returns a DataAnalysis object with summary information about each column |
static DataQualityAnalysis | AnalyzeLocal.analyzeQuality(Schema schema, RecordReader data) Analyze the data quality of data - provides a report on missing values, values that don't comply with the schema, etc. |
static DataQualityAnalysis | AnalyzeLocal.analyzeQualitySequence(Schema schema, SequenceRecordReader data) Analyze the data quality of sequence data - provides a report on missing values, values that don't comply with the schema, etc. |
static List<List<Writable>> | LocalTransformExecutor.convertStringInput(List<List<String>> stringInput, Schema schema) Convert string input to the proper writable set based on the schema. |
static List<List<List<Writable>>> | LocalTransformExecutor.convertStringInputTimeSeries(List<List<List<String>>> stringInput, Schema schema) Convert a string time series to the proper writable set based on the schema. |
static List<List<String>> | LocalTransformExecutor.convertWritableInputToString(List<List<Writable>> stringInput, Schema schema) Convert writable input to strings based on the schema. |
static List<List<List<String>>> | LocalTransformExecutor.convertWritableInputToStringTimeSeries(List<List<List<Writable>>> stringInput, Schema schema) Convert a writable time series to strings based on the schema. |
static Map<String,Set<Writable>> | AnalyzeLocal.getUnique(List<String> columnNames, Schema schema, RecordReader data) Get a list of unique values from the specified columns. |
static Set<Writable> | AnalyzeLocal.getUnique(String columnName, Schema schema, RecordReader data) Get a list of unique values from the specified column. |
static Map<String,Set<Writable>> | AnalyzeLocal.getUniqueSequence(List<String> columnNames, Schema schema, SequenceRecordReader sequenceData) Get a list of unique values from the specified columns of a sequence |
static Set<Writable> | AnalyzeLocal.getUniqueSequence(String columnName, Schema schema, SequenceRecordReader sequenceData) Get a list of unique values from the specified column of a sequence |
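AnalyzeLocal pairs a Schema with a RecordReader to produce per-column statistics. A hedged sketch, assuming the datavec-local and datavec-api modules (the CSV path and column names are placeholders):

```java
import java.io.File;

import org.datavec.api.records.reader.RecordReader;
import org.datavec.api.records.reader.impl.csv.CSVRecordReader;
import org.datavec.api.split.FileSplit;
import org.datavec.api.transform.analysis.DataAnalysis;
import org.datavec.api.transform.schema.Schema;
import org.datavec.local.transforms.AnalyzeLocal;

public class AnalyzeLocalExample {
    public static void main(String[] args) throws Exception {
        // Schema describing the columns of the CSV; names are illustrative
        Schema schema = new Schema.Builder()
                .addColumnDouble("x")
                .addColumnDouble("y")
                .build();

        // "data.csv" is a placeholder path
        RecordReader rr = new CSVRecordReader();
        rr.initialize(new FileSplit(new File("data.csv")));

        // Summary statistics (min/max/mean, counts, etc.) for each column
        DataAnalysis analysis = AnalyzeLocal.analyze(schema, rr);
        System.out.println(analysis);
    }
}
```

analyzeQuality works the same way but reports missing and schema-violating values instead of summary statistics.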
Modifier and Type | Method and Description |
---|---|
Schema | TokenizerBagOfWordsTermSequenceIndexTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
Schema | TokenizerBagOfWordsTermSequenceIndexTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
static Schema | PythonUtils.fromPythonVariables(PythonVariables input) Create a Schema from PythonVariables. |
Schema | PythonCondition.getInputSchema() |
Schema | PythonTransform.getInputSchema() |
Schema | PythonCondition.transform(Schema inputSchema) |
Schema | PythonTransform.transform(Schema inputSchema) |

Modifier and Type | Method and Description |
---|---|
static PythonVariables | PythonUtils.fromSchema(Schema input) Create PythonVariables from an input Schema. Types are mapped to types of the same name |
static PythonVariables | PythonUtils.schemaToPythonVariables(Schema schema) Convert a Schema to PythonVariables |
void | PythonCondition.setInputSchema(Schema inputSchema) |
void | PythonTransform.setInputSchema(Schema inputSchema) |
Schema | PythonCondition.transform(Schema inputSchema) |
Schema | PythonTransform.transform(Schema inputSchema) |

Constructor and Description |
---|
PythonTransform(String code, PythonVariables inputs, PythonVariables outputs, String name, Schema inputSchema, Schema outputSchema, String outputDict, boolean returnAllInputs, boolean setupAndRun) |
Modifier and Type | Method and Description |
---|---|
static Schema | DataFrames.fromStructType(org.apache.spark.sql.types.StructType structType) Create a DataVec schema from a Spark struct type |

Modifier and Type | Method and Description |
---|---|
static Pair<Schema,org.apache.spark.api.java.JavaRDD<List<Writable>>> | DataFrames.toRecords(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame) Create a compatible schema and RDD for DataVec |
static Pair<Schema,org.apache.spark.api.java.JavaRDD<List<List<Writable>>>> | DataFrames.toRecordsSequence(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame) Convert the given DataFrame to a sequence. Note: it is assumed here that the DataFrame has been created by DataFrames.toDataFrameSequence(Schema, JavaRDD). |
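The DataFrames helpers above translate between Spark and DataVec representations in both directions. A hedged sketch, assuming the datavec-spark module (the method is illustrative; it takes an already-constructed DataFrame):

```java
import java.util.List;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.types.StructType;
import org.datavec.api.transform.schema.Schema;
import org.datavec.api.writable.Writable;
import org.datavec.spark.transform.DataFrames;
import org.nd4j.linalg.primitives.Pair;

public class DataFramesExample {
    // Convert a Spark DataFrame to a DataVec schema plus an RDD of records,
    // then map the schema back to a Spark StructType
    public static void roundTrip(Dataset<Row> dataFrame) {
        Pair<Schema, JavaRDD<List<Writable>>> records = DataFrames.toRecords(dataFrame);
        Schema schema = records.getFirst();
        StructType structType = DataFrames.fromSchema(schema);
        System.out.println(structType);
    }
}
```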
Modifier and Type | Method and Description |
---|---|
static DataAnalysis |
AnalyzeSpark.analyze(Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data)
Analyse the specified data - returns a DataAnalysis object with summary information about each column
|
static DataAnalysis |
AnalyzeSpark.analyze(Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data,
int maxHistogramBuckets) |
static DataQualityAnalysis |
AnalyzeSpark.analyzeQuality(Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data)
Analyze the data quality of data - provides a report on missing values, values that don't comply with schema, etc
|
static DataQualityAnalysis |
AnalyzeSpark.analyzeQualitySequence(Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> data)
Analyze the data quality of sequence data - provides a report on missing values, values that don't comply with schema, etc
|
static SequenceDataAnalysis |
AnalyzeSpark.analyzeSequence(Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> data) |
static SequenceDataAnalysis |
AnalyzeSpark.analyzeSequence(Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> data,
int maxHistogramBuckets) |
static org.apache.spark.sql.types.StructType |
DataFrames.fromSchema(Schema schema)
Convert a datavec schema to a
struct type in spark
|
static org.apache.spark.sql.types.StructType |
DataFrames.fromSchemaSequence(Schema schema)
Convert the DataVec sequence schema to a StructType for Spark, for example for use in
DataFrames.toDataFrameSequence(Schema, JavaRDD) }
Note: as per DataFrames.toDataFrameSequence(Schema, JavaRDD) }, the StructType has two additional columns added to it:- Column 0: Sequence UUID (name: DataFrames.SEQUENCE_UUID_COLUMN ) - a UUID for the original sequence- Column 1: Sequence index (name: DataFrames.SEQUENCE_INDEX_COLUMN - an index (integer, starting at 0) for the position
of this record in the original time series. |
static Map<String,List<Writable>> |
AnalyzeSpark.getUnique(List<String> columnNames,
Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data)
Get a list of unique values from the specified columns.
|
static List<Writable> |
AnalyzeSpark.getUnique(String columnName,
Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data)
Get a list of unique values from the specified columns.
|
static Map<String,List<Writable>> |
AnalyzeSpark.getUniqueSequence(List<String> columnNames,
Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> sequenceData)
Get a list of unique values from the specified columns of a sequence
|
static List<Writable> |
AnalyzeSpark.getUniqueSequence(String columnName,
Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> sequenceData)
Get a list of unique values from the specified column of a sequence
|
static Writable |
AnalyzeSpark.max(org.apache.spark.api.java.JavaRDD<List<Writable>> allData,
String columnName,
Schema schema)
Get the maximum value for the specified column
|
static Writable |
AnalyzeSpark.min(org.apache.spark.api.java.JavaRDD<List<Writable>> allData,
String columnName,
Schema schema)
Get the minimum value for the specified column
|
static org.apache.spark.api.java.JavaRDD<List<Writable>> |
Normalization.normalize(Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data)
Scale all data to the range 0 to 1
|
static org.apache.spark.api.java.JavaRDD<List<Writable>> |
Normalization.normalize(Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data,
double min,
double max)
Scale based on the specified min/max
|
static org.apache.spark.api.java.JavaRDD<List<Writable>> |
Normalization.normalize(Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data,
double min,
double max,
List<String> skipColumns)
Scale based on the specified min/max
|
static org.apache.spark.api.java.JavaRDD<List<Writable>> |
Normalization.normalize(Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data,
List<String> skipColumns)
Scale all data to the range 0 to 1
|
static org.apache.spark.api.java.JavaRDD<List<List<Writable>>> |
Normalization.normalizeSequence(Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> data) |
static org.apache.spark.api.java.JavaRDD<List<List<Writable>>> |
Normalization.normalizeSequence(Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> data,
double min,
double max)
Normalize each column of a sequence, based on min/max
|
static org.apache.spark.api.java.JavaRDD<List<List<Writable>>> |
Normalization.normalizeSequence(Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> data,
double min,
double max,
List<String> excludeColumns)
Normalize each column of a sequence, based on min/max
|
static List<Writable> |
DataFrames.rowToWritables(Schema schema,
org.apache.spark.sql.Row row)
Convert a given Row to a list of writables, given the specified Schema
|
static List<Writable> |
AnalyzeSpark.sampleFromColumn(int count,
String columnName,
Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data)
Randomly sample values from a single column
|
static List<Writable> |
AnalyzeSpark.sampleFromColumnSequence(int count,
String columnName,
Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> sequenceData)
Randomly sample values from a single column, in all sequences.
|
static List<Writable> |
AnalyzeSpark.sampleInvalidFromColumn(int numToSample,
String columnName,
Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data)
Randomly sample a set of invalid values from a specified column.
|
static List<Writable> |
AnalyzeSpark.sampleInvalidFromColumn(int numToSample,
String columnName,
Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data,
boolean ignoreMissing)
Randomly sample a set of invalid values from a specified column.
|
static List<Writable> |
AnalyzeSpark.sampleInvalidFromColumnSequence(int numToSample,
String columnName,
Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> data)
Randomly sample a set of invalid values from a specified column, for a sequence data set.
|
static Map<Writable,Long> |
AnalyzeSpark.sampleMostFrequentFromColumn(int nMostFrequent,
String columnName,
Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data)
Sample the N most frequently occurring values in the specified column
|
static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
DataFrames.toDataFrame(Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data)
Create a DataFrame from an RDD of writables, given a schema
|
static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> |
DataFrames.toDataFrameSequence(Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> data)
Convert the given sequence data set to a DataFrame.
|
static org.apache.spark.api.java.JavaRDD<List<Writable>> |
Normalization.zeromeanUnitVariance(Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data)
Normalize by zero mean unit variance
|
static org.apache.spark.api.java.JavaRDD<List<Writable>> |
Normalization.zeromeanUnitVariance(Schema schema,
org.apache.spark.api.java.JavaRDD<List<Writable>> data,
List<String> skipColumns)
Normalize by zero mean unit variance
|
static org.apache.spark.api.java.JavaRDD<List<List<Writable>>> |
Normalization.zeroMeanUnitVarianceSequence(Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> sequence)
Normalize the sequence by zero mean unit variance
|
static org.apache.spark.api.java.JavaRDD<List<List<Writable>>> |
Normalization.zeroMeanUnitVarianceSequence(Schema schema,
org.apache.spark.api.java.JavaRDD<List<List<Writable>>> sequence,
List<String> excludeColumns)
Normalize the sequence by zero mean unit variance
|
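The Normalization methods above apply standard per-column transforms: min/max scaling (normalize) and standardization (zeromeanUnitVariance). The per-column math can be sketched in plain Java, without Spark or DataVec; this is an illustrative sketch of the formulas, not the library's implementation, and the helper names here are hypothetical.

```java
import java.util.Arrays;

// Sketch of the per-column math behind Normalization.normalize
// (min/max scaling) and Normalization.zeromeanUnitVariance.
public class NormalizationSketch {

    // Scale values into [newMin, newMax] based on the column's observed min/max.
    static double[] minMaxScale(double[] column, double newMin, double newMax) {
        double min = Arrays.stream(column).min().orElse(0.0);
        double max = Arrays.stream(column).max().orElse(0.0);
        double range = max - min;
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            // Guard against a constant column (range == 0)
            double scaled = range == 0.0 ? 0.0 : (column[i] - min) / range;
            out[i] = newMin + scaled * (newMax - newMin);
        }
        return out;
    }

    // Standardize: subtract the column mean, divide by the standard deviation.
    static double[] zeroMeanUnitVariance(double[] column) {
        double mean = Arrays.stream(column).average().orElse(0.0);
        double var = 0.0;
        for (double v : column) var += (v - mean) * (v - mean);
        double std = Math.sqrt(var / column.length);
        double[] out = new double[column.length];
        for (int i = 0; i < column.length; i++) {
            out[i] = std == 0.0 ? 0.0 : (column[i] - mean) / std;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] col = {2.0, 4.0, 6.0};
        System.out.println(Arrays.toString(minMaxScale(col, 0.0, 1.0))); // [0.0, 0.5, 1.0]
        System.out.println(Arrays.toString(zeroMeanUnitVariance(col)));
    }
}
```

The skipColumns/excludeColumns overloads simply leave the named columns out of this transform.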
Constructor and Description |
---|
SequenceToRows(Schema schema) |
ToRow(Schema schema) |
Modifier and Type | Method and Description |
---|---|
static void |
SparkUtils.writeSchema(String outputPath,
Schema schema,
org.apache.spark.api.java.JavaSparkContext sc)
Write a schema to an HDFS (or local) file in a human-readable format
|
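The idea behind SparkUtils.writeSchema - dumping a human-readable description of a schema's columns to a file - can be sketched in plain Java. The tab-separated format and the example columns below are hypothetical; the real method takes a DataVec Schema plus a JavaSparkContext and can also write to HDFS.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

public class WriteSchemaSketch {

    // Render a column-name -> column-type mapping as a tab-separated listing.
    static String render(Map<String, String> columns) {
        StringBuilder sb = new StringBuilder("name\ttype\n");
        columns.forEach((name, type) -> sb.append(name).append('\t').append(type).append('\n'));
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("userId", "Integer");  // hypothetical columns
        columns.put("amount", "Double");

        // Local file here; the real writeSchema can also target HDFS paths.
        Path out = Files.createTempFile("schema", ".txt");
        Files.writeString(out, render(columns));
        System.out.println(Files.readString(out));
    }
}
```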
Copyright © 2020. All rights reserved.