ND4J/Deeplearning4j: Added support for CUDA 10.0. Dropped support for CUDA 8.0. (1.0.0-beta3 release has CUDA 9.0, 9.2 and 10.0 support)
SameDiff now supports training and evaluation from DataSetIterator and MultiDataSetIterator. Evaluation classes have been moved to ND4J.
DL4J Spark training (gradient sharing) is now fully fault tolerant, and has improvements for threshold adaptation (potentially more robust convergence). Ports can now be easily configured independently on master/workers.
ParallelInference: Instances can now update the model in real-time (without re-init) Link
ParallelInference: Added ParallelInference INPLACE mode Link
Added validation for incompatible loss/activation function combinations (such as softmax+nOut=1, or sigmoid+mcxent). New validation can be disabled using outputValidation(false) Link
Spark training: Added full fault tolerance (robust failure recovery) for gradient sharing implementation Link Link
Spark training now supports configuring ports more flexibly (and differently for different workers) using PortSupplier Link Link
Spark training: overhauled gradient sharing threshold adaptation algorithms; made it possible to customize threshold settings, and made defaults more robust to initial threshold configuration, improving convergence speed in some cases. Link
Spark training: implemented chunked messaging to reduce memory requirements (and insufficient buffer length issues) for large messages Link
Spark training: Added MeshBuildMode configuration for improved scalability for large clusters Link
Spark network data pipelines: added FileBatch, FileBatchRecordReader etc for “small files” (images etc) distributed training use cases Link
Added FailureTestingListener for fault tolerance/debugging purposes Link
Upgraded Apache Lucene/Solr to version 7.5.0 (from 7.4.0) Link
Added system properties (org.deeplearning4j.tempdir and org.nd4j.tempdir) to allow overriding of the temporary directories ND4J and DL4J use (see the sketch at the end of this list) Link Link
Made MultiLayerNetwork/ComputationGraph.clearLayerStates methods public (previously protected) Link
AbstractLayer.layerConf() method is now public Link
ParallelWrapper module now no longer has a Scala version suffix for artifact id; new artifact id is deeplearning4j-parallel-wrapper Link
Improved validation and error messages for invalid inputs/labels in Yolo2OutputLayer Link
Spark training: added SharedTrainingMaster.Builder.workerTogglePeriodicGC and .workerPeriodicGCFrequency to easily configure the ND4J garbage collection configuration on workers. Set default GC to 5 seconds on workers Link
Spark training: added threshold encoding debug mode (logs current threshold and encoding statistics on each worker during training). Enable using SharedTrainingConfiguration.builder.encodingDebugMode(true). Note this operation has computational overhead. Link
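For the temporary directory system properties mentioned above, a minimal sketch of overriding them programmatically; the property names come from the note, while the paths are arbitrary examples:

```java
// Override the ND4J/DL4J temporary directories before any native initialization occurs.
// Equivalently, pass -Dorg.nd4j.tempdir=... -Dorg.deeplearning4j.tempdir=... as JVM arguments.
public class TempDirConfig {
    public static void main(String[] args) {
        System.setProperty("org.nd4j.tempdir", "/data/nd4j-tmp");
        System.setProperty("org.deeplearning4j.tempdir", "/data/dl4j-tmp");
        // ... continue with normal ND4J/DL4J usage
    }
}
```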
Deeplearning4J: Bug Fixes and Optimizations
Fixed an issue where L1/L2 and updaters (Adam, Nesterov, etc) were applied before dividing gradients by minibatch to obtain average gradient. To maintain old behaviour, use NeuralNetConfiguration.Builder.legacyBatchScaledL2(true) (see the sketch at the end of this section) Link.
Note that learning rates may need to be decreased for some updaters (such as Adam) to account for this change vs. earlier versions. Some other updaters (such as SGD, NoOp, etc) should be unaffected.
Note that deserialized (loaded) configurations/networks saved in 1.0.0-beta2 or earlier will default to old behaviour for backward compatibility. All new networks (created in 1.0.0-beta3) will default to the new behaviour.
Fixed an issue where EarlyStoppingScoreCalculator would not correctly handle “maximize score” cases instead of minimizing Link
Fixed order (BGR vs. RGB) for VGG16ImagePreProcessor channel offset values Link
Fixed bug with variational autoencoders using weight noise Link
Fixed issue with BaseDataSetIterator not respecting the ‘maximum examples’ configuration Link
Optimization: A workspace is now used for ComputationGraph/MultiLayerNetwork evaluation methods (avoids allocating off-heap memory during evaluation that must be cleaned up by garbage collector) Link
Fixed an issue where shuffling combined with a subset for MnistDataSetIterator would not maintain the same subset between resets Link
Fixed issue with CNN to/from RNN preprocessors handling of mask arrays Link
Fixed issue with VGG16 non-pretrained configuration in model zoo Link
Fixed issue with TransferLearning nOutReplace where multiple layers in a row are modified Link
Fixed issue with CuDNN workspaces where backpropagation is performed outside of a standard fit call Link
Fixed an issue with dropout masks being cleared prematurely on output layers in ComputationGraph Link
RecordReaderMultiDataSetIterator now supports 5D arrays (for 3D CNNs) Link
Fixed bug in multi input/output ComputationGraphs with TBPTT combined with both masking and different number of input/output arrays Link
Improved input validation/exceptions for batch normalization layer Link
Fixed bug with TransferLearning GraphBuilder nOutReplace when combined with subsampling layers Link
SimpleRnnParamInitializer now properly respects bias initialization configuration Link
Fixed SqueezeNet zoo model non-pretrained configuration Link
Fixed Xception zoo model non-pretrained configuration Link
Fixed an issue with some evaluation signatures for multi-output ComputationGraphs Link
Improved MultiLayerNetwork/ComputationGraph summary method formatting for large nets Link
Fixed an issue where gradient normalization could result in NaNs if gradient is exactly 0.0 for all parameters in a layer Link
Fixed an issue where MultiLayerNetwork/ComputationGraph.setLearningRate could throw an exception for SGD and NoOp updaters Link
Fixed an issue with StackVertex plus masking in some rare cases Link
Fixed an issue with JSON deserialization of frozen layers in pre-1.0.0-alpha format Link
Fixed an issue where GraphBuilder.removeVertex can fail under some limited circumstances Link
Fixed a bug in CacheableExtractableDataSetFetcher Link
DL4J Spark training: Fixed issues with thread/device affinity for multi-GPU training + evaluation Link
DL4J Spark training: Made all Aeron threads daemon threads to prevent Aeron from stopping JVM shutdown when all other threads have completed Link
Added cudnnAllowFallback configuration for BatchNormalization layer (fallback to built-in implementation if CuDNN fails unexpectedly) Link
Fixed some rare concurrency issues with multi-worker (multi-GPU) nodes for Spark training Link Link
Fixed an issue with BatchNormalization layers that prevented the mean/variance estimates from being synced properly on each worker for GradientSharing training, causing convergence issues Link
Added a check to detect ZipSlip CVE attempts in ArchiveUtils Link
DL4J Spark training and evaluation: methods now use Hadoop Configuration from Spark context to ensure runtime-set configuration is available in Spark functions reading directly from remote storage (HDFS etc) Link
MultiLayerNetwork and ComputationGraph now properly support more than Integer.MAX_VALUE parameters Link Link
Added data validation for Nd4j.readTxt - now throws exception on invalid input instead of returning incorrect values Link
Fixed an issue with KNN implementation where a deadlock could occur if an invalid distance function (one returning “distances” less than 0) was utilized Link
Added synchronization to loading of Keras import models to avoid thread safety issues in the underlying HDFS library used for loading Link
Fixed rare issue for Async(Multi)DataSetIterator with large prefetch values Link
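For the minibatch-scaling change near the top of this section, a minimal sketch of opting back into the pre-1.0.0-beta3 behaviour via legacyBatchScaledL2(true); the updater, regularization value, and layer sizes are placeholder values:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.DenseLayer;
import org.deeplearning4j.nn.conf.layers.OutputLayer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class LegacyL2Config {
    public static MultiLayerConfiguration build() {
        return new NeuralNetConfiguration.Builder()
                .updater(new Adam(1e-3))
                .l2(1e-4)
                .legacyBatchScaledL2(true)   // restore pre-1.0.0-beta3 L1/L2 + updater ordering
                .list()
                .layer(new DenseLayer.Builder().nIn(784).nOut(128).activation(Activation.RELU).build())
                .layer(new OutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                        .activation(Activation.SOFTMAX).nIn(128).nOut(10).build())
                .build();
    }
}
```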
Deeplearning4J: API Changes (Transition Guide): 1.0.0-beta2 to 1.0.0-beta3
IEvaluation classes in DL4J have been deprecated and moved to ND4J so they are available for SameDiff training. Functionality and APIs are unchanged
MultiLayerConfiguration/ComputationGraphConfiguration pretrain(boolean) and backprop(boolean) have been deprecated and are no longer used. Use fit and pretrain/pretrainLayer methods instead (see the sketch after this list). Link
ParallelWrapper module now no longer has a Scala version suffix for artifact id; new artifact id is deeplearning4j-parallel-wrapper which should be used instead Link
deeplearning4j-nlp-korean module now has a Scala version suffix due to Scala dependencies; new artifact IDs are deeplearning4j-nlp-korean_2.10 and deeplearning4j-nlp-korean_2.11 Link
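For the deprecated pretrain(boolean)/backprop(boolean) flags above, a minimal sketch of the replacement calls; the network and iterator are assumed to be set up elsewhere:

```java
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;

public class FitInsteadOfFlags {
    static void train(MultiLayerNetwork net, DataSetIterator trainIter) {
        // Previously the config required .backprop(true).pretrain(false) - those flags are now ignored
        net.fit(trainIter);               // supervised training (backprop)
        // For unsupervised layerwise pretraining (e.g. autoencoder/VAE layers), call explicitly:
        net.pretrain(trainIter);          // pretrain all applicable layers
        net.pretrainLayer(0, trainIter);  // or pretrain a single layer by index
    }
}
```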
Deeplearning4J: Known issues: 1.0.0-beta3
Running multiple Spark training jobs simultaneously on the one physical node (i.e., multiple JVMs from one or more Spark jobs) may cause problems with network communication. A workaround for this is to manually set a unique stream ID in the VoidConfiguration. Use a unique (or random) integer value for different jobs (see the sketch below) Link
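A sketch of that workaround, assuming the Lombok-generated builder on VoidConfiguration and its streamId field; the value is arbitrary and only needs to differ between concurrently running jobs:

```java
import org.nd4j.parameterserver.distributed.conf.VoidConfiguration;

public class UniqueStreamId {
    public static VoidConfiguration forJob(int jobNumber) {
        return VoidConfiguration.builder()
                .streamId(100000 + jobNumber)   // unique per Spark training job on this node
                .build();
        // Pass the resulting VoidConfiguration into the SharedTrainingMaster setup for that job
    }
}
```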
Added SameDiff training and evaluation: SameDiff instances can now be trained directly using DataSetIterator and MultiDataSetIterator, and evaluated using IEvaluation instances (which have been moved from DL4J to ND4J) Link
Added GraphServer implementation: C++ inference server for SameDiff (and TensorFlow, via TF import) with a Java API Link
SameDiff instances can now be loaded from serialized FlatBuffers format (SameDiff.asFlatFile plus fromFlatFile; see the sketch at the end of this list) Link Link
Added MKL-DNN support for some operations (Conv2d, etc) Link
Upgraded ND4J (and DataVec) to Arrow 0.11.0 Link, which also fixes Link
Added Nd4j.where op method (same semantics as numpy.where) Link
Added Nd4j.stack op method (combine arrays + increase array rank by 1) Link
Libnd4j custom ops: conv op weight layouts no longer depend on the input format (NCHW/NHWC) - they are now always [kH, kW, inChannels, outChannels] for 2D CNNs and [kH, kW, kD, inChannels, outChannels] for 3D CNNs. Link, Link
embedding_lookup op now supports multiple input arrays Link
Matrix determinant op edge case (rank 0 result) shape fix Link
SameDiff TensorFlow import: fixes for multiple operations Link, Link, Link, Link
SameDiff: Improved error handling for multiple outputs case Link
Fixed issue where INDArray.permute would not correctly throw an exception for invalid length case Link
Fixed issues with INDArray.get/put with SpecifiedIndex Link, Link
Minor change to DataSet.merge - signature now accepts any DataSet subtypes Link
Fixed issue where the INDArray.transposei operation was not applied in-place Link
Fixed issues with INDArray.mmul with MMulTranspose Link
Added additional order validation for ND4J creation methods (create, rand, etc) Link
Fix for ND4J binary deserialization (BinarySerde) when deserializing from heap byte buffers Link
Fixed issue with Nd4j-common ClassPathResource path resolution in some IDEs Link
Fixed issue where INDArray.get(interval) on rank 1 array would return rank 2 array Link
Fixed a validation issue with Nd4j.gemm/mmuli on views Link Link
INDArray.assign(INDArray) no longer allows assigning different shape arrays (other than scalar/vector cases) Link
NDArrayStrings (and INDArray.toString()) now always use US locale when formatting numbers Link
Fixed an issue with GaussianDistribution specific to V100 GPUs Link
Fixed an issue with bitmap compression/encoding specific to V100 GPUs Link
Transforms.softmax now throws an error on unsupported shapes instead of simply not applying operation Link
VersionCheck functionality: handle case where SimpleFileVisitor is not available on earlier versions of Android Link
SameDiff convolution layer configurations (Conv2dConfig/Conv3dConfig/Pooling3dConfig etc) have had their parameter names aligned Link
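For the FlatBuffers serialization noted above, a minimal round-trip sketch using the asFlatFile/fromFlatFile methods named in the note; the graph itself is a trivial placeholder, and the exact exception signatures are assumptions:

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.factory.Nd4j;

import java.io.File;
import java.io.IOException;

public class SameDiffFlatBuffers {
    public static void main(String[] args) throws IOException {
        SameDiff sd = SameDiff.create();
        SDVariable in = sd.var("in", Nd4j.rand(2, 3));
        SDVariable out = sd.tanh("out", in);        // trivial placeholder graph

        File f = new File("graph.fb");
        sd.asFlatFile(f);                           // serialize to FlatBuffers format
        SameDiff restored = SameDiff.fromFlatFile(f); // load it back for inference/continued use
    }
}
```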
ND4J: API Changes (Transition Guide): 1.0.0-beta2 to 1.0.0-beta3
CUDA 8.0 support has been removed. CUDA 9.0, 9.2 and 10.0 support is available in 1.0.0-beta3
nd4j-base64 module contents have been deprecated; use the equivalent classes in nd4j-api from now on Link
Some classes in the nd4j-jackson module have been deprecated; use the equivalent classes in nd4j-api from now on Link
ND4J: Known issues: 1.0.0-beta3
Android users may need to manually exclude the (now deprecated) module nd4j-base64. This is due to org.nd4j.serde.base64.Nd4jBase64 class being present in both nd4j-api and nd4j-base64 modules. Both versions have identical content. Use exclude group: 'org.nd4j', module: 'nd4j-base64' to exclude.
ND4J/Deeplearning4j: Added support for CUDA 9.2. Dropped support for CUDA 9.1. (1.0.0-beta2 release has CUDA 8.0, 9.0 and 9.2 support)
Deeplearning4j: New SameDiff layers with training support - Link Link
Deeplearning4j resource (datasets, pretrained models) storage directory can now be configured via DL4JResources.setBaseDirectory method or org.deeplearning4j.resources.directory system property
ND4J: all indexing is now done with longs instead of ints to allow for arrays with dimensions and lengths greater than Integer.MAX_VALUE (approx. 2.1 billion)
ND4J: nd4j-native-platform will now use Intel MKL-DNN as the default/bundled BLAS implementation (replacing OpenBLAS as the previous default)
Deeplearning4j: Added Out-of-memory (OOM) crash dump reporting functionality. Provides a dump with memory use and configuration if training/inference OOMs (to assist with debugging and tuning memory configuration).
Deeplearning4j - new layers: Locally connected 1d Link, Locally connected 2d Link
Added new SameDiff layers (automatic differentiation - only single class, forward pass definition required) to DL4J with full training support - SameDiffLayer, SameDiffVertex, SameDiffOutputLayer, SameDiffLambdaLayer, SameDiffLambdaVertex - note that these are CPU-only execution for now Link Link Link
Resource (datasets, pretrained models) storage directory can now be configured via DL4JResources.setBaseDirectory method or org.deeplearning4j.resources.directory system property (see the sketch at the end of this list). Note that it is also possible to set a different base location for downloads (for local mirrors of DL4J resources) Link
Added Out-of-memory (OOM) crash dump reporting functionality. Provides a dump with memory use and configuration if training/inference OOMs. Same information is available (without a crash) for MultiLayerNetwork/ComputationGraph.memoryInfo methods. Can be disabled (or output directory set) using system properties - Link
Added Composite[Multi]DataSetPreProcessor to enable multiple [Multi]DataSetPreProcessors to be applied in a single iterator Link
Added ComputationGraph evaluate methods for multi-output networks: evaluate(DataSetIterator, Map<Integer,IEvaluation>) and evaluate(MultiDataSetIterator, Map<Integer,IEvaluation>) Link
Added JointMultiDataSetIterator - utility iterator used to create MultiDataSetIterator from multiple DataSetIterators Link
GraphVertices may now have trainable parameters directly (not just enclose layers with trainable parameters) Link
Added MultiLayerNetwork/ComputationGraph getLearningRate methods Link
Added RandomDataSetIterator and RandomMultiDataSetIterator (mainly for testing/debugging) Link Link
Added cyclical “1cycle” schedule for learning rate schedules etc - Link
RDD repartitioning for Spark training is more configurable (adds Repartitioner interface) Link
Added ComputationGraph.getIterationCount() and .getEpochCount() for consistency with MultiLayerNetwork Link
Spark evaluation: added evaluation method overloads that allow specifying the number of evaluation workers (less than number of Spark threads) Link
CnnSentenceDataSetIterator now has a Format argument, and supports outputting data for RNNs and 1D CNNs Link
Added ComputationGraph/MultiLayerNetwork.pretrain((Multi)DataSetIterator, int epochs) method overloads Link
MultiLayerNetwork and ComputationGraph now have output method overloads where the network output can be placed in the user-specified workspace, instead of being detached Link Link. This can be used to avoid creating INDArrays that need to be garbage collected before native memory can be freed.
EmbeddingSequenceLayer now supports [minibatch,1,seqLength] format sequence data in addition to [minibatch,seqLength] format data Link
CuDNN batch norm implementation will now be used for rank 2 input, not just rank 4 input Link
Environment variables and system properties for DL4J have been centralized into DL4JResources and DL4JEnvironmentVars classes, with proper descriptions Link Link
MultiLayerNetwork and ComputationGraph output/feedForward/fit methods are now thread-safe via synchronization. Note that concurrent use is not recommended due to performance (instead: use ParallelInference); however the now-synchronized methods should avoid obscure errors due to concurrent modifications Link
BarnesHutTSNE now throws a useful exception in the case where the distance metric is undefined (for example, all zeros plus cosine similarity) Link
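For the resource directory configuration above, a minimal sketch; the DL4JResources class name and the system property come from the note, while the import package and the path are assumptions:

```java
import org.deeplearning4j.common.resources.DL4JResources;
import java.io.File;

public class ResourceDirConfig {
    public static void main(String[] args) {
        // Option 1: set programmatically, before any datasets/pretrained models are downloaded
        DL4JResources.setBaseDirectory(new File("/data/dl4j-resources"));

        // Option 2: via system property (equivalently -Dorg.deeplearning4j.resources.directory=...)
        System.setProperty("org.deeplearning4j.resources.directory", "/data/dl4j-resources");
    }
}
```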
Deeplearning4J: Bug Fixes and Optimizations
ComputationGraph.addListeners was not working correctly if listeners were already present Link, Link
TinyImageNetDataSetIterator did not validate/correctly use input shape configuration Link, Link
BatchNormalization layer now correctly asserts that nOut is set if required (instead of unfriendly shape errors later) Link
Fixed issue where OutputLayer may not initialize parameter constraints correctly Link
Fixed performance issue with Nesterov updater using CPU-only op for CUDA execution Link
Removed TerminationCondition for DL4J optimizers - was not used in practice, and had minor overhead Link
Fixed issue where EvaluativeListener could hit a workspace validation exception when workspaces are enabled Link
Fixed issue where TrainingListener.onEpochStart/onEpochEnd were not being called correctly for ComputationGraph Link
Fixed workspace issue with TensorFlowCnnToFeedForwardPreProcessor Link
Performance optimization for BatchNormalization when using CuDNN Link
Performance optimization: Dropout will be applied in-place when safe to do so, avoiding a copy Link
Reduced memory use for CuDNN: CuDNN working memory is now shared and reused between layers within a network Link
CuDNN batch normalization implementation would fail with FP16 datatype Link
Fixed issue where Bidirectional LSTM could incorrectly use workspaces, causing an exception Link
Fixed issue with early stopping where scores to be maximized (accuracy, f1, etc) were not properly triggering termination conditions Link
Fixed issue where label mask counter could be incorrectly incremented in ComputationGraph.computeGradientAndScore() Link
ComputationGraph was not setting lastEtlTime field during training Link
Fixed issue with AutoEncoder layer when workspaces are enabled Link
Fixed issue with EmbeddingSequenceLayer use of mask arrays Link
Lombok is now provided scope everywhere, so it isn't on the user classpath when using DL4J Link
Fixed issue with WordVectorSerializer.readParagraphVectors(File) initialization of the label source Link
Spark training (gradient sharing) now properly handles empty partition edge case when encountered during training Link
Errors are propagated better/more consistently for Spark gradient sharing training Link
Fixed issue with 1D CNN layers with mask arrays and stride > 1 (masks not being correctly downsized) Link
DL4J Batch norm implementation was not correctly adding epsilon value during inference, only during training (CuDNN unaffected) Link
CuDNN subsampling layers with max pooling and ConvolutionMode.SAME may have taken padding value (0) as the maximum for border values when all non-padding values are less than 0 Link
Spark training with gradient sharing now passes listeners to workers correctly Link
Fixed rare (and non-terminal) concurrent modification issue with UI and FileStatsStorage Link
CuDNN convolution layer now supports dilation > 2 (previously: used DL4J conv layer implementation as a fallback) Link
Yolo2OutputLayer now implements computeScoreForExamples() Link
SequenceRecordReaderDataSetIterator now handles the “no labels” case correctly Link
Fixed issue where BarnesHutTSNE could hit a workspace validation exception Link
EMNIST iterator could produce incorrect data in some cases after a reset Link
Deeplearning4J: API Changes (Transition Guide): 1.0.0-beta to 1.0.0-beta2
GravesLSTM has been deprecated in favor of LSTM due to lack of CuDNN support, but otherwise similar accuracy in practice. Use the LSTM class instead (see the sketch after this list).
deeplearning4j-modelexport-solr: now uses Lucene/Solr version 7.4.0 (was 7.3.0) Link
Mask arrays for CNN2d layers must be in broadcastable 4d format: [minibatch,depth or 1, height or 1, width or 1] - previously they were 2d with shape [minibatch,height] or [minibatch,width]. This prevents ambiguity in later cases (pooling layers), and allows for more complex masking scenarios (such as masking for different image sizes in the same minibatch). Link
Some older/deprecated Model and Layer methods have been removed. (validateInput(), initParams()). Some custom layers may need to be updated as a result Link
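For the GravesLSTM deprecation above, a minimal before/after sketch of the layer swap; the sizes and activation are placeholders:

```java
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.nd4j.linalg.activations.Activation;

public class LstmSwap {
    public static LSTM build() {
        // Before (deprecated, no CuDNN support): new GravesLSTM.Builder().nIn(128).nOut(256).build()
        return new LSTM.Builder()
                .nIn(128)
                .nOut(256)
                .activation(Activation.TANH)
                .build();
    }
}
```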
Deeplearning4J: 1.0.0-beta2 Known Issues
Windows users are unable to load the HDF5 files used in SvhnLabelProvider (used in HouseNumberDetection example). Linux/Mac users are unaffected. A workaround for Windows users is to add the Sonatype snapshot dependency org.bytedeco.javacpp-presets:hdf5-platform:jar:1.10.2-1.4.3-SNAPSHOT Link
ND4J: all indexing is now done with longs instead of ints to allow for arrays with dimensions and lengths greater than Integer.MAX_VALUE (approx. 2.1 billion)
Added the ability to write Numpy .npy format using Nd4j.writeAsNumpy(INDArray,File) and convert an INDArray to a numpy array in-memory using Nd4j.convertToNumpy(INDArray) (see the sketch at the end of this list) Link
ND4j-common ClassPathResource: added ClassPathResource.copyDirectory(File) Link
SameDiff: A significant number of new ops, and backprop implementations for existing ops
Added Nd4j.randomBernoulli/Binomial/Exponential convenience methods Link
Added way to disable/suppress ND4J initialization logging via org.nd4j.log.initialization system property Link
SameDiff class - most op/constructor methods now have complete/useful javadoc Link
Workspaces can now be disabled globally, ignoring workspace configuration. This is mainly used for debugging; use Nd4j.getWorkspaceManager().setDebugMode(DebugMode.DISABLED) or Nd4j.getWorkspaceManager().setDebugMode(DebugMode.SPILL_EVERYTHING) to enable this. Link Link
Added EnvironmentalAction API for environment variable processing Link
ND4J environment variables and system properties have been centralized in ND4jEnvironmentVars and ND4jSystemProperties classes Link and Link
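For the Numpy export above, a minimal sketch using the methods named in the note; the assumption that writeAsNumpy throws IOException is mine:

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

import java.io.File;
import java.io.IOException;

public class NumpyExport {
    public static void main(String[] args) throws IOException {
        INDArray arr = Nd4j.rand(3, 4);
        Nd4j.writeAsNumpy(arr, new File("arr.npy"));  // readable in Python via numpy.load("arr.npy")
        // For an in-memory conversion, Nd4j.convertToNumpy(arr) is also available (see note above)
    }
}
```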
ND4J: Bug Fixes and Optimizations
SameDiff: a significant number of bug fixes for execution and individual ops
Fixed issue with INDArray.toDoubleArray() for true scalars (rank 0 arrays) Link
Fixed issue with DataSet.sample() not working for rank 3+ features Link
IActivation implementations now validate/enforce same shape for activations and gradients Link
Fixed issue with muliColumnVector where vector is 1d Link
ImagePreProcessingScaler now supports serialization via NormalizerSerializerStrategy and ModelSerializer Link
Performance optimization for threshold encoding used in DL4J’s Spark gradient sharing distributed training implementation Link
SameDiff: Fixed issue where memory wasn’t always released after execution Link
DataSet.save() and MultiDataSet.save() methods now save example metadata when present Link
Fixed issue with KFoldIterator when dataset does not divide equally into folds with no remainder Link
Fixed issue where version check functionality could fail to load resources if resources are on a path with spaces Link
ND4J: Known Issues
ND4J: API Changes (Transition Guide): 1.0.0-beta to 1.0.0-beta2
CUDA 9.1 support has been removed. CUDA 8.0, 9.0 and 9.2 support is available
Due to long indexing changes, long/long[] should be used in place of int/int[] in some places (such as INDArray.size(int), INDArray.shape()); see the sketch after this list
Simplified DataSetIterator API: totalExamples(), cursor() and numExamples() methods have been removed - these were unsupported on most DataSetIterator implementations, and not used in practice for training. Custom iterators should remove these methods also Link
Long-deprecated DataSet.getFeatureMatrix() has been removed. Use DataSet.getFeatures() instead. Link
Unused and not properly tested/maintained utility class BigDecimalMath has been removed. Users should find an alternative library for this functionality, if required.
Not properly maintained complex number support classes (IComplexNumber, IComplexNDArray) have been removed entirely Link
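For the long-indexing change above, a short sketch of the affected call sites:

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

public class LongIndexing {
    public static void main(String[] args) {
        INDArray arr = Nd4j.create(3, 5);
        long rows = arr.size(0);       // size(int) now returns long
        long[] shape = arr.shape();    // shape() now returns long[]
        long length = arr.length();    // lengths may exceed Integer.MAX_VALUE
        System.out.println(rows + " x " + shape[1] + " = " + length);
    }
}
```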
Added SparkComputationGraph.feedForwardWithKey overload with feature mask support Link
Added MultiLayerNetwork.calculateGradients method (for easily getting parameter and input gradients, for example for some model interpretability approaches) Link Link
Added support to get input/activation types for each layer from configuration: ComputationGraphConfiguration.getLayerActivationTypes(InputType...), ComputationGraphConfiguration.GraphBuilder.getLayerActivationTypes(), NeuralNetConfiguration.ListBuilder.getLayerActivationTypes(), MultiLayerConfiguration.getLayerActivationTypes(InputType) methods Link
Evaluation.stats() now prints confusion matrix in easier to read matrix format, rather than list format Link
Added ModelSerializer.addObjectToFile, .getObjectFromFile and .listObjectsInFile for storing arbitrary Java objects in the same file as a saved network (see the sketch after this list) Link
Added SpatialDropout support (with Keras import support) Link
Added MultiLayerNetwork/ComputationGraph.fit((Multi)DataSetIterator, int numEpochs) overloads Link
Added performance (hardware) listeners: SystemInfoPrintListener and SystemInfoFilePrintListener Link
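For the ModelSerializer additions above, a sketch assuming (File, key, value)-style signatures; only the method names come from the note, and the stored object is an arbitrary example:

```java
import org.deeplearning4j.util.ModelSerializer;

import java.io.File;
import java.io.IOException;
import java.util.List;

public class ExtraObjectsInModelFile {
    public static void main(String[] args) throws IOException {
        File modelFile = new File("my-network.zip");   // an already-saved MultiLayerNetwork/ComputationGraph

        // Store an arbitrary serializable object alongside the network
        ModelSerializer.addObjectToFile(modelFile, "normalizerStats", new double[]{0.5, 0.25});

        // List and retrieve stored objects later
        List<String> keys = ModelSerializer.listObjectsInFile(modelFile);
        Object stats = ModelSerializer.getObjectFromFile(modelFile, "normalizerStats");
        System.out.println(keys + " -> " + stats);
    }
}
```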
Deeplearning4J: Bug Fixes and Optimizations
Performance and memory optimizations via optimizations of internal use of workspaces Link
Reflections library has been entirely removed from DL4J and is no longer required for custom layer serialization/deserialization Link, Link
Fixed issues with custom and some Keras import layers on Android
RecordReaderMultiDataSetIterator will no longer try to convert unused columns to numerical values Link
Added new model zoo models:
Fixes for Android compilation (removed duplicate classes, aligned versions, removed some dependencies) Link Link Link
Fix for RecordReaderMultiDataSetIterator where output could be incorrect for some constructors Link
Non-frozen layers before a frozen layer will no longer be skipped during backprop (useful for GANs and similar architectures) Link Link
Fixed issue where ComputationGraph topological sort may not be consistent on all platforms; could sometimes break ComputationGraphs (with multiple valid topological orderings) trained on PC and deployed on Android Link
Fixed issue with CuDNN batch norm using 1-decay instead of decay Link
deeplearning4j-cuda no longer throws exceptions if present on classpath with nd4j-native backend set to higher priority Link
WordVectorSerializer now deletes temp files immediately once done Link
Deeplearning4J: API Changes (Transition Guide): 1.0.0-alpha to 1.0.0-beta
WorkspaceMode.SINGLE and SEPARATE have been deprecated; use WorkspaceMode.ENABLED instead
Internal layer API changes: custom layers will need to be updated to the new Layer API - see built-in layers or custom layer example
Custom layers etc in pre-1.0.0-beta JSON (ModelSerializer) format need to be registered before they can be deserialized due to JSON format change. Built-in layers and models saved in 1.0.0-beta or later do not require this. Use NeuralNetConfiguration.registerLegacyCustomClassesForJSON(Class) for this purpose (see the sketch after this list)
IterationListener has been deprecated in favor of TrainingListener. For existing custom listeners, switch from implements TrainingListener to extends BaseTrainingListener Link
ExistingDataSetIterator has been deprecated; use fit(DataSetIterator, int numEpochs) method instead
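For the legacy custom-class registration above, a sketch assuming the varargs Class<?> signature; which classes to register depends on the user's own custom layers/vertices, so they are passed in as parameters here:

```java
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.util.ModelSerializer;

import java.io.File;
import java.io.IOException;

public class LoadLegacyCustomLayerModel {
    public static MultiLayerNetwork load(File modelFile, Class<?>... customClasses) throws IOException {
        // Register custom layer/vertex classes saved in pre-1.0.0-beta JSON format before deserializing
        NeuralNetConfiguration.registerLegacyCustomClassesForJSON(customClasses);
        return ModelSerializer.restoreMultiLayerNetwork(modelFile, true);
    }
}
```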
Deeplearning4J: 1.0.0-beta Known Issues
ComputationGraph TrainingListener onEpochStart and onEpochEnd methods are not being called correctly
DL4J Zoo Model FaceNetNN4Small2 model configuration is incorrect, causing issues during forward pass
Early stopping score calculators with values that should be maximized (accuracy, f1 etc) are not working properly (values are minimized not maximized). Workaround: override ScoreCalculator.calculateScore(...) and return 1.0 - super.calculateScore(...).
Adds support for ‘no bias’ layers via hasBias(boolean) config (DenseLayer, EmbeddingLayer, OutputLayer, RnnOutputLayer, CenterLossOutputLayer, ConvolutionLayer, Convolution1DLayer). EmbeddingLayer now defaults to no bias (Link)
Adds support for dilated convolutions (aka ‘atrous’ convolutions) - ConvolutionLayer, SubsamplingLayer, and 1D versions there-of. (Link)
Added support for both iteration-based and epoch-based schedules via ISchedule. Also added support for custom (user defined) schedules
Learning rate schedules are configured on the updaters, via the .updater(IUpdater) method (see the sketch after this list)
Added dropout API (IDropout - previously dropout was available but not a class); added Dropout, AlphaDropout (for use with self-normalizing NNs), GaussianDropout (multiplicative), GaussianNoise (additive). Added support for custom dropout types (Link)
Added support for dropout schedules via ISchedule interface (Link)
Added weight/parameter noise API (IWeightNoise interface); added DropConnect and WeightNoise (additive/multiplicative Gaussian noise) implementations (Link); dropconnect and dropout can now be used simultaneously
Adds layer configuration alias .units(int) equivalent to .nOut(int) (Link)
Adds ComputationGraphConfiguration GraphBuilder .layer(String, Layer, String...) alias for .addLayer(String, Layer, String...)
Layer index no longer required for MultiLayerConfiguration ListBuilder (i.e., .list().layer(<layer>) can now be used for configs) (Link)
Added MultiLayerNetwork.summary(InputType) and ComputationGraph.summary(InputType...) methods (shows layer and activation size information) (Link)
MultiLayerNetwork, ComputationGraph and layerwise trainable layers now track the number of epochs (Link)
Added deeplearning4j-ui-standalone module: uber-jar for easy launching of UI server (usage: java -jar deeplearning4j-ui-standalone-1.0.0-alpha.jar -p 9124 -r true -f c:/UIStorage.bin)
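For the schedule and dropout APIs above, a minimal configuration sketch; the StepSchedule argument order (start at 0.01, halve every 10 epochs) and the specific values are my assumptions, not part of the note:

```java
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.dropout.Dropout;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.schedule.ScheduleType;
import org.nd4j.linalg.schedule.StepSchedule;

public class ScheduleAndDropoutConfig {
    public static NeuralNetConfiguration.Builder base() {
        return new NeuralNetConfiguration.Builder()
                // Epoch-based learning rate schedule, attached to the updater rather than the config
                .updater(new Adam(new StepSchedule(ScheduleType.EPOCH, 0.01, 0.5, 10)))
                // New dropout API: an IDropout instance instead of a bare double
                .dropOut(new Dropout(0.5));
    }
}
```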
Deeplearning4J: API Changes (Transition Guide): 0.9.1 to 1.0.0-alpha
Default training workspace mode has been switched to SEPARATE from NONE for MultiLayerNetwork and ComputationGraph (Link)
Behaviour change: fit(DataSetIterator) and similar methods no longer perform layerwise pretraining followed by backprop - only backprop is performed in these methods. For pretraining, use pretrain(DataSetIterator) and pretrain(MultiDataSetIterator) methods (Link)
Previously deprecated updater configuration methods (.learningRate(double), .momentum(double) etc) all removed
To configure learning rate: use .updater(new Adam(lr)) instead of .updater(Updater.ADAM).learningRate(lr) (see the sketch after this list)
To configure bias learning rate: use .biasUpdater(IUpdater) method
To configure learning rate schedules: use .updater(new Adam(ISchedule)) and similar
Updater configuration via enumeration (i.e., .updater(Updater)) has been deprecated; use .updater(IUpdater)
.regularization(boolean) config removed; functionality is now always equivalent to .regularization(true)
.useDropConnect(boolean) removed; use .weightNoise(new DropConnect(double)) instead
.iterations(int) method has been removed (was rarely used and confusing to users)
Multiple utility classes (in org.deeplearning4j.util) have been deprecated and/or moved to nd4j-common. Use the same class names in nd4j-common (org.nd4j.util package) instead.
DataSetIterators in DL4J have been moved from deeplearning4j-nn module to new deeplearning4j-datasets, deeplearning4j-datavec-iterators and deeplearning4j-utility-iterators modules. Packages/imports are unchanged; deeplearning4j-core pulls these in as transitive dependencies hence no user changes should be required in most cases (Link)
Previously deprecated .activation(String) has been removed; use .activation(Activation) or .activation(IActivation) instead
Layer API change: Custom layers may need to implement applyConstraints(int iteration, int epoch) method
Parameter initializer API change: Custom parameter initializers may need to implement isWeightParam(String) and isBiasParam(String) methods
RBM (Restricted Boltzmann Machine) layers have been removed entirely. Consider using VariationalAutoencoder layers as a replacement (Link)
GravesBidirectionalLSTM has been deprecated; use new Bidirectional(Bidirectional.Mode.ADD, new GravesLSTM.Builder()....build()) instead
Previously deprecated WordVectorSerializer methods have now been removed (Link)
Removed deeplearning4j-ui-remote-iterationlisteners module and obsolete RemoteConvolutionalIterationListener (Link)
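For the updater configuration changes above, a minimal old-vs-new sketch; the learning rate values are placeholders:

```java
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.learning.config.Sgd;

public class UpdaterConfigTransition {
    public static NeuralNetConfiguration.Builder base() {
        // Old (0.9.1): .updater(Updater.ADAM).learningRate(1e-3)
        return new NeuralNetConfiguration.Builder()
                .updater(new Adam(1e-3))       // learning rate is now passed directly to the IUpdater
                .biasUpdater(new Sgd(1e-2));   // separate updater (and learning rate) for bias parameters
    }
}
```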
Deeplearning4J: 1.0.0-alpha Known Issues
Performance on some network types may be reduced on CUDA compared to 0.9.1 (with workspaces configured). This will be addressed in the next release
Some issues have been noted with FP16 support on CUDA (Link)
Keras 2 support, keeping backward compatibility for Keras 1
Keras 2 and 1 imports use the exact same API; the Keras version is inferred by DL4J
Keras unit test coverage increased by 10x, many more real-world integration tests
Unit tests for importing and checking layer weights
Leaky ReLU, ELU, SELU support for model import
All Keras layers can be imported with optional bias terms
Old deeplearning4j-keras module removed, old “Model” API removed
All Keras initializations (Lecun normal, Lecun uniform, ones, zeros, Orthogonal, VarianceScaling, Constant) supported
1D convolution and pooling supported in DL4J and Keras model import
Atrous Convolution 1D and 2D layers supported in Keras model import
1D Zero padding layers supported
Keras constraints module fully supported in DL4J and model import
Upsampling 1D and 2D layers in DL4J and Keras model import (including GAN examples in tests)
Most merge modes supported in Keras model import, Keras 2 Merge layer API supported
Separable Convolution 2D layer supported in DL4J and Keras model import
Deconvolution 2D layer supported in DL4J and Keras model import
Full support of Keras noise layers on import (Alpha dropout, Gaussian dropout and noise)
Support for SimpleRNN layer in Keras model import
Support for Bidirectional layer wrapper in Keras model import
Addition of LastTimestepVertex in DL4J to support return_sequences=False for Keras RNN layers.
DL4J support for recurrent weight initializations and Keras import integration.
SpaceToBatch and BatchToSpace layers in DL4J for better YOLO support, plus end-to-end YOLO Keras import test.
Cropping2D support in DL4J and Keras model import
Deeplearning4J: Keras Import - API Changes (Transition Guide): 0.9.1 to 1.0.0-alpha
The Model and ModelConfiguration classes, deprecated in 0.9.1, have been permanently removed. Use KerasModelImport instead, which is now the only entry point for Keras model import.
Deeplearning4J: Keras Import - Known Issues
Embedding layer: In DL4J the output of an embedding layer is 2D by default, unless preprocessors are specified. In Keras the output is always 3D, but depending on specified parameters can be interpreted as 2D. This often leads to difficulties when importing Embedding layers. Many cases have been covered and issues fixed, but inconsistencies remain.
Batchnormalization layer: DL4J’s batch normalization layer is much more restrictive (in a good way) than Keras’ version of it. For instance, DL4J only allows normalizing spatial dimensions of 4D convolutional inputs, while in Keras any axis can be used for normalization. Depending on the dimension ordering (NCHW vs. NHWC) and the specific configuration used by a Keras user, this can lead to expected (!) and unexpected import errors.
Support for importing a Keras model for training purposes in DL4J (enforceTrainingConfig == true) is still very limited and will be tackled properly for the next release.
Keras Merge layers: seem to work fine with the Keras functional API, but have issues when used in a Sequential model.
Reshape layers: can be somewhat unreliable on import. DL4J rarely has a need to explicitly reshape input beyond (inferred) standard input preprocessors. In Keras, Reshape layers are used quite often. Mapping the two paradigms can be difficult in edge cases.
Control flow is supported with IF and WHILE primitives.
Alpha release of SameDiff auto-differentiation engine for ND4J.
Two execution modes available: Java-driven execution, and Native execution for serialized graphs.
SameDiff graphs can be serialized using FlatBuffers
Building and running computation graphs built from SameDiff operations.
Graphs can run forward pass on input data and compute gradients for the backward pass.
Already supports many high-level layers, like dense layers, convolutions (1D-3D), deconvolutions, separable convolutions, pooling and upsampling, batch normalization, local response normalization, LSTMs and GRUs.
In total there are about 350 SameDiff operations available, including many basic operations used in building complex graphs.
Supports rudimentary import of TensorFlow and ONNX graphs for inference.
TFOpTests is a dedicated project for creating test resources for TensorFlow import.
Known Issues and Limitations
Vast majority of new operations added in 1.0.0-alpha do NOT use GPU yet.
While many of the widely used base operations and high-level layers used in practice are supported, op coverage is still limited. Goal is to achieve feature parity with TensorFlow and fully support import for TF graphs.
Some of the existing ops do not have a backward pass implemented (called doDiff in SameDiff).
As per DL4J API changes: Updater configuration options (learning rate, momentum, epsilon, rho etc) have been moved to ParameterSpace instead. Updater spaces (AdamSpace, AdaGradSpace etc) introduced ([Link](https://github.com/deeplearning4j/Arbiter/pull/103))
As per DL4J API changes: Dropout configuration is now via ParameterSpace<IDropout>, DropoutSpace introduced (Link)
Numerical stability improvements to LossMCXENT / LossNegativeLogLikelihood with softmax (should reduce NaNs with very large activations)
Added runtime version checking for ND4J, DL4J, RL4J, Arbiter, DataVec Link
Deeplearning4j: Use of Evaluation class no-arg constructor (i.e., new Evaluation()) can result in accuracy/stats being reported as 0.0. Other Evaluation class constructors, and ComputationGraph/MultiLayerNetwork.evaluate(DataSetIterator) methods work as expected.
This also impacts Spark (distributed) evaluation: workaround is to replace sparkNet.evaluate(testData); with sparkNet.doEvaluation(testData, 64, new Evaluation(10));, where 10 is the number of classes and 64 is the evaluation minibatch size to use.
SequenceRecordReaderDataSetIterator applies preprocessors (such as normalization) twice to each DataSet (possible workaround: use RecordReaderMultiDataSetIterator + MultiDataSetWrapperIterator)
TransferLearning: ComputationGraph may incorrectly apply l1/l2 regularization (defined in FinetuneConfiguration) to frozen layers. Workaround: set 0.0 l1/l2 on FineTuneConfiguration, and required l1/l2 on new/non-frozen layers directly. Note that MultiLayerNetwork with TransferLearning appears to be unaffected.
UI now uses Play framework, integrates with DL4J UI (replaces Dropwizard backend). Dependency issues/clashing versions fixed.
Supports DL4J StatsStorage and StatsStorageRouter mechanisms (FileStatsStorage, Remote UI via RemoteUIStatsStorageRouter)
General UI improvements (additional information, formatting fixes)
0.8.0 -> 0.9.0 Transition Notes
Updater configuration methods such as .momentum(double) and .epsilon(double) have been deprecated. Instead: use .updater(new Nesterovs(0.9)) and .updater(Adam.builder().beta1(0.9).beta2(0.999).build()) etc to configure
CsvRecordReader constructors: now uses characters for delimiters, instead of Strings (i.e., ‘,’ instead of “,”)
Arbiter UI is now a separate module, with Scala version suffixes: arbiter-ui_2.10 and arbiter-ui_2.11
Improved error messages on invalid configuration or data; improved validation on both
Added metadata functionality: track source of data (file, line number, etc) from data import to evaluation. Loading a subset of examples/data from this metadata is now supported. Link
Removed Jackson as core dependency (shaded); users can now use any version of Jackson without issue
Added LossLayer: version of OutputLayer that only applies loss function (unlike OutputLayer: it has no weights/biases)
Functionality required to build triplet embedding model (L2 vertex, LossLayer, Stack/Unstack vertices etc)
Reduced DL4J and ND4J ‘cold start’ initialization/start-up time
Pretrain default changed to false and backprop default changed to true. No longer needed to set these when setting up a network configuration unless defaults need to be changed.
Added TrainingListener interface (extends IterationListener). Provides access to more information/state as network training occurs Link
Numerous bug fixes across DL4J and ND4J
Performance improvements for nd4j-native & nd4j-cuda backends
Standalone Word2Vec/ParagraphVectors overhaul:
ParaVec inference available for both PV-DM & PV-DBOW
Parallel tokenization support was added, to address computation-heavy tokenizers.
Native RNG introduced for better reproducibility within multi-threaded execution environments.
Additional RNG calls added: Nd4j.choice(), and BernoulliDistribution op.
Off-GPU storage introduced to keep large objects, such as Word2Vec models, in host memory. Available via WordVectorSerializer.loadStaticModel()
Two new options for performance tuning on nd4j-native backend: setTADThreshold(int) & setElementThreshold(int)
0.6.0 -> 0.7.0 Transition Notes
Notable changes for upgrading codebases based on 0.6.0 to 0.7.0:
UI: new UI package name is deeplearning4j-ui_2.10 or deeplearning4j-ui_2.11 (previously: deeplearning4j-ui). Scala version suffix is necessary due to Play framework (written in Scala) being used now.
Histogram and Flow iteration listeners deprecated. They are still functional, but using new UI is recommended Link
DataVec ImageRecordReader: labels are now sorted alphabetically by default before assigning an integer class index to each - previously (0.6.0 and earlier) they were ordered according to file iteration order. Use .setLabels(List) to manually specify the order if required.
CNNs: configuration validation is now less strict. With new ConvolutionMode option, 0.6.0 was equivalent to ‘Strict’ mode, but new default is ‘Truncate’
See ConvolutionMode javadoc for more details: Link
Xavier weight initialization change for CNNs and LSTMs: Xavier now aligns better with the original Glorot paper and other libraries. The Xavier weight initialization equivalent to 0.6.0 is available as XAVIER_LEGACY
DataVec: Custom RecordReader and SequenceRecordReader classes require additional methods, for the new metadata functionality. Refer to existing record reader implementations for how to implement these methods.
A few new builder methods:
Behaviour change: batchSize: batch size is now ALSO used as a threshold for the number of computational batches executed for sg/cbow