Machine Learning Server for Inference in Production

Deeplearning4j serves machine-learning models for inference in production using the free community edition of SKIL, the Skymind Intelligence Layer CE.

A model server serves the parametric machine-learning models that make decisions about data. It is used for the inference stage of a machine-learning workflow, after data pipelines and model training. A model server is the tool that allows data science research to be deployed in a real-world production environment.

What a Web server is to the Internet, a model server is to AI. Where a Web server receives an HTTP request and returns data about a Web site, a model server receives data and returns a decision or prediction about that data. Sent an image, for example, a model server might return a label for that image, identifying faces or animals in photographs. In this analogy, SKIL is like the Apache Web server, and a machine-learning model is like a PHP file. The model is essentially a matrix of learned weights. Just as you put a PHP file on a Web server to make it available on the Internet, you put a machine-learning model on a model server so that it can be accessed from other locations.
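
The request-and-response pattern described above can be sketched with the Python standard library alone. This is a minimal illustration, not SKIL's API: the `/predict` path, the JSON payload shape, the two labels, and the weight values are all hypothetical, chosen only to show a server that holds a weight matrix and answers HTTP requests with a prediction.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "model": one row of weights per label, plus one bias per label.
LABELS = ["cat", "dog"]
WEIGHTS = [[0.9, -0.4], [-0.3, 0.8]]
BIAS = [0.0, 0.1]


def predict(features):
    """Score each label as a weighted sum and return the highest-scoring one."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(WEIGHTS, BIAS)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return {"label": LABELS[best], "scores": scores}


class ModelHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, run the model, write a JSON response.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(predict(payload["features"])).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # suppress per-request logging to keep the example quiet


def start_server(port=0):
    """Serve on a background thread; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), ModelHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query(server, features):
    """Client side: send features to the server and return its prediction."""
    conn = HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        data = json.dumps({"features": features}).encode("utf-8")
        conn.request("POST", "/predict", body=data,
                     headers={"Content-Type": "application/json"})
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

Calling `start_server()` and then `query(server, [1.0, 0.0])` sends two feature values over HTTP and gets back a label with its scores. A production model server adds what this sketch leaves out: authentication, batching, monitoring, and scaling across machines.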

The SKIL machine-learning model server can import models from Python frameworks such as TensorFlow, Keras, Theano and CNTK, overcoming a major barrier to deploying machine-learning models in production environments.

Production-grade model servers have a few important features. They should be:

  • Secure. They may process sensitive data.
  • Scalable. Data traffic may surge, and predictions should still be made with low latency.
  • Stable and debuggable. SKIL is based on the enterprise-hardened JVM.
  • Certified. Deeplearning4j works with the CDH (Cloudera) and HDP (Hortonworks) Hadoop distributions.

Skymind Intelligence Layer (SKIL)

SKIL meets all of those criteria. Visit SKIL’s Machine Learning Model Server Quickstart to test it out. Briefly, SKIL is a:

Machine Learning Solution Platform

  • ETL: Build data pipelines with Pandas and DataVec; persistent and reusable ETL
  • Training: Spark coordinates work over multiple GPUs and CPUs; recurrent updates of models
  • Inference: One-click AI model deployment; robust, fault-tolerant and load-balanced, with automatic elastic scaling; serves any model specified in PMML

Tool Aggregator

  • Python: TensorFlow, Keras, scikit-learn, PyTorch, NumPy, Pandas
  • Java/Scala: Deeplearning4j, ND4J, DataVec, SMILE

Resource Portal

  • Solves infrastructure problems for data scientists automatically
  • Multi-cloud and Hybrid
  • Point and shoot: Data scientists use whichever servers are available
  • On-prem: Integrations with Hadoop, Spark, Kafka, Elasticsearch, Cassandra
  • Public Cloud: AWS, Azure, Google Cloud

Machine Learning Model Server

  • Model management & monitoring
  • Performance tracking: champion and challenger ranking
  • Collaborative workspace: clone experiments; track progress
  • Auditing: which data and users touched a model?
  • High uptime (backed by an SLA)
  • Rollbacks
  • A/B Testing (2018)
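
Three of the features listed above (champion and challenger ranking, rollbacks, and A/B testing) can be illustrated with a small in-memory sketch. This is not how SKIL implements them; the `ModelRegistry` class, the `assign_variant` function, and the use of accuracy as the ranking metric are assumptions made purely for illustration.

```python
import hashlib


class ModelRegistry:
    """Toy in-memory registry: deployment history, rollback, and ranking."""

    def __init__(self):
        self._history = []  # deployed version names, newest last
        self._scores = {}   # version name -> list of observed accuracy values

    def deploy(self, version):
        self._history.append(version)
        self._scores.setdefault(version, [])

    def current(self):
        """Return the version now serving, or None if nothing is deployed."""
        return self._history[-1] if self._history else None

    def rollback(self):
        """Drop the newest deployment and return the version now serving."""
        if len(self._history) < 2:
            raise RuntimeError("no earlier version to roll back to")
        self._history.pop()
        return self.current()

    def record(self, version, accuracy):
        self._scores[version].append(accuracy)

    def ranking(self):
        """Versions with at least one score, best mean accuracy first."""
        means = {v: sum(s) / len(s) for v, s in self._scores.items() if s}
        return sorted(means, key=means.get, reverse=True)


def assign_variant(user_id, challenger_percent=10):
    """Route a stable percentage of users to the challenger model.

    Hashing the user ID keeps the assignment deterministic, so the same
    user always sees the same variant for a given percentage.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # integer in 0..99
    return "challenger" if bucket < challenger_percent else "champion"
```

In use, each new model version is passed to `deploy`, evaluation results are fed to `record`, and `ranking()` shows which version currently leads. If the newest version underperforms, `rollback()` restores the previous one. A real model server would persist this state and record which users and data touched each version.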

SKIL is enterprise-tested. Skymind’s clients include the US Department of Homeland Security, Softbank, France Telecom and Ericsson, among others.

Chat with us on Gitter