Machine Learning Docker Template

Summary

The purpose of this post is to propose a template for machine learning projects that strives to follow these principles:

  1. All data scientists can quickly set up an identical development environment based on Docker that encourages good software engineering practices.
  2. Dependency management is handled during the environment’s startup by Miniconda and requires minimal manual changes.
  3. Notebooks are encouraged for exploration. However, for production purposes, notebooks must be version controlled, parametrized, and run using Papermill.
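
To make principle 3 more concrete, here is a minimal sketch of a parametrized notebook run driven from Python. It only assumes Papermill's command-line form `papermill <input> <output> -p <name> <value>`; the helper names, notebook paths and parameter names are hypothetical and not taken from the template.

```python
import subprocess
from typing import Dict, List


def papermill_command(input_nb: str, output_nb: str, params: Dict[str, object]) -> List[str]:
    """Build the argument list for a single Papermill run.

    Each parameter becomes a `-p name value` pair on the command line.
    """
    cmd = ["papermill", input_nb, output_nb]
    for name, value in params.items():
        cmd += ["-p", name, str(value)]
    return cmd


def run_notebook(input_nb: str, output_nb: str, params: Dict[str, object]) -> None:
    """Execute the notebook; requires the `papermill` CLI to be on PATH."""
    subprocess.run(papermill_command(input_nb, output_nb, params), check=True)


if __name__ == "__main__":
    # Hypothetical notebook and parameters, for illustration only.
    print(papermill_command("train.ipynb", "runs/train_out.ipynb", {"alpha": 0.6, "epochs": 10}))
```

Keeping the command construction separate from the execution step makes the parameters easy to log or review before a production run.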

Code

The template is available on GitHub at adamnovotnycom/machine-learning-docker-template. The general template structure looks as follows:

File structure
  1. Dockerfile defines the development environment and uses Miniconda as the base image
FROM continuumio/miniconda3
...
RUN conda env create -f conda.yml
RUN echo "source activate dev" > ~/.bashrc
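
Because the Dockerfile activates the `dev` environment through `~/.bashrc`, it can be useful to confirm from inside the container that the expected environment is actually active. The sketch below is one possible check, assuming conda sets the `CONDA_DEFAULT_ENV` variable on activation; the function names are hypothetical and not part of the template.

```python
import os
from typing import Mapping, Optional


def active_conda_env(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the active conda environment, or None if none is active."""
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV")


def is_dev_env_active(environ: Optional[Mapping[str, str]] = None, expected: str = "dev") -> bool:
    """Check that the active environment matches the one the Dockerfile activates."""
    return active_conda_env(environ) == expected


if __name__ == "__main__":
    if not is_dev_env_active():
        raise SystemExit(f"Expected conda env 'dev', found {active_conda_env()!r}")
    print("conda env 'dev' is active")
```

Note that `~/.bashrc` is typically read only by interactive bash shells, so a check like this is most informative when run from the same kind of shell the project uses day to day.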
