MLBench: Distributed Machine Learning Benchmark

MLBench is a framework for distributed machine learning. Its purpose is to improve transparency, reproducibility, and robustness, and to provide fair performance measures as well as reference implementations, helping the adoption of distributed machine learning methods in both industry and the academic community.

MLBench is public, open source and vendor independent, and has two main goals:

to be an easy-to-use and fair benchmarking suite for algorithms as well as for systems (software frameworks and hardware).
to provide re-usable and reliable reference implementations of distributed ML algorithms.

For more details on the benchmarking tasks, see Benchmark Tasks and Benchmark Results.

Check out our blog!

Resources and Community:

GitHub: github.com/mlbench
Documentation: mlbench.readthedocs.io
Mailing list: groups.google.com/d/forum/mlbench
Slack channel: join.slack.com/t/mlbench/shared_invite/zt-6sznc8fa-_diIdB7~XtLYmCLaQuOA9Q

Features

For reproducibility and simplicity, we currently focus on standard supervised ML, including standard deep learning tasks as well as classic linear ML models.
We provide reference implementations for each algorithm and task, so that they are easy to port to a new framework.
Our goal is to benchmark all, or at least most, currently relevant distributed execution frameworks. We welcome contributions of new frameworks to the benchmark suite.
We provide precisely defined tasks and datasets to enable a fair and precise comparison of all algorithms, frameworks, and hardware.
Independently of all solver implementations, we provide universal evaluation code that makes it possible to compare the result metrics of different solvers and frameworks.
Our benchmark code is easy to run on public clouds.
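
To illustrate what solver-independent evaluation can look like, here is a minimal sketch in Python. It is not MLBench's actual evaluation code or API: the `MetricPoint` type, the `time_to_accuracy` helper, and the logged values are all hypothetical and chosen only for illustration. The idea is that any solver, in any framework, only needs to emit (elapsed time, accuracy) pairs for the same comparison code to apply.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class MetricPoint:
    """One logged evaluation (hypothetical format): seconds since start, validation accuracy."""
    elapsed_seconds: float
    accuracy: float


def time_to_accuracy(log: Sequence[MetricPoint], target: float) -> Optional[float]:
    """Return the earliest elapsed time at which accuracy reaches `target`, or None if it never does."""
    for point in sorted(log, key=lambda p: p.elapsed_seconds):
        if point.accuracy >= target:
            return point.elapsed_seconds
    return None


# Made-up logs from two hypothetical solvers run on the same task and dataset.
runs = {
    "solver_a": [MetricPoint(10.0, 0.62), MetricPoint(20.0, 0.81), MetricPoint(30.0, 0.90)],
    "solver_b": [MetricPoint(15.0, 0.70), MetricPoint(30.0, 0.79), MetricPoint(45.0, 0.80)],
}

# The same evaluation function is applied to every solver's log.
results = {name: time_to_accuracy(log, target=0.80) for name, log in runs.items()}
```

Because the evaluation depends only on the logged metrics, a solver written for a different framework can be compared without changing this code.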
