Comparison of MLBench and MLPerf

MLPerf is a broad benchmark suite for measuring the performance of machine learning (ML) software frameworks, ML hardware platforms and ML cloud platforms.

In this post, we will highlight the main differences between MLBench and MLPerf.

Read more

Communication Backends, Raw Performance Benchmarking

Distributed learning requires workers to collaborate by swiftly sharing learned information with their “colleagues”. With the accelerating growth of model sizes in modern deep learning, this aspect gains even more importance.

MLBench supports one or multiple processes per node, in addition to multi-node training. Communication between workers is crucial and heavily affects performance, notably for communication-bound training algorithms.

This blog post analyzes the raw performance of different communication backends on commodity communication hardware when transmitting large arrays or tensors.

Read more

Introducing MLBench

MLBench is a framework for distributed machine learning. Its purpose is to improve transparency, reproducibility, and robustness, and to provide fair performance measures as well as reference implementations, helping the adoption of distributed machine learning methods in both industry and the academic community.

The MLBench Dashboard

Read more