SAN JOSE, Calif. — Google and Baidu collaborated with researchers at Harvard and Stanford to define a suite of benchmarks for machine learning. So far, AMD, Intel, two AI startups, and two other universities have expressed support for MLPerf, an initial version of which will be ready for use in August.
Today’s hardware falls far short of running neural-networking jobs at the performance levels desired. A flood of new accelerators is coming to market, but the industry lacks ways to measure them.
To fill the gap, the first release of MLPerf will focus on training jobs, a big pain point for web giants such as Baidu and Google, on a range of systems from workstations to large data centers. Later releases will add inference jobs, eventually including ones run on embedded client systems.
Greg Diamos, a senior researcher in Baidu’s deep-learning group, gave an example of the problem web giants face. Given the size of the model and its data sets, “to train one model we really want to run would take all GPUs we have for two years,” he said.
“If systems become faster, we can unlock the potential of machine learning a lot quicker,” said Peter Mattson, a staff engineer on the Google Brain project who announced MLPerf at a May 2 event.
An early version of the suite, running on a variety of AI frameworks, will be ready in about three months. At that time, organizers aim to convene a working group to flesh out a more complete version.
“We’re initially calling it a version 0.5 release … we did this with a small team, and now we want the community to put its stamp on a version 1.0 to be something everyone owns,” said Mattson. “We encourage feedback … to suggest workloads, benchmark definitions, and results so we can rapidly iterate” the benchmark.
Continue to page two on Embedded's sister site, EE Times: “AI gets new benchmark.”