Extending MLCommons MIL (Modular C++ Inference Library) to support BERT/RetinaNet, TensorRT and network division
The MLCommons Task Force on Automation and Reproducibility has received an increasing number of requests to extend the MLCommons Modular C++ Inference Library (MIL) to support BERT, TensorRT and network division. Let's discuss at some point if/how we can do it ...
Hi @nv-jinhosuh. Would you be interested in discussing how to extend/improve the MLCommons CM/MIL technology to provide a common automation interface for all MLPerf inference implementations and to develop a modular C++ harness? Your feedback will help guide further community developments! Thank you very much!
Hi @gfursin Yes of course. If I remember correctly, Ethan was attending the CM/MIL meetings; let me double check with him so I can be aligned with him.
Great! Thank you very much @nv-jinhosuh !
Ethan regularly participates in our meetings, where we are currently focusing on how to provide a common CM interface for the different MLPerf implementations from different vendors.
Since CM now works well and we have good coverage for MLPerf inference, we have started getting requests to go deeper and extend the universal C++ harness for MLPerf inference (MLCommons MIL) to support more frameworks and backends. It was originally prototyped more than a year ago by @hanwenzhu (Thomas Zhu, a student at Oxford University and former intern of Vijay Janapa Reddi, MLCommons co-founder) and connected with our universal CM interface for MLPerf, but we didn't receive further requests to use or improve it ...
However, there seems to be a revival of interest in such a relatively simple, universal and performant C++ harness, and we would like to prepare a roadmap for collaborative development within MLCommons to extend it. Should we discuss it with Ethan too? It seems that you are also involved in the development of C++ harnesses for MLPerf. The idea is to get as much feedback from MLCommons members as possible before extending this community project ...
Thanks a lot again!
Thanks @gfursin - I checked with Ethan. Let us discuss altogether. @nv-etcheng FYI
Great! Thank you @nv-jinhosuh! We can use the Thursday timeslot at 11am PT (https://meet.google.com/acj-srsg-thq) or set up a separate conf-call ... Which do you prefer?
Hi Grigori, I do have a conflict on that timeslot, and Ethan will be out on Thursday. If you don't mind could we do this on Friday instead?
Sure. Do you have a preferred time slot between 9 and 11am PT this Friday? It may be easier if you send a conf-call invite with your preferred time directly to me and Arjun via [email protected] and [email protected]. Thanks a lot, and looking forward to talking to you soon!
Thanks Grigori, will do that!