Skip to Content

René-Jean Corneille

I like to write about Data, Machine Learning, Artificial Intelligence and my experience leading technical teams

An inventory of transformers inference optimisation methods in the HuggingFace echosystem

An inventory of transformers inference optimisation methods in the HuggingFace echosystem

Latency is one of the main challenges to making machine learning impactful for an organisation. Depending on the latency requirements and the inference methods, the emphasis on latency can be either about cost efficiency and / or about scalability. Machine Learning Engineering teams need to be able to provide value and