I-X Seminar Series: Training large language models in an open and responsible way with Leandro von Werra

Key Details:

Time: 13.00 – 14.30
Date: Tuesday 14 November
Location: Livestreamed

Registration is now closed.

Speaker

Leandro von Werra

Leandro von Werra is a machine learning engineer in the open source and research teams at Hugging Face. He is the creator of TRL, a popular Python library that combines transformers with reinforcement learning. He also co-leads the BigCode project, which aims to develop large language models for code in an open and responsible way, with models such as StarCoder.

Talk Title

BigCode – Training large language models in an open and responsible way

Talk Summary

In this presentation, Leandro will share several accomplishments of the BigCode project, an open scientific collaboration working on the responsible development and use of LLMs for code generation. These include:

* The StarCoder models: 15.5B-parameter models with an 8K context length, fill-in-the-middle, and multi-query attention.
* The Stack: 6.4 TB of permissively licensed source code, with an inspection tool and an opt-out mechanism.
* Novel insights on the Chinchilla scaling laws, suggesting we haven’t reached the limit of training smaller LLMs for longer.
