I-X Seminar Series: Training large language models in an open and responsible way with Leandro von Werra

Key Details:

Time: 13.00 – 14.30
Date: Tuesday 14 November
Location: Livestreamed

Registration is now closed

Speaker

Leandro von Werra

Leandro von Werra is a machine learning engineer in the open source and research teams at Hugging Face. He is the creator of TRL, a popular Python library that combines transformers with reinforcement learning. He also co-leads the BigCode project, which aims to develop large language models for code in an open and responsible way, with models such as StarCoder.

Talk Title

BigCode – Training large language models in an open and responsible way

Talk Summary

In this presentation, Leandro will share several accomplishments of the BigCode project, an open-scientific collaboration working on the responsible development and use of LLMs for code generation. These include:

* The StarCoder models: 15.5B parameter models with an 8K context length, fill-in-the-middle, and multi-query attention.
* The Stack, 6.4 TB of permissively licensed source code with inspection tool and opt-out mechanism.
* Novel insights on the Chinchilla scaling laws, suggesting we haven’t reached the limit of training smaller LLMs for longer.
