BigCode – Training large language models in an open and responsible way
Leandro von Werra
In this presentation, Leandro will share several accomplishments of the BigCode project, an open scientific collaboration working on the responsible development and use of LLMs for code generation. These include:

* The StarCoder models: 15.5B-parameter models with an 8K context length, fill-in-the-middle capability, and multi-query attention.
* The Stack: 6.4 TB of permissively licensed source code, with an inspection tool and an opt-out mechanism.
* Novel insights on the Chinchilla scaling laws, suggesting we haven't reached the limit of training smaller LLMs for longer.
Leandro von Werra is a machine learning engineer on the open source and research teams at Hugging Face. He is the creator of TRL, a popular Python library that combines transformers with reinforcement learning. He also co-leads the BigCode project, which aims to develop large language models for code in an open and responsible way, producing models such as StarCoder.
Time: 13.00 – 14.30
Date: Tuesday 14 November