I-X Seminar Series: Sequential Modeling Enables Scalable Learning for Large Vision Models with Yutong Bai
16/01/2024
16.00 - 17.00
In this presentation, Yutong will introduce a novel sequential modeling approach which enables learning a Large Vision Model (LVM) without making use of any linguistic data. To do this, she will define a common format, “visual sentences”, in which she can represent raw images and videos as well as annotated data sources such as semantic segmentations and depth reconstructions without needing any meta-knowledge beyond the pixels.