Imaging Grants

Video-based action segmentation by learning world models from language

Status
Ongoing
Supervisors
Prof. Alexander Mathis, Prof. Antoine Bosselut

Many questions in biology, from development to neuroscience and medicine, require the identification of fine-grained behaviors. Researchers will develop novel computer vision and natural language processing technology to improve behavioral analysis in biology and medicine. In this project, they will provide rich knowledge graph representations to language and vision models to learn grounded representations of complex animal behavior. They will develop new open information extraction methods that extract descriptions of the physical behavior of animals and humans from unstructured text. Unifying these extracted descriptions by species, they will construct fine-grained knowledge graph representations of animal behavior that will serve as a symbolic world model to ground representation learning of video content.

While these descriptions may only be aligned for certain images, they can be consolidated into a more expressive structured representation of a dynamics model for animals. Using these aligned structures together with language descriptions and video content, researchers will apply self-supervised training objectives to learn from video content such as nature documentaries and yoga pose videos, which are more semantically related to animal tracking videos in lab settings. Thus, while current action recognition inference systems in biology [Datta et al. 2019, von Ziegler et al. 2021, Hausmann et al. 2021] do not take priors, language, and long-range temporal reasoning into account, this project proposes to develop tri-modal systems that integrate language, video, and structured knowledge to advance action recognition. Scientists believe that these models will generalize more robustly and efficiently to various applications in biology.

Contact

Have a question about the Center? Get in touch!

Contact us if you have any questions or require support in the field of imaging.
EPFL AVP CP IMAGING
BM 4142 (Bâtiment BM)
Station 17
1015 Lausanne