Yana Hasson

Research Scientist, Google DeepMind

I am a Research Scientist at Google DeepMind. My research focuses on building AI systems for positive-impact applications, most recently climate modeling. Before that, I was a PhD student in the computer vision and machine learning research laboratory (WILLOW project team) in the Department of Computer Science of École Normale Supérieure (ENS) and at Inria Paris, where I worked on understanding first-person videos under the supervision of Ivan Laptev and Cordelia Schmid. I received an MS degree in Applied Mathematics from École Centrale Paris and an MS degree in Mathematics, Vision and Learning from ENS Paris-Saclay.


News

2026
We contributed an AI atmospheric model entry (ArchesWeatherGen) to the AI Model Intercomparison Project (AIMIP) Phase 1, a systematic evaluation framework for AI weather and climate models.
2025
2025
2024
Our paper Scaling 4D Representations is out on arXiv.
2024
Neural Compression of Atmospheric States, which focuses on compression of weather and climate data, is out on arXiv.
2024
Contributed to Gemini: a family of highly capable multimodal models.
05 / 2022
10 / 2022
New paper first-authored by Zerui Chen accepted at ECCV 2022 on hand + object shape and pose estimation.
11 / 2021
I joined DeepMind as a Research Scientist.
10 / 2021
I defended my PhD! [thesis]
08 / 2021
I am attending the 2021 CEMRACS summer program on Data Assimilation and Reduced Modeling for High Dimensional Problems in Luminy.
08 / 2021
Outstanding reviewer for ICCV 2021.
08 / 2021
New paper on arXiv on joint hand-object fitting to noisy evidence in short RGB clips.
09 / 2020
I gave a talk presenting our work on hand-object reconstruction at the CS Robotics Reading group at the University of Toronto.
08 / 2020
We held the WiCV workshop at ECCV'20!
08 / 2020
I completed a 3-month research internship at Facebook AI Research in Paris.
06 / 2020
06 / 2020
I completed a 6-month research internship at Google.
03 / 2020
CVPR'20 paper accepted. Code coming soon!
10 / 2019
I presented at the Mixed Reality & AI Zurich Lab Workshop during my internship at Microsoft.
02 / 2019
CVPR'19 paper accepted on hand-object reconstruction!
09 / 2018
I co-organized the 5th WiCV workshop, which took place in conjunction with ECCV'18 in Munich.
04 / 2018
I visited the Perceiving Systems team at MPI for a month.
11 / 2017
I started my PhD at WILLOW.
05 / 2017
I joined the WILLOW project team as a research intern!

Research

AIMIP Phase 1: Systematic Evaluations of AI Weather and Climate Models
Brian Henn, Christopher S. Bretherton, Nikolay Koldunov, Christian Lessig, Maria J. Molina, Troy Arcomano, Oliver Watt-Meyer, Guillaume Couairon, Renu Singh, Robert Brunstein, Yana Hasson, Antonia Jost, Noah Brenowitz, Peter Manshausen, Nathaniel Cresswell-Clay, Dale Durran, Kyle Joseph Chen Hall, Janni Yuval, Dmitrii Kochkov, Stephan Hoyer, and Ignacio Lopez-Gomez
arXiv preprint, 2026.

	@article{henn2026aimip,
		 title={AIMIP Phase 1: Systematic Evaluations of AI Weather and Climate Models},
		 author={Henn, Brian and Bretherton, Christopher S. and Koldunov, Nikolay and Lessig, Christian and Molina, Maria J. and Arcomano, Troy and Watt-Meyer, Oliver and Couairon, Guillaume and Singh, Renu and Brunstein, Robert and Hasson, Yana and Jost, Antonia and Brenowitz, Noah and Manshausen, Peter and Cresswell-Clay, Nathaniel and Durran, Dale and Hall, Kyle Joseph Chen and Yuval, Janni and Kochkov, Dmitrii and Hoyer, Stephan and Lopez-Gomez, Ignacio},
		 journal={arXiv preprint arXiv:2605.06944},
		 year={2026}
		}
								

The AI weather and climate model intercomparison project (AIMIP) establishes a standardized framework for evaluating AI-based weather and climate models. Phase 1 assesses participating models trained on ERA5 reanalysis data, tasked with simulating the atmosphere from 1979 to 2024. Models are evaluated across biases, trends, El Niño responses, temporal variability, and out-of-sample generalization.

Evaluating Skill and Stability of ArchesWeather and ArchesWeatherGen under Multi-Decadal Climate Simulations
Renu Singh, Robert Brunstein, Antonia Jost, Thomas Rackow, Claire Monteleoni, Yana Hasson, Christian Lessig, and Guillaume Couairon
arXiv preprint, 2026.

	@article{singh2026archesweather,
		 title={Evaluating Skill and Stability of ArchesWeather and ArchesWeatherGen under Multi-Decadal Climate Simulations},
		 author={Singh, Renu and Brunstein, Robert and Jost, Antonia and Rackow, Thomas and Monteleoni, Claire and Hasson, Yana and Lessig, Christian and Couairon, Guillaume},
		 journal={arXiv preprint arXiv:2605.29976},
		 year={2026}
		}
								

We evaluate ArchesWeather (deterministic) and ArchesWeatherGen (probabilistic flow-matching) when adapted for multi-decadal climate simulation following the AIMIP Phase 1 protocol. Despite their origins in short-term weather forecasting, both models produce stable long-term climate simulations, maintaining a stable annual cycle, capturing climate variable drift, and faithfully reproducing ERA5 climatology, large-scale circulations, and interannual variability.

Flamingo: a Visual Language Model for Few-Shot Learning
NeurIPS, 2022.

	@article{alayrac2022flamingo,
		 title={Flamingo: a Visual Language Model for Few-Shot Learning},
		 author = {Alayrac, Jean-Baptiste and Donahue, Jeff and Luc, Pauline and Miech, Antoine and Barr, Iain and Hasson, Yana and Lenc, Karel and Mensch, Arthur and Millican, Katie and Reynolds, Malcolm and Ring, Roman and Rutherford, Eliza and Cabi, Serkan and Han, Tengda and Gong, Zhitao and Samangooei, Sina and Monteiro, Marianne and Menick, Jacob and Borgeaud, Sebastian and Brock, Andrew and Nematzadeh, Aida and Sharifzadeh, Sahand and Binkowski, Mikolaj and Barreira, Ricardo and Vinyals, Oriol and Zisserman, Andrew and Simonyan, Karen},
		 journal={arXiv preprint arXiv:2204.14198},
		 year={2022}
		}
								
Towards unconstrained joint hand-object reconstruction from RGB videos
Yana Hasson, Gül Varol, Ivan Laptev and Cordelia Schmid
3DV, 2021.
@article{hasson20_homan,
			title     = {Towards unconstrained joint hand-object reconstruction from RGB videos},
			author    = {Hasson, Yana and Varol, G{\"u}l and Laptev, Ivan and Schmid, Cordelia},
			journal   = {arXiv preprint arXiv:2108.07044},
			year      = {2021}
	}

Our work aims to obtain 3D reconstructions of hands and manipulated objects from monocular videos. Reconstructing hand-object manipulations holds great potential for robotics and learning from human demonstrations. The supervised learning approach to this problem, however, requires 3D supervision and remains limited to constrained laboratory settings and simulators for which 3D ground truth is available. In this paper we first propose a learning-free fitting approach for hand-object reconstruction which can seamlessly handle two-hand object interactions. Our method relies on cues obtained with common methods for object detection, hand pose estimation and instance segmentation. We quantitatively evaluate our approach and show that it can be applied to datasets with varying levels of difficulty for which training data is unavailable.

Low Bandwidth Video-Chat Compression using Deep Generative Models
Maxime Oquab, Pierre Stock, Oran Gafni, Daniel Haziza, Tao Xu, Peizhao Zhang, Onur Celebi, Yana Hasson, Patrick Labatut, Bobo Bose-Kolanu, Thibault Peyronnel, Camille Couprie

To unlock video chat for hundreds of millions of people hindered by poor connectivity or unaffordable data costs, we propose to authentically reconstruct faces on the receiver's device using facial landmarks extracted at the sender's side and transmitted over the network. In this context, we discuss and evaluate the benefits and disadvantages of several deep adversarial approaches. In particular, we explore quality and bandwidth trade-offs for approaches based on static landmarks, dynamic landmarks or segmentation maps. We design a mobile-compatible architecture based on the first order animation model of Siarohin et al. In addition, we leverage SPADE blocks to refine results in important areas such as the eyes and lips. We compress the networks down to about 3MB, allowing models to run in real time on iPhone 8 (CPU). This approach enables video calling at a few kbits per second, an order of magnitude lower than currently available alternatives.

Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction
Yana Hasson, Bugra Tekin, Federica Bogo, Ivan Laptev, Marc Pollefeys, and Cordelia Schmid
CVPR, 2020.
@INPROCEEDINGS{hasson20_handobjectconsist,
			title     = {Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction},
			author    = {Hasson, Yana and Tekin, Bugra and Bogo, Federica and Laptev, Ivan and Pollefeys, Marc and Schmid, Cordelia},
			booktitle = {CVPR},
			year      = {2020}
	}

Modeling hand-object manipulations is essential for understanding how humans interact with their environment. While of practical importance, estimating the pose of hands and objects during interactions is challenging due to the large mutual occlusions that occur during manipulation. Recent efforts have been directed towards fully-supervised methods that require large amounts of labeled training samples. Collecting 3D ground-truth data for hand-object interactions, however, is costly, tedious, and error-prone. To overcome this challenge we present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video. Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses. Given our estimated reconstructions, we differentiably render the optical flow between pairs of adjacent images and use it within the network to warp one frame to another. We then apply a self-supervised photometric loss that relies on the visual consistency between nearby images. We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach allows us to improve the pose estimation accuracy by leveraging information from neighboring frames in low-data regimes.

Learning joint reconstruction of hands and manipulated objects
Yana Hasson, Gül Varol, Dimitrios Tzionas, Igor Kalevatykh, Michael J. Black, Ivan Laptev, and Cordelia Schmid
CVPR, 2019.
@INPROCEEDINGS{hasson19_obman,
	title     = {Learning joint reconstruction of hands and manipulated objects},
	author    = {Hasson, Yana and Varol, G{\"u}l and Tzionas, Dimitrios and Kalevatykh, Igor and Black, Michael J. and Laptev, Ivan and Schmid, Cordelia},
	booktitle = {CVPR},
	year      = {2019}
}
Estimating hand-object manipulations is essential for interpreting and imitating human actions. Previous work has made significant progress towards reconstruction of hand poses and object shapes in isolation. Yet, reconstructing hands and objects during manipulation is a more challenging task due to significant occlusions of both the hand and object. While presenting challenges, manipulations may also simplify the problem since the physics of contact restricts the space of valid hand-object configurations. For example, during manipulation, the hand and object should be in contact but not interpenetrate. In this work we regularize the joint reconstruction of hands and objects with manipulation constraints. We present an end-to-end learnable model that exploits a novel contact loss that favors physically plausible hand-object constellations. To train and evaluate the model, we also propose a new large-scale synthetic dataset, ObMan, with hand-object manipulations. Our approach significantly improves grasp quality metrics over baselines on synthetic and real datasets, using RGB images as input.

Code

geoarches
Open-source library for training, running, and evaluating ML models on geospatial, weather, and climate data.
manopth
Port of MANO differentiable hand as a PyTorch differentiable layer.
kinetics_i3d_pytorch
Port of I3D network for action recognition to PyTorch. Transfer of weights trained on Kinetics dataset.
torch_videovision
Utilities for video data-augmentation.
inflated_convnets_pytorch
Inflation from image input to video inputs of ResNets and DenseNets. Weights initialized based on ImageNet.
useful-computer-vision-phd-resources
Some useful tips and resources for PhDs in computer vision
handobjectconsist
[cvpr20] Demo and training code for sparsely-supervised Hand-Object reconstruction with photometric supervision
obman_train
[cvpr19] Demo, evaluation and training code for Hand-Object reconstruction
homan
Hand-Object joint fitting to noisy evidence

Miscellaneous

Podcasts I listen to

Articles I enjoyed and/or learned from

Books