Summary

I did my PhD from 2020 to 2023 at the École des Ponts in the computer vision team IMAGINE (LIGM, École des Ponts, Univ Gustave Eiffel, CNRS) and at the IGN—the French Mapping Agency—in the Spatio-Temporal Structures for Spatial Analysis team of the LASTIG (LASTIG, Univ Gustave Eiffel, IGN/ENSG), advised by Mathieu Aubry and Loïc Landrieu. I am interested in optimisation and machine learning for 3D data, with a focus on unsupervised learning, interpretability, and real-time applications.

🎉 News 🎉

🎤 Oral | 🎓 Tutorial | 🖼 Poster | 👨‍🏫 Invited talk

📜 Publications 📜

Learnable Earth Parser: Discovering 3D Prototypes in Aerial Scans, CVPR, 2024
Romain Loiseau, Elliot Vincent, Mathieu Aubry, Loïc Landrieu
Paper | Webpage | Code | Download EarthParserDataset | Dataset's Toolbox

We propose an unsupervised method for parsing large 3D scans of real-world scenes into interpretable parts. Our goal is to provide a practical tool for analyzing 3D scenes with unique characteristics in the context of aerial surveying and mapping, without relying on application-specific user annotations. Because it requires no manual annotation, our method offers a significant advantage over existing approaches as a practical and efficient tool for 3D scene analysis.

OpenStreetView-5M: The Many Roads to Global Visual Geolocation, CVPR, 2024
Guillaume Astruc, Nicolas Dufour, Ioannis Siglidis, Constantin Aronssohn, Nacim Bouia, Stephanie Fu, Romain Loiseau, Van Nguyen Nguyen, Charles Raude, Elliot Vincent, Lintao XU, Hongyu Zhou, Loïc Landrieu
Play | Paper | Webpage | Code | Data

Determining the location of an image anywhere on Earth is a complex visual task, which makes it particularly relevant for evaluating computer vision algorithms. Yet, the absence of standard, large-scale, open-access datasets with reliably localizable images has limited its potential. To address this issue, we introduce OpenStreetView-5M, a large-scale, open-access dataset comprising over 5.1 million geo-referenced street view images. To demonstrate the utility of our dataset, we conduct an extensive benchmark of various state-of-the-art image encoders, spatial representations, and training strategies.

Online Segmentation of LiDAR Sequences: Dataset and Algorithm, ECCV, 2022
Romain Loiseau, Mathieu Aubry, Loïc Landrieu
Paper | Webpage | Helix4D Implementation | Download HelixNet | HelixNet Toolbox

First, we introduce HelixNet, a 10-billion point dataset with fine-grained timestamps and sensor rotation information. Second, we propose Helix4D, a compact and efficient spatio-temporal transformer architecture specifically designed for rotating LiDAR sequences that reaches accuracy on par with the best segmentation algorithms with a reduction of over 5× in terms of latency and 50× in model size.

A Model You Can Hear: Audio Identification with Playable Prototypes, ISMIR, 2022
Romain Loiseau, Baptiste Bouvier, Yann Teytaut, Elliot Vincent, Mathieu Aubry, Loïc Landrieu
Paper | Webpage | Code

We propose an audio identification model based on learnable spectral prototypes. Our model can be trained with or without supervision and reaches state-of-the-art results for speaker and instrument identification, while remaining easily interpretable.

Representing Shape Collections with Alignment-Aware Linear Models, 3DV, 2021
Romain Loiseau, Tom Monnier, Mathieu Aubry, Loïc Landrieu
Paper | Webpage | Code | Slides | Long video | Short video

We characterize 3D shapes as affine transformations of linear families learned without supervision, and showcase the advantages of this representation on large shape collections.

This work is an extension of the Deep Transformation-Invariant Clustering framework from Tom Monnier et al. for 3D tasks such as clustering and few-shot segmentation.

💼 Short Resume 💼

2023- I moved to a position in the French Ministry of Economy and Finance, and continue to pursue some research activities as an associate researcher at the LIGM.
2020-2023 PhD student on "Semantic segmentation of dynamic 3D point clouds" supervised by Mathieu Aubry and Loïc Landrieu.
Summer 2019 Research intern on generative adversarial networks for MRI spine labelling at GE Healthcare supervised by Mathieu Aubry.
2018-2019 Mathematics, Vision and Learning (MVA) master of the ENS-Cachan & engineering program of École des Ponts ParisTech.
Summer 2018 Research intern on the semantic segmentation of 3D point clouds using deep learning at Bentley Systems with Renaud Keriven.
2015-2018 Master in computer science and biology at the École polytechnique.

You can find my detailed resume here.

LASTIG ML/CV reading group

For the year 2022-2023, I took over the LASTIG ML/CV reading group. The list of talks can be found here: romainloiseau.fr/lastig-reading-group/