
Paper Map: NeurIPS / CVPR / ICLR / ICML 2024–2025

tools
literature
visualization
Interactive 2D map of 26,741 accepted papers from the top ML/CV venues.
Author

Matej Gazda

Published

May 14, 2026

Most AI research in 2024–2025 is LLMs and diffusion models. The imbalance is bigger than I expected.

I pulled the accepted-paper lists from CVPR, NeurIPS, ICML, and ICLR for 2024 and 2025 (main tracks only, no workshops). For each paper I took the title and abstract, embedded them, projected to 2D with UMAP, then ran HDBSCAN to find clusters. Cluster names come from the most distinctive words in the titles. 26,741 papers, 509 clusters. Snapshot from May 2026.
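The cluster-naming step above can be sketched in plain Python: score each title word by how over-represented it is inside a cluster relative to the whole corpus, and label the cluster with the top-scoring words. This is an illustrative sketch, not the exact scoring used for the map; the stopword list and log-ratio score are assumptions.

```python
# Sketch of the cluster-labeling step: name each cluster by the title words
# most over-represented in it relative to the whole corpus.
from collections import Counter
import math

STOPWORDS = {"a", "an", "the", "of", "for", "with", "and", "in", "on", "via"}

def tokenize(title):
    return [w for w in title.lower().split() if w.isalpha() and w not in STOPWORDS]

def cluster_labels(titles, assignments, top_k=3):
    """Pick top_k distinctive title words per cluster via a log-ratio score."""
    corpus = Counter()
    per_cluster = {}
    for title, c in zip(titles, assignments):
        words = tokenize(title)
        corpus.update(words)
        per_cluster.setdefault(c, Counter()).update(words)
    total = sum(corpus.values())
    labels = {}
    for c, counts in per_cluster.items():
        n = sum(counts.values())
        def score(w):
            # how much more frequent the word is inside the cluster than overall
            return math.log((counts[w] / n) / (corpus[w] / total))
        ranked = sorted(counts, key=lambda w: (score(w), counts[w]), reverse=True)
        labels[c] = ranked[:top_k]
    return labels

# Toy titles standing in for the 26,741 real ones.
titles = [
    "Scaling RLHF for Preference Alignment",
    "Direct Preference Optimization Revisited",
    "Gaussian Splatting for Dynamic Scenes",
    "Compressing Gaussian Splatting Models",
]
print(cluster_labels(titles, [0, 0, 1, 1]))
```

The log-ratio favors words that are both frequent in the cluster and rare elsewhere, which is why the real map's labels read like "RLHF / preference / alignment" rather than "learning / model / neural".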

Every dot is a paper. Nearby dots are about similar topics. Hover for the title, click the venue buttons to filter, zoom with the mouse. Full screen view: papers_map.html.

What’s actually big

The biggest clusters each sit between 200 and 300 papers. Grouped roughly:

Language

  • RLHF and preference alignment
  • LLM agents and code generation
  • LLM reasoning, math, chain-of-thought
  • In-context learning, induction heads
  • LoRA and parameter-efficient fine-tuning

Vision

  • Video understanding and temporal grounding
  • Human motion and hand interaction
  • Text-to-image generation
  • 3D Gaussian splatting
  • Diffusion sampling, consistency models, flow matching

Methods, RL, theory

  • Continual / class-incremental learning
  • Linear attention and state-space models (Mamba etc.)
  • Federated learning
  • World models and goal-conditioned RL
  • Multi-agent RL, offline RL
  • Differential privacy, membership inference attacks
  • Conformal prediction, spiking neural networks
  • Molecular ML and drug discovery

Crowded but a step down: time-series forecasting, point cloud segmentation, LiDAR/radar detection, pruning and sparsity. There’s also a surprisingly fat cluster around grokking, two-layer networks, and Kolmogorov-Arnold networks, which I read as the small-theory subfield being healthier than people give it credit for.

Where almost nobody is

Tiny clusters, 7-10 papers each, the kind of niche where you can read everything in a weekend:

  • Text-to-SQL
  • Event-based vision and depth
  • Face restoration and rigging
  • Protein conformational dynamics
  • Link prediction
  • DNA regulatory sequence design
  • Counterfactual generation
  • Self-supervised equivariance

A few subfields barely register at these four venues. Event cameras and DNA sequence design are good examples. Either the community is publishing at specialized venues (NeurIPS Datasets & Benchmarks, MICCAI, ISMB, CVPR workshops), or there genuinely isn’t enough work to fill a session.

How I’d use it

  • If you’re about to claim your idea is new, look up where it would land on the map and read the closest five papers first. Cheaper than finding out during rebuttal.
  • If you’re picking a PhD topic, look for small islands sitting next to a big cluster. That’s usually a gap with a path back to the mainline.
  • If you’re reviewing, the map is a fast sanity check on the “underexplored” framing in someone’s introduction.
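The "read the closest five papers" step is just a nearest-neighbour query over the 2D projection. A minimal sketch, assuming each paper is stored with its UMAP coordinates (the data below is made up):

```python
# Find the k papers nearest to a point on the 2D map, by Euclidean distance.
import math

def closest_papers(papers, x, y, k=5):
    """Return the k papers nearest to (x, y) in the 2D projection."""
    return sorted(papers, key=lambda p: math.hypot(p["x"] - x, p["y"] - y))[:k]

# Toy data standing in for the real per-paper UMAP coordinates.
papers = [
    {"title": "RLHF at Scale", "x": 0.1, "y": 0.2},
    {"title": "Gaussian Splatting Survey", "x": 5.0, "y": 5.1},
    {"title": "DPO without a Reference Model", "x": 0.3, "y": 0.1},
]
for p in closest_papers(papers, 0.0, 0.0, k=2):
    print(p["title"])
```

For 26,741 points a brute-force sort is already fast enough; a k-d tree only starts to matter if you query in a tight loop.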

Caveats

Embedding clusters reflect title and abstract wording. Two papers can sit in the same cluster and be doing completely different things, just because they share buzzwords. Two papers can sit far apart and be closer in practice than the map suggests. Use this as a starting point, not as a literature review.

Missing on purpose: workshops, datasets-and-benchmarks tracks, ACL, EMNLP, MICCAI, ISBI. Medical imaging venues are next on my list.

If your cluster looks wrong or you spot a paper that’s clearly mislabeled, ping me on GitHub. I’ll rerun it.


© 2026 Matej Gazda

Built with Quarto. Source on GitHub.