Leon Lang
Publications
Modeling Human Beliefs about AI Behavior for Scalable Oversight
We explain how modeling human evaluator beliefs about AI behavior can help to better interpret their feedback.
Leon Lang, Patrick Forré
arXiv
Factored space models: Towards causality between levels of abstraction
We develop a new foundation for a theory of causality, based on factored space models.
Scott Garrabrant, Matthias Georg Mayer, Magdalena Wache, Leon Lang, Sam Eisenstat, Holger Dell
arXiv
The Perils of Optimizing Learned Reward Functions: Low Training Error Does Not Guarantee Low Regret
We theoretically analyze to what extent an error in a learned reward function translates into regret of the resulting policies.
Lukas Fluri, Leon Lang, Alessandro Abate, Patrick Forré, David Krueger, Joar Skalse
arXiv
When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback
We theoretically and empirically study safety issues of using RLHF with human evaluators who have limited information.
Leon Lang, Davis Foote, Stuart Russell, Anca Dragan, Erik Jenner, Scott Emmons
arXiv
Reviews
Video
Blogpost
Podcast
TAIS 2024
Poster
Abstract Markov Random Fields
We use the recently generalized Hu Theorem to develop a theory of purely abstract Markov random fields.
Leon Lang, Clélia de Mulatier, Rick Quax, Patrick Forré
arXiv
Information Decomposition Diagrams Applied beyond Shannon Entropy: A Generalization of Hu's Theorem
We generalize information diagrams to functions beyond Shannon entropy, including Kolmogorov complexity and the generalization error from machine learning.
Leon Lang, Pierre Baudot, Rick Quax, Patrick Forré
arXiv
Compositionality
A Program to Build E(N)-Equivariant Steerable CNNs
We propose a general method to implement equivariant convolutional neural networks and demonstrate it for 3D equivariant tasks. The implementation is based on the Wigner-Eckart theorem for steerable kernels.
Gabriele Cesa, Leon Lang, Maurice Weiler
Reviews + Paper
Code
A Wigner-Eckart Theorem for Group Equivariant Convolution Kernels
We generalize the famous Wigner-Eckart theorem from quantum mechanics in order to characterize steerable kernel spaces in representation theoretic terms.
Leon Lang, Maurice Weiler
arXiv
Reviews
Video
Slides
Poster