Leon Lang
Publications
When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback
We study, theoretically and empirically, the safety issues that arise when RLHF relies on human evaluators who have limited information.
Leon Lang, Davis Foote, Stuart Russell, Anca Dragan, Erik Jenner, Scott Emmons
arXiv
Reviews
Video
Blogpost
Podcast
TAIS 2024
Poster
Evaluating Shutdown Avoidance of Language Models in Textual Scenarios
We analyze, in textual scenarios, whether language models exhibit instrumental reasoning to avoid shutdown.
Teun van der Weij, Simon Lermen, Leon Lang
Last updated on Jul 3, 2023
arXiv
A Program to Build E(N)-Equivariant Steerable CNNs
We propose a general method to implement equivariant convolutional neural networks and demonstrate it for 3D equivariant tasks. The implementation is based on the Wigner-Eckart theorem for steerable kernels.
Gabriele Cesa, Leon Lang, Maurice Weiler
Reviews + Paper
Code
A Wigner-Eckart Theorem for Group Equivariant Convolution Kernels
We generalize the famous Wigner-Eckart theorem from quantum mechanics to characterize steerable kernel spaces in representation-theoretic terms.
Leon Lang, Maurice Weiler
arXiv
Reviews
Video
Slides
Poster
Learning to Request Guidance in Emergent Communication
We analyze the training behaviour of an agent that can ask for help. Doing this is costly, and so the agent learns to become more independent in familiar situations.
Benjamin Kolb, Leon Lang, Henning Bartsch, Arwin Gansekoele, Raymond Koopmanschap, Leonardo Romor, David Speck, Mathijs Mul, Elia Bruni
Last updated on Feb 27, 2022
arXiv
Proceedings