News
Iliad is Hiring - Less Wrong
1+ hour, 18+ min ago (456+ words) Iliad is hiring for operations, research, and engineering roles. If you're excited about advancing foundational AI alignment research, we'd love to hear from you. Full job descriptions are available at https://www.iliad.ac/careers. About Iliad Iliad is a...
The Hats of Less Online - Less Wrong
5+ hour, 24+ min ago (235+ words) It is currently the evening after day two of Less Online 2026. I wish to document one popular topic of discussion among Less Online attendees: golden h...
Coalitional Darwinism and the Instrumental Utility of Individuality - Less Wrong
1+ day, 1+ hour ago (1339+ words) This post was written as part of MATS 9.1 under the mentorship of Richard Ngo. ...
Optimisation over non-stationary distributions creates weirder minds - Less Wrong
1+ day, 14+ hour ago (433+ words) Modern LLM post-training involves interleaving many different training objectives such as mixing different reward functions with supervised fine-tuning (SFT), typically in order to guard against catastrophic forgetting. However, a lot of existing theoretical work in AI safety operates under...
Two More Methods for Consistency Training and Some New Ways to Apply It - Less Wrong
1+ day, 17+ hour ago (213+ words) Authors: Sukrati Gautam*, Neil Shah*, Arav Dhoot*, Bryan Maruyama*, Caroline Wei*, Rohan Kapoor, Robert Sidey, Prakhar Gupta, Zi Cheng Huang, David D...
Thoughts on 'Learning Mechanics' - Less Wrong
3+ day, 22+ hour ago (388+ words) Last month, I stumbled upon an optimistic-sounding paper: "There Will Be a Scientific Theory of Deep Learning". ...
Society Explained: a tool for efficiently exploring >100 theories of society - Less Wrong
4+ day, 13+ min ago (260+ words) There are many competing theories of how society does and should function, from Karl Marx and Adam Smith to Steven Pinker and Eliezer Yudkowsky. These theories are often hard to understand - you may need to read an entire book (or...
Taking the Training Wheels Off: Aligning LLMs without Personas - Less Wrong
5+ day, 7+ hour ago (281+ words) If you told an AI Alignment researcher in 2018 about an alignment plan that involved collecting trajectory information of moral experts at scale...
Can LLMs even teach? Exploring the Teacher Axis - Less Wrong
5+ day, 17+ hour ago (1375+ words) TLDR: As a passionate teacher, it has pained my heart to watch my students lose deeper critical thinking skills and independent reasoning. But attempt...
Dissolving the Deep Learning Sample Efficiency Gap - Less Wrong
5+ day, 19+ hour ago (612+ words) A common observation about deep learning is that it's wildly sample inefficient compared to humans. Deep learning systems appear to need much more re...