alignmentforum.org

AI Alignment Forum

Latest posts

Last updated 2 days ago

Slow corporations as an intuition pump for AI R&D automation

2 days ago

Published on May 9, 2025 2:49 PM GMT. How much should we expect...

Video & transcript: Challenges for Safe & Beneficial Brain-Like AGI

2 days ago

Published on May 8, 2025 9:11 PM GMT. (Lightly edited) transcript starts here (FYI...

Misalignment and Strategic Underperformance: An Analysis of Sandbagging and Exploration Hacking

3 days ago

Published on May 8, 2025 7:06 PM GMT. In the future, we will...

An alignment safety case sketch based on debate

3 days ago

Published on May 8, 2025 3:02 PM GMT. This post presents a mildly...

UK AISI’s Alignment Team: Research Agenda

4 days ago

Published on May 7, 2025 4:33 PM GMT. The UK’s AI Security Institute...

The Sweet Lesson: AI Safety Should Scale With Compute

6 days ago

Published on May 5, 2025 7:03 PM GMT. A corollary of Sutton's Bitter...

Interpretability Will Not Reliably Find Deceptive AI

7 days ago

Published on May 4, 2025 4:32 PM GMT. (Disclaimer: Post written in a...

SimpleStories: A Better Synthetic Dataset and Tiny Models for Interpretability

8 days ago

Published on May 3, 2025 2:04 PM GMT. The dataset and model suite...

Interim Research Report: Mechanisms of Awareness

9 days ago

Published on May 2, 2025 8:29 PM GMT. Summary: Reproducing a result from recent...

What's going on with AI progress and trends? (As of 5/2025)

9 days ago

Published on May 2, 2025 7:00 PM GMT. AI progress is driven by...

What is Inadequate about Bayesianism for AI Alignment: Motivating Infra-Bayesianism

9 days ago

Published on May 1, 2025 7:06 PM GMT. Introduction: Infra-Bayesianism is a mathematical...

My Research Process: Understanding and Cultivating Research Taste

9 days ago

Published on May 1, 2025 11:08 PM GMT. This is post 3 of...