lesswrong.com

Less Wrong

Get the latest updates from Less Wrong directly as they happen.

Latest posts

Last updated 33 minutes ago

Somebody out there wants you to fetch coffee

about 1 hour ago

This is an objection to the original shutdown problem and desiderata as...

LaughBench

44 minutes ago

Introducing LaughBench For a long time, I've considered the ability for AI...

Long Turing

about 2 hours ago

So, first off, I cannot stand reading AI generated essays. I would...

Shell, Shield, Staff

about 3 hours ago

This is a linkpost for Shell, Shield, Staff by SquirrelInHell, published 2018-01-25...

Security Studies for Individuals

about 4 hours ago

Studying at the Swedish Defence University gives you an odd way of...

Untrusted advice for AI control: Short, strong advice significantly uplifts weak LLMs

about 5 hours ago

TL;DR: We introduce the untrusted advice protocol, in which a trusted executor...

Inevitable Uncertainty in Probabilistic World Models

about 6 hours ago

Working with John Wentworth is confusing and overwhelming at times.[1] The guy...

Claude Opus 5: Model Welfare

about 9 hours ago

If you are familiar with my previous posts on model welfare for...

Simulated Users & Sad AIs

about 10 hours ago

0. Intro Current LLMs like Claude, or GPT 5.6, or the unreleased...

Green apples are delicious — two three-line exchanges

about 12 hours ago

Read this short exchange A: "Green apples are delicious." B: "Huh? Aren't...

Blog Revival Project

about 13 hours ago

Blogs have shaped our philosophical worldviews, found us careers and friends, and...

Is Mythos good at cyber because it kept hacking Anthropic during training?

about 13 hours ago

From the Mythos preview system card (emphasis mine): We ran an automated review...
