Chain of Thought Monitorability

Date: 18.07.2025

Author: Lex

Categories: Our research


Good read on model safety, but it doesn't sit easy when you put it side by side with Anthropic’s report on CoT faithfulness, not just because CoT monitorability is fragile, but also because efforts to make CoT more faithful didn’t really move the needle. And then there’s Coconut (continuous latent space reasoning), which doesn’t produce human-readable CoT at all. Seems like some reductionist approaches, like the deeper behavioral analysis Goodfire does, are still essential.


