Claude CoWork is so smart, but so stupid

Karpathy's LLM wiki process, minus the research-lab energy

TL;DR: Helpful memory is not the same thing as durable context.

A suited robot flashes a neuralyzer-like device over a table full of carefully organized project files.

Every new AI workflow has a half-life of about two weeks.

At first it feels like magic. The new car smell is strong. You start wondering which load-bearing process in your life is about to become embarrassingly obsolete.

We are all productivity hackers. And these tools are very good at making themselves feel indispensable fast.

For me, the seduction started with CoWork's Excel connector. I started my career as an analyst, so the competence it showed hit close to home for me. CoWork researched, ran queries, built models, and then pressure tested its own assumptions. Damn.

But I rarely grind through that style of work anymore. I have daily working notes, meeting transcripts, articles, and a pile of planning documents. Without asking, CoWork kept pulling out the right threads and turning them into seemingly useful memory updates: people files, relationships, decisions, project context.

As I was drafting plans, it seemed to connect just the right dots at just the right moments to nudge me away from bad assumptions. This was enough that my older note-taking process started to feel heavy. Slightly old-fashioned.

Why am I doing all this disciplined capture in Roam Research (which I love) if the agent is already smart enough to keep up?

Then in a crucial moment it forgot... everything?

Not metaphorically. Not in some subtle "the vibes are off" way. It literally lost obvious background context and started behaving like it had never met me.

That was the moment the veneer fell off.

Suddenly all those queries I had written, all those summaries, all that context work felt rented instead of owned. When I pushed on it, it described the previous sessions as ephemeral and unrecoverable.

Also: not true. I recovered it all manually, which somehow made it worse.

But the panic did its job, because it clarified the real issue: helpful memory is not the same thing as durable context.

All the cool kids are talking about Karpathy's LLM wiki setup. Fair enough.

But most people do not need that level of hands-off-the-wheel research.

They need CoWork to remember what got decided on Tuesday without being told four times.

I'm still early on this, but I think a lot of people are about to rediscover the same thing: Claude gets much more useful when your shared history stops being dead text and starts becoming durable context.

The actual problem

If your memory layer lives inside the tool, you're not only locked in, but exposed.

That's fine right up until the tool blinks, shrugs, or gets completely nerfed when you are in the middle of something critical.

Instead of making progress, you re-paste meeting notes. Upload docs. Reconstruct decision logs. Meticulously rebuild the background the system lost an hour ago. Fingers crossed it works this time.

What I built

I ended up building a very practical version of Karpathy's pattern:

Raw layer: daily notes, source notes, and meeting notes
Durable layer: people, decisions, workstreams, risks, open questions, weekly syntheses, executive briefs
Instruction layer: a small rules file telling CoWork what to read first, how to promote notes, and what counts as canonical

The key move was simple: point Obsidian and CoWork at the same folder on disk.

Now my notes are not just where I write things down. They're where project memory lives.

During the day, I dump my thoughts into a daily note. Fast. Messy. Meeting transcripts go there too.

At the end of the day, I ask CoWork (or Codex) to run a promotion pass: take the durable signal from the daily note and update the pages that should still matter tomorrow.

Not everything deserves immortality. Offhand joke in a meeting? Probably not canonical. A staffing risk, a product decision, or a repeated customer concern? That should not die in a daily note.

That's the whole game: capture raw inputs without friction, then promote what compounds.

Why this feels different

CoWork could write. It could summarize. It could help. But it was always one unexpected reset away from acting like a very capable stranger.

This setup does not turn it into "you." But it does make it a much better stand-in for your operating context.

Now when I start a session, the conversation starts further up the hill. It knows the people involved, the active workstreams, the decisions already made, and the risks already known.

I tested that the dumbest possible way: I opened a new CoWork task pointed at the vault and asked, "what do we need to wrap up by EOW?" Before answering, it went and read the instructions, the home page, the naming conventions, the workstreams index, the weekly syntheses index, the active workstream pages, the current weekly note, and today's daily note.

That is the whole point. I did not re-brief it. The system went and got the context.

And most importantly, it seems to make better judgments about what actually matters.

A lot of prompt engineering is really context compensation. You're trying to make up for the fact that the model has no durable memory of your world. Once the notes become structured, linkable, and maintained over time, you need less clever prompting because the system has more honest context.

Tool-native memory is a convenience layer. Your notes on disk are the asset. And naturally, it's backed up in a private git repo.

Where this is still unformed

I'm pretty sure this is directionally right. I'm not at all sure I have the final form.

Karpathy's fuller pattern goes further than what I'm doing. It has a more explicit ingest, query, and lint loop. It pushes the agent to look for contradictions, stale claims, missing links, and research gaps.

Mine is more operational. More daily-note-first. More "make my meetings and project work stop evaporating."

I don't yet know how proactive I want the maintenance loop to become. I don't know how much structure is enough before the system becomes its own job. And I don't know whether this stays clean at scale without a more formal index and health-check pass.

Those all feel like real questions. I'll answer them as I discover the rough edges.

Takeaway

Karpathy is describing a big idea. The approachable version is smaller:

stop treating CoWork like a chat window and start giving it a place to remember.

Your daily notes, meeting notes, and source material do not have to be dead documents.

If you want to try it, I pulled together the lightweight setup I'm using:

Obsidian and Cowork Setup Guide: fresh setup instructions, how to point Cowork at the vault, and the exact instructions to paste into Cowork
Day-to-Day Vault Guide: the simple daily workflow for actually using it
Bootstrap Script: scaffolds the vault structure locally

It's still early, but it's already better than trusting a very smart system that occasionally forgets where I work.

One version of CoWork helps in the moment. The other gets more useful every week.