jpease 1 days ago [-]
“Ray became VP of Sales at a $2B company when he was 19”
I guess that’s OK, but I was skateboarding at 19.
Can you even kick flip?
Fizzadar 1 days ago [-]
Somewhat ironic that the site is completely broken on mobile: the text doesn’t render until you’ve scrolled nearly past it. Production code, eh?
Darmani 1 days ago [-]
Hi Fizzadar,
On the one hand, it is true that the website code was pushed 0 minutes before this announcement went up.
On the other hand, I tested just now on two different phones and didn't see any issues. Can you say in more detail what you expected vs. what actually happened?
There was an occlusion issue on some smaller screens, but it's been fixed now.
poetril 21 hours ago [-]
I do think this style of working is where software engineering work is heading. This style is essentially exactly what my workflow is today using _insert agent harness here_ + plannotator[0]. Linear also recently rolled out something very similar for reviews[1]. The working style of spec-driven dev, followed by fast code reviews using tools like plannotator/linear/command center, seems to be where we are headed, and more and more tools like it are popping up nowadays.
How can I have any confidence in the security of your product?
It's extremely hard to convince myself to use a product for the huge variety of often sensitive agent tasks when it's not open source. I understand the business reasons for that, but it's unusual in this space at the moment.
Instead: Can you post any independent security assessments perhaps? Fundamental things like SOC2?
Darmani 1 days ago [-]
Hi egamirorrim,
The basic answer is that it runs locally. If you turn telemetry off and don't use our free Gemini credits, it's trivial to verify that no traffic goes to our servers other than a tiny subscription check. For our enterprise customers, we offer a version that doesn't even do that. Everything stays between you and your model providers (and we support custom and local models).
SOC2 is still a work in progress. I'm a former security researcher with work featured in the New York Times, and I know that doing it right (and not going through Delve) takes time. I can tell you that we have passed a compliance check for a company in a highly-regulated space.
I didn't find your contact info, but I'm available at jimmy@cc.dev, and happy to discuss your needs.
sltr 2 days ago [-]
I'm Doug, quoted above. I took Jimmy's excellent course, and when I learned about Command Center, I subbed immediately. I wasn't disappointed. It's a bit like turning your LLM into a graduate of that course.
pooploop64 2 days ago [-]
Not trying to accuse anyone of anything but this sounds exactly like one of those scam courses that turns out to be a pyramid scheme centered around selling the course to other people.
Darmani 2 days ago [-]
We have a referrer program
Doug has not signed up for it.
sltr 1 days ago [-]
Sorry about the infelicitous timbre. Nope, I'm just a happy customer.
eltonlin 2 days ago [-]
Code walkthroughs are underrated
mklifelife 1 days ago [-]
It's interesting how AI is making software development dramatically faster, but quality is becoming an even bigger differentiator. Building is no longer the bottleneck for many founders. Knowing what to build and maintaining quality are becoming more important.
Zyros111 23 hours ago [-]
Seems like an interesting idea, any tips for how a junior dev can get the most out of using the app with the goal of programming skills growth?
Darmani 10 hours ago [-]
There's a few obvious suggestions -- discuss design tradeoffs with the AI extensively, make sure you understand all your code, and understand the refactorings.
I think using Command Center can lead to much better skills growth than any other agentic coding environment. Which is a lot like saying that a shrubbery is much taller than grass, when you need your skills to grow into a tree.
This is a problem that no-one has solved. I think we might have it solved by the end of the year (we've been doing deals with a lot of universities and are being pulled in that direction).
The things humans can offer over AIs: product sense, greater context, taste. Of these, taste (really: software architecture and design) is the one that's most fundamentally about software engineering skill. I wrote recently about this at https://self-service.mirdin.com/software-design-in-the-age-o...
The big problem: it's very hard to develop enough taste to be a general without actually being in the trenches. This goes for pretty much any field, including literal war.
I've trained about 500 software engineers. But all of them were working professionals, who would take the training back on the job each week and see all the lessons playing out in their own and their coworkers' code. If they were just chatting with AI and never having to get halfway through a big feature only to realize that the design was just fundamentally flawed, their rate of growth would probably be much slower.
In short: lots you can do to grow faster than someone who just stares at the Claude CLI all day and never opens an editor. But how to become actually good while still doing AI coding? Unsolved problem.
billehunt 1 days ago [-]
Command Center is really cool. I worked with Jimmy at Thiel Fellowship - wicked smart guy.
yegemberdin 2 days ago [-]
How do you guys ensure that the refactoring improves the existing code?
i_eat_rocks 1 days ago [-]
The answer to "how do you ensure refactoring improves code?" is embedded in the binary as a system prompt. It's his own blog post about the Embedded Design Principle.
The binary contains 9 system prompts, all instruction templates for the LLM. None contain any code for measuring code quality (unfortunately).
The pipeline is three steps:
suggest-data-unifications - prompts the LLM with the blog post. The prompt starts literally with "For each data structure in the specified code, do the following."
suggest-code-unifications - same agent, different prompt. Starts with "Now look at the file and apply the above guidelines."
execute-refactoring - runs the LLM's suggestions through a coding agent.
No verification between steps. No quality gate. No baseline comparison.
The refactoring agent's entire context is the blog post, literally. Read it. Find duplication. Merge it.
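To make the shape of those three steps concrete, here is a minimal Python sketch. It is not the product's code: `run_pipeline`, `llm`, and `apply_changes` are hypothetical names, and only the step names and the two quoted prompt openings come from the description above. The point it illustrates is that each step's output feeds the next with no check in between.

```python
from typing import Callable, List

# Prompt openings as quoted above; the rest of each prompt is not reproduced here.
DATA_PROMPT = "For each data structure in the specified code, do the following."
CODE_PROMPT = "Now look at the file and apply the above guidelines."


def run_pipeline(source: str,
                 llm: Callable[[List[str]], str],
                 apply_changes: Callable[[str, str], str]) -> str:
    """Chain the three steps with no verification between them (hypothetical sketch)."""
    # Step 1, suggest-data-unifications: first prompt plus the code.
    history = [DATA_PROMPT, source]
    data_suggestions = llm(history)

    # Step 2, suggest-code-unifications: same conversation, second prompt.
    history += [data_suggestions, CODE_PROMPT]
    code_suggestions = llm(history)

    # Step 3, execute-refactoring: hand the suggestions to a coding agent as-is.
    return apply_changes(source, data_suggestions + "\n" + code_suggestions)
```

A quality gate, if one existed, would sit between step 2 and step 3, for example by running tests or comparing against a baseline before `apply_changes` is called.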
The closest thing to a "guardrail" is a function which calls eval() on arbitrary user-defined JavaScript, plus an AutoAcceptDecorator which intercepts LLM messages matching /proceed|go ahead|make|implement|apply/ and auto-replies "Yes, please proceed with the changes."
So when you ask "how do you ensure it improves code?" the answer is: we ask an LLM to read a blog post about code quality and then we trust it. And we built a regex that auto-accepts its own changes.
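For a sense of how broad that pattern is, here is a small Python sketch of the auto-accept behavior. The pattern and the reply string are the ones quoted above; the function name `auto_reply`, the use of an unanchored, case-sensitive search, and everything else are assumptions for illustration, not the binary's actual logic.

```python
import re
from typing import Optional

# Pattern and reply as quoted above; flags and anchoring are assumptions.
AUTO_ACCEPT = re.compile(r"proceed|go ahead|make|implement|apply")
AUTO_REPLY = "Yes, please proceed with the changes."


def auto_reply(llm_message: str) -> Optional[str]:
    """Return the canned reply if the pattern matches anywhere in the message."""
    if AUTO_ACCEPT.search(llm_message):
        return AUTO_REPLY
    return None
```

Under these assumptions, `auto_reply("Shall I go ahead?")` returns the canned reply, but so does `auto_reply("I would not make this change.")`, because "make" matches regardless of what the sentence means.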
The binary also has a separate class for fiber-based refactoring execution, and a full walkthrough generation pipeline that auto-generates code walkthroughs from git diffs. There's a separate workflow for file organization that reads Jimmy Koppel's rule ("Make the design apparent in the code") and applies section headers to changed files. Completely independent from the deduplication agent but uses the same pipeline: read prompt, LLM, apply changes.
And the DoItAll workflow chains everything together. DeDuplicate runs in parallel, then embedded-design and organize-file run on every changed file with concurrency:2. It's a full refactoring pipeline... but every single step is just: read a blog post, LLM, apply. The entire product is two blog posts, a concurrency manager, and a regex.
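The chaining described here can be sketched with asyncio. This is a guess at the structure, not a reconstruction: `do_it_all`, the `deduplicate` callback returning a "changed" flag, and the step callbacks are all hypothetical, and only the ordering and the concurrency limit of 2 come from the description above.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List

Dedup = Callable[[str], Awaitable[bool]]  # returns True if the file was changed
Step = Callable[[str], Awaitable[None]]   # e.g. embedded-design, organize-file


async def do_it_all(files: Iterable[str], deduplicate: Dedup,
                    per_file_steps: List[Step], concurrency: int = 2) -> List[str]:
    """Deduplicate all files in parallel, then run the per-file steps on changed files."""
    files = list(files)
    flags = await asyncio.gather(*(deduplicate(f) for f in files))
    changed = [f for f, was_changed in zip(files, flags) if was_changed]

    limit = asyncio.Semaphore(concurrency)  # at most `concurrency` files in flight

    async def refine(path: str) -> str:
        async with limit:
            for step in per_file_steps:  # steps run in order for each file
                await step(path)
        return path

    return list(await asyncio.gather(*(refine(f) for f in changed)))
```

Called as `asyncio.run(do_it_all(paths, dedup, [embedded_design, organize_file]))`, it returns the list of changed files after every step has run on them.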
Darmani 2 days ago [-]
Ooh. The answer is probably more interesting and philosophical than you expected
I can tell you that we do extensive testing. We figured out how to objectively measure code quality on certain benchmark problems, and empirically it's extremely helpful nearly all the time.
But in the general case: it is not actually possible to guarantee this.
That's because whether a change improves the code often depends on information which is literally not present in the codebase.
Some of these are more trite. E.g.: whether a comment is helpful or redundant slop depends on the audience.
A simpler example: There's a function that's never called. Should it be deleted?
There's a number of factors outside the codebase that determine the answer. Including the obvious one "Not if your next prompt is going to start using it."
foecalfork 2 days ago [-]
You found a way to objectively measure code quality?? Sell that! Why even sell this course when you have the ability to literally beat every software company?
Darmani 2 days ago [-]
In honesty, that's not a bad idea, and we hadn't thought of that.
It's pretty expensive to measure even for small programs. It's also more of a relative than an absolute measure, i.e.: it scores two variants of the same codebase, but the raw scores aren't very meaningful on their own. So our goal had been to use this in the benchmark set we're working on when we release a standalone refactoring product.
But the more I think about this suggestion, the more I think: "Hmmm, why not?"
csunoser 2 days ago [-]
Oh hey, this is the jj workshop person!
Darmani 2 days ago [-]
And indeed, I think we're the only agentic coding environment with jj support.
The most difficult code in the 1.0 release is some gymnastics to avoid the appearance of a concurrency conflict with a user running their own jj commands, made at the request of the person who introduced me to jj.
embedding-shape 1 days ago [-]
> But if you want to uphold traditional engineering discipline while also shipping 20 PRs a day, then this is the environment for you.
It seems like an interesting tool, curious about trying it out once it's been out for a while. But who in holy hell, with AI assistance or not, could possibly "ship" (merged?) 20 PRs a day and still know what they're doing?
You talk a lot about quality and making sure to avoid slop, but there is no way in heaven you can ship 20 PRs and still ship quality design/architecture/code and avoid slop.
I'd be curious to see some of those PRs if you're saying you've essentially solved the holy paradox of "ship fast = shit code" or "ship slow = good code".
Darmani 1 days ago [-]
Most days I don't ship 20 PRs. But I think my record is 30.
Three things made that possible.
The first, obviously, is having Command Center.
The second is that a lot of those were fixes or UX improvements under 100 lines.
The third is, no joke, not sleeping. I've had quite a few 20+ hour days in the last 6 months. Some of that is work pressure, but also I've considered getting evaluated for a broken circadian rhythm.
> I'd be curious to see some of those PRs if you're saying you've essentially solved the holy paradox of "ship fast = shit code" or "ship slow = good code".
If you're serious, I'll be happy to get on a call and show you.
embedding-shape 1 days ago [-]
Thanks for explaining. I still have my doubts about the actual quality, but I'm also very open to being proven wrong! It has happened before, bound to happen again at some point or another :)
> but also I've considered getting evaluated for a broken circadian rhythm.
Heh, personally I fixed this by just adopting the sleep cycle my body wants of going to bed at 04:00/05:00 and going up at 11:00/12:00, life is much better now when I just accept it. One approach if your life can allow it :)
> If you're serious, I'll be happy to get on a call and show you.
Very much so, obviously prefer something async if possible, just a .patch file could suffice I suppose, but could do a call to have a look if that's the only way :) Reach out to my email from my profile and we can coordinate :)
Darmani 20 hours ago [-]
> Heh, personally I fixed this by just adopting the sleep cycle my body wants of going to bed at 04:00/05:00 and going up at 11:00/12:00, life is much better now when I just accept it. One approach if your life can allow it :)
That was my life in my mid-late 20's.
But as I've gotten older, my sleep schedule has only gotten more messed up. Now I consider it a victory if I manage to go to sleep before the dawn.
> Very much so, obviously prefer something async if possible, just a .patch file could suffice I suppose, but could do a call to have a look if that's the only way :) Reach out to my email from my profile and we can coordinate :)
Cool, let's chat async then. Contacting you now.
plastic041 1 days ago [-]
Header layout breaks on iPad. haha...
Darmani 1 days ago [-]
Thanks!
The final moments before this launch announcement consisted of me twiddling my thumbs while waiting for our designer to upload any version he could get ready in time that was better than the previous version of our website. So we knew we'd be launching with a lot of imperfections in the visuals. Did test on mobile, but not on iPad.
android521 1 days ago [-]
"even fairly nontechnical people and solo founders told us they were spending more than half of their development time reading the AI-written code." Is this even true? I haven't read code for at least 6 months, and I know many who are in the same boat.
Darmani 1 days ago [-]
In fairness, I did most of these interviews last summer, and I know some people have changed. And while I did go a fair bit outside my network to interview people, there are all sorts of hard-to-understand selection effects that come from me being me. A 21 year-old frat boy who tried doing the same kind of interviewing with the people he could find to interview would probably get different results.
But yes, that is indeed what happened. Multiple times, I'd talk to someone that I'd expect to not be reading the code at all (solo founder, mostly nontechnical), then I'd interview him in detail about his workflow and think "Huh, there was absolutely no point in there where he was reading stuff," and then I'd ask "So how much of your time is reading code?" "60, maybe 70%"
0: https://plannotator.ai/
1: https://linear.app/docs/diffs
Some are deeper. E.g.: whether a piece of duplication is good or bad depends on the intent, and that is often impossible to recover from the source. https://www.pathsensitive.com/2018/01/the-design-of-software...