If you have an employee who codes 2x faster than everyone else but produces 10x the bugs, would your suggestion be to let him rip and stop reviewing his code output?
> I never suggested letting agents code for a day on end. I use AI to code well-defined tasks and treat it like a mid-level ticket taker.
It doesn’t matter how long you’re letting it run. If you aren’t reviewing the output, you have no way of knowing when it changes untested behavior.
I regularly find Claude doing insane things that I never would have thought to test against, and that would have made it into prod if I hadn’t reviewed the code.
> It doesn’t matter how long you’re letting it run. If you aren’t reviewing the output, you have no way of knowing when it changes untested behavior.
You’re focused on the output; I’m focused on the behavior. That’s the difference. It’s just like when I delegate a task to another developer or to another company, like the random Salesforce integration, or even a third-party API I need to integrate with.
Unfortunately you are not equipped to observe and test all or even most of the behavior of a non-trivial system.
And if you attempt to treat every module in your system like it’s untrusted third-party code, you’ll run into severe complexity and size limits. No one codes large systems like that because it’s not possible. There are always escape hatches and entanglements.
I am not saying they treat every other team as a third party. I am saying they treat the code itself as a black box with well-defined interfaces. They aren’t reaching into another service’s data store to retrieve information.
Then you have no idea how Amazon works. I was there from 2020-2023.
And before that I worked at 4 product companies where managers, directors, or CTOs who were new to the company needed someone to bring in best practices and build and teach teams.
I never suggested letting agents code for a day on end. I use AI to code well-defined tasks and treat it like a mid-level ticket taker.