My ideal movie experience is to go during the day towards the end of a run when there's a fair chance the theatre will either be empty or almost empty.
But I haven't even done that for a while.
If it's going to be crowded - no thanks.
Tangentially, because my gf works in opera I've been to some productions, and real theatre - a real stage, with real people, and real stage effects - can have a presence and magic that cinema just can't touch.
So I'd much rather spend money on that now. It's physically less comfortable because of the seats, but as an experience it's so much more hands-on, hand-made, and satisfying, and creatively it leaves so much more space for atmosphere and implication.
Does anyone else see this as dystopian? Someone is unironically writing about how exhausted they are, up at night thinking about how they can be a better good-boy at prompting the LLM, and reminding us that we shouldn't cope by blaming the AI or its supposed limitations (context size, etc). This is not a dig at the author. It just seems crazy that this is an unironic post. It's like we are gleefully running to the "Laughterhouse", each reminding our smiling fellow passengers not to be annoyed at the driver if he isn't getting us there fast enough, without realizing it's the Slaughterhouse (yes, I am stealing the reference).
Even outside of pubs and restaurants. Six packs and cases in the grocery store are all hugely inflated. Since I like beer, I got an old freezer and built a kegerator out of it and now buy my beer by the keg. (For now) keg prices are barely reasonable. $10 for a glass of beer at a restaurant?? Fuck right off with that.
I'm not explicitly authorised to speak about this stuff by my employer but I think there's value in some observations that go beyond "It's good for me" so here's a relatively unfiltered take of what I've seen so far.
Internally, we have a closed beta for what is basically a hosted Claude Code harness. It's ideal for scheduled jobs or async jobs that benefit from large amounts of context.
At a glance, it seems similar to Uber's Minion concept, although we weren't aware of that until recently. I think a lot of people have converged on the same thing.
Having scheduled roundups of things (what did I post in Slack? what did I PR in GitHub? etc.) is a nice quality of life improvement. I also have some daily tasks like "Find a subtle cloud spend that would otherwise go unnoticed", "Investigate an unresolved hotfix from one repo and provide the backstory" and "Find a CI pipeline that has been failing 10 times in a row and suggest a fix".
I work in the platform space so your mileage may vary of course. More interesting to me are the second order effects beyond my own experience:
- Hints of engineering-adjacent roles (i.e. technical support) now empowered to try to generate large PRs implementing unscoped/ill-defined new internal services, because they don't have the background to know what's "good" or "bad". These people have always existed; there are always folks on the edge of technical-adjacent roles who aspire to become fully fledged developers without an internal support mechanism, but now the barrier is a little lower.
- PR review fatigue: As a Platform Engineer, I already get tagged on acres of PRs, but the velocity of PRs has increased, so my inbox is flooded with merged PRs (not that merged-PR notifications were ever a good signal anyway).
- First hints of technical folk who progressed off the tools being encouraged to fix those long-standing issues that are simple in their minds, even though reality has shifted a lot around them since. LLMs are generally pretty good at surfacing this once they check how things actually are, but they don't "know" what your mental model is when you frame a question.
- Coworkers defaulting to asking LLMs about niche queries instead of asking others. There are a few queries I've seen where the answer from an LLM is fine but it lacks the historical part that makes many things make sense. As an example off the top of my head, websites often have subdomains not for any good present reason but just because back in the day, you could only have like 6 XHR connections to a domain or whatever it was. LLMs probably aren't going to surface that sort of context which takes a topic from "Was this person just a complexity lover" to "Ah, they were working around the constraints at the time".
- Obviously security is a forever battle. I think we're more security minded than most but the reality is that I don't think any of this can be 100% secure as long as it has internet access in any form, even "read only".
- A temptation to churn out side quests. When I first got started, I would tend to do work after hours but I've definitely trailed off and am back to normal now. Personally I like shipping stuff compared to programming for the sake of it but even then, I think eventually you just normalise and the new "speed" starts to feel slow again
- Privileged users generating and self-merging PRs. We have one project where most everyone has force merge and because it's internal only, we've been doing that paired with automated PR reviews. It works fairly well because we discuss most changes in person before actioning them but there are now a couple historical users who have that same permission contributing from other timezones. Waking up to a changed mental model that hasn't been discussed definitely won't scale and we're going to need to lock this down.
- Signal degradation for PRs: I've seen a few PRs that provide a whole post-hoc rationalisation of what the PR does and what the problem is. You go to the source input and it's someone writing something like "X isn't working? Can you fix it?". As a result, it's really hard to infer intent and capability from a PR. Often the changes are actually quite good, but that's not a reflection of the author. To be fair, the alternative might have been that internal user giving up and never communicating that there was an issue, so I can't say this is strictly a negative.
Overall, I'm not really negative or positive. There is definitely value to be found, but I think there will probably be a reckoning: LLMs have temporarily given everyone a hall pass to move faster than the support structures can keep up.
Actually, I should probably rephrase that a little: I'm mostly positive on pure inference while mostly negative on training costs and other societal impacts. I don't believe we'll get to everyone running Gas Town/The Wasteland nor do I think we should aspire to. I like iterating with an agent back and forth locally and I think just heavily automating stuff with no oversight is bound to fail, in the same way that large corporations get bloated and collapse under their own weight.
> Since my main goal was to learn, I decided to do it "the right way". This means I didn’t want to rely on Replit or Lovable where the infra part is obfuscated. I wanted to deal with that complexity myself.
I expected OP to actually 'learn' devops, but all they did was ask LLMs to do everything.
Also...
> 180+ paid $2 for a dino
People pay $2 for an image of a dinosaur with a human face?
MCP is dead? Which CLI tool should we use to instruct Chrome to open a page and click the Open button? And to read what appears in the console after clicking?
MCP permanently sacrifices a chunk of the context window? And a skill for your CLI is free?
> it is akin to putting lipstick on a pig. It helps, but not much.
The lipstick helps? This had me in stitches. Sorry for the non-additive reply. This is the funniest way I have seen this or any other phrase explained. By far. Honestly has made my day and set me up for the whole week.
Not a FAANG engineer but also working at a pretty large company and I want to say you're spot on 1000%. It's insane how many "commenters" come out of the woodwork to tell you you're doing x or y wrong. They may not even frame it that way, but use a veneer of questions "what is your process like? Have you tried this product, etc." as a subtle way of completely dismissing your shared experience.
Agree on the complementary layers thing. You can scan the code and lock down the keys, but there's still the question of what actually happens at runtime when a tool call goes out the door.
I've been building a proxy that sits between the agent and the tool/API it calls. Every outbound request goes through a deterministic pipeline. DNS resolution to block SSRF against private IPs, pattern matching on outbound params to catch credential leakage, path traversal checks, that kind of thing. No LLM in that path, just rules, so it adds maybe 2-3ms.
The way I think about it: you scan to catch problems before deployment, you scope the keys to limit the blast radius, and the proxy catches the stuff that slips through at call time. Three layers, none of them redundant.
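To make the "deterministic pipeline, no LLM in the path" idea concrete, here's a minimal sketch of what those call-time checks could look like. All names, patterns, and the anchor policy here are hypothetical illustrations, not the actual proxy:

```python
import ipaddress
import re
import socket
from urllib.parse import urlparse

# Hypothetical secret patterns; a real deployment would load these from config.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                     # AWS access key id shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),   # PEM private key header
]

def resolves_to_private(host: str) -> bool:
    """Resolve the host and flag private/loopback/link-local targets (SSRF guard)."""
    try:
        infos = socket.getaddrinfo(host, None)
    except socket.gaierror:
        return True  # fail closed on unresolvable hosts
    for info in infos:
        # Strip a possible IPv6 zone id (e.g. "fe80::1%eth0") before parsing.
        ip = ipaddress.ip_address(info[4][0].split("%")[0])
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return True
    return False

def check_outbound(url: str, body: str) -> list[str]:
    """Return a list of policy violations for one outbound tool call."""
    violations = []
    parsed = urlparse(url)
    if ".." in parsed.path:
        violations.append("path traversal in URL path")
    if parsed.hostname and resolves_to_private(parsed.hostname):
        violations.append(f"SSRF: {parsed.hostname} resolves to a private address")
    for pat in SECRET_PATTERNS:
        if pat.search(body):
            violations.append(f"possible credential leak: {pat.pattern}")
    return violations
```

Because every step is a rule (DNS lookup, regex, substring check) rather than a model call, the latency cost stays in the low milliseconds, which matches the 2-3ms figure above.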
“Every institution perishes by an excess of its own first principle.” - Lord Acton
For the reasons explored in the post, I prefer my type systems optional. It has been my experience and observation that typing in languages follows a 90:10 rule: you get 90% of the gains for 10% of the effort, and the remaining 10% for 9x the effort.
I’m rather happy with the way Python allows me to do the easy ones and/or pick the hotspots.
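As an illustration of that pick-your-hotspots style (a made-up example, not anything from the post): annotate the one function where a type mistake actually bites, and leave the glue code untyped.

```python
from typing import Optional

def parse_port(raw: str) -> Optional[int]:
    """A typed 'hotspot': the signature tells callers they may get None back."""
    try:
        port = int(raw)
    except ValueError:
        return None
    return port if 0 < port < 65536 else None

def describe(config):  # untyped glue; the checker mostly leaves this alone
    port = parse_port(config.get("port", "")) or 8080
    return f"listening on {port}"
```

A checker like mypy will verify every call into `parse_port` while skipping the unannotated `describe`, which is roughly the 90% of the gains for 10% of the effort.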
I’ve worked in exhaustively typed languages, and while it gives a sense of safety, it’s literally exhausting. And, anecdotally, I feel I dealt with more zero divides, subscript woops, and other runtime validation errors than I had in less typed languages.
Not that it matters. Soon, we’ll use semantically vague human language, to prompt, cajole, nudge LLM agents to produce programs that are (lossy) statistical approximations of what might be correct programs.
I think wmf's comment in this thread was absolutely correct and succinct, so I won't repeat, but I think it's worth noting that many (all?) of the Wayland devs were actually Xorg devs. Make of that what you will.
> the initial numbers are useless and are little more than throwing opinionated darts
You’re still concluding from ignorance. They are not. A better question would be to ask to whom they’re useful, and how.
Like, if a fire is burning in a neighborhood, every sighting is valuable. You don’t always need to wait for a comprehensive picture before being able to do anything.
> I judge them as early and either inaccurate and useless or politically motivated to push markets while there's no meaningful data to contradict them
That’s wrong. But it seems to be a common strain of confident error.
Maybe the solution is to make these numbers available only to academics, large corporations and Wall Street at a high price. If the public can’t handle them, and frankly, they aren’t super useful to laymen, gatekeeping the data could be the answer.
I am a developer turned (reluctantly) into management. I still keep my hands in code and work with the team on a handful of projects. We use GitHub Copilot on a daily basis and it has become a great tool that has improved our speed and quality. I have 20+ years of experience and see it as just another tool in the toolbox. Maybe I’m naive but I don’t feel threatened by it.
At least at my company the problem is the business hasn’t caught up. We can code faster, but our stakeholders can’t decide what they want us to build faster, or test faster, or grasp the new modalities LLMs make possible.
That’s where I want to go next: not just speeding up and increasing code quality but improving business analytics and reducing the amount of meetings I have to be in to get business problems understood and solved.
I totally get how much more convenient home viewing is, but there’s something about going out and watching something in a group that is special, like we do with opera or theater, sporting events or concerts.
Meanwhile the LP crowd was flipping sides like it was Ultima VIII (slight exaggeration). Why would it be critical for a new format to do away with multi-disc releases if the customer base has already grown accustomed to them?
Anecdotes __are__ data. How much weight you ascribe to them as being representative is a different question. But you cannot disqualify them as 'not data'. They are usually a leading indicator of what could potentially show up in these more robust datasets.
They might say that your job is to make the product "better", and they might even think they mean it, but I think in practice you'll find that their definition of "better" as it relates to products is pretty closely related to money, and further that they are the authorities on what makes the product "better" so you should shut up and do what they say. If you want to make the product actually better, you're going to have to defy them occasionally. That's not what you were hired for, that's just being a human with principles.
Diversifying away from NASDAQ-tracking index as a component of my investments will be extremely tax costly. Maybe more costly than the gavage (as the NASDAQ/SpaceX folks seem to be betting).
And most people won't even be informed that this is happening.
Large markets need to be run in the public interest...
In other "incorrect calendar" bugs, there's the Rockchip RK808 RTC, whose engineers thought that November had 31 days. It needs a Linux kernel patch to this day that translates between the Gregorian and Rockchip calendars (which gradually diverge over time).
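The shape of that translation can be sketched like this (a simplified model, not the actual kernel driver; the anchor date is made up). Since the hardware lives through a phantom "November 31" each year, its displayed date falls one more day behind the Gregorian calendar every year, and conversion means adding those phantom days back:

```python
from datetime import date, timedelta

# Hypothetical anchor: assume the RTC was set correctly on this date.
EPOCH = date(2016, 1, 1)

def phantom_days(hw: date) -> int:
    """Count the phantom 'November 31' entries the RTC has lived through
    since EPOCH (one per year whose Nov 30 has already passed).
    Caveat: a raw hardware reading of 'Nov 31' itself isn't representable
    as a date and would need special-casing."""
    return sum(
        1
        for y in range(EPOCH.year, hw.year + 1)
        if EPOCH <= date(y, 11, 30) < hw
    )

def rockchip_to_gregorian(hw: date) -> date:
    # The hardware lags real time by one day per phantom day; add them back.
    return hw + timedelta(days=phantom_days(hw))
```

This also shows why the two calendars "gradually diverge": the correction grows by one day every year the RTC keeps running.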
The sound was always tinny and in mono from the small speaker you hooked on the window, but it was fun and very cheap.