
thomasahle · 2026-06-08T05:32:09 1780896729

> The rate of fundamental, broad-based breakthroughs lifting all LLM applications has clearly slowed with many of the most impactful recent discoveries being in scaling, optimization, tuning and productization toward specific domains.

To me it definitely feels like it's still accelerating, with the most impactful recent discovery being RL training reasoning models (late '24, early '25).

There's an interesting article called "sigmoids won't save you" https://www.astralcodexten.com/p/the-sigmoids-wont-save-you which argues that (unless you have privileged information) you should always assume a process will continue about as long as it’s continued already. (Lindy's Law)

With that in mind the current disruption should last another 10-15 years (assuming it started in '10 or '17.)

thomasahle · 2026-05-16T12:23:43 1778934223

He used 600B tokens in 30 days.

I use more than 150B/month with just 15 codex accounts.

60 accounts is "just" $12,000/month. So Peter could "save" 100x by using monthly accounts.

Of course, he doesn't have to, as he works at OpenAI now.
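
The arithmetic behind those figures can be sketched roughly as follows. This is a minimal sketch using only the numbers quoted above; the per-account price and the API-equivalent bill are not stated in the comment and are merely implied by the $12,000 and "100x" figures, so treat them as derived assumptions.

```python
# Figures quoted in the comment above.
TOKENS_USED = 600_000_000_000        # "600B tokens in 30 days"
MY_TOKENS_PER_MONTH = 150_000_000_000  # "more than 150B/month" (a lower bound)
MY_ACCOUNTS = 15
QUOTED_MONTHLY_COST = 12_000         # "$12,000/month" for 60 accounts
CLAIMED_SAVINGS_FACTOR = 100         # "save 100x"

# Throughput per subscription account, from the commenter's own usage.
tokens_per_account = MY_TOKENS_PER_MONTH // MY_ACCOUNTS

# Accounts needed to cover the 600B tokens (ceiling division).
accounts_needed = -(-TOKENS_USED // tokens_per_account)

# Derived, not stated: price per account implied by the quoted total.
implied_price_per_account = QUOTED_MONTHLY_COST / accounts_needed

# Derived, not stated: the API-priced bill that a 100x saving would imply.
implied_api_bill = QUOTED_MONTHLY_COST * CLAIMED_SAVINGS_FACTOR

print(accounts_needed, implied_price_per_account, implied_api_bill)
```

Since 150B/month is given as a lower bound, 60 accounts is an upper bound on what would be needed.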

MadxX79 · 2026-05-16T12:26:17 1778934377

Sounds like a healthy industry, selling tokens at 1000x below cost.

wolttam · 2026-05-16T13:07:56 1778936876

API pricing isn’t cost, we don’t know what cost is.

impulser_ · 2026-05-16T13:31:01 1778938261

I would bet money Anthropic and OpenAI are actually profitable on inference. The problem is they have to spend large sums of money to train models that are essentially worthless after a few months.

AussieWog93 · 2026-05-16T23:47:27 1778975247

Dario explicitly stated this in an interview.

They make more money from inference than they do training the model, but then the next model gets so much more expensive to train so their annual figures have been in the red.

MadxX79 · 2026-05-17T06:23:57 1778999037

So, it's like if they were a pharma company that was barely profitable if you didn't take into account R&D costs?

Leynos · 2026-05-16T21:28:55 1778966935

A large part of the GPT-5.x model iteration has been about making training more affordable and token efficient.

SecretDreams · 2026-05-16T12:35:36 1778934936

It's to build a moat, of course!

Narrator: there was no moat

simianwords · 2026-05-16T14:27:46 1778941666

This performative concern over token costs and subsidisation comes from either ignorance or some latent ideology signalling.

xantronix · 2026-05-16T17:08:18 1778951298

One could say "that's a great point, we should take more direct ideological action to address this issue!", but expounding upon the finer details would likely get one banned here.

peteforde · 2026-05-16T13:04:05 1778936645

What I truly don't understand, as a daily heavy Opus 4.7 user, is how you can coherently prompt 15 different parallel conversations at the same time.

For me it's not even a "what the hell are you working on" so much as complete inability to understand how you can keep so many different processes working on distinct tasks. It simply doesn't map on to how I use these tools.

I spend most of my day writing extremely detailed prompts and that's how I'm able to get the sort of excellent results that confound skeptics. But I have to be honest with you: I don't think I can write (or think) fast enough to do two of these at a time, much less 15.

I definitely could not review what they are generating with any degree of confidence.

I'm really hoping you can explain what the heck your usage pattern actually looks like, because reading this makes me feel like I'm missing something.

thomasahle · 2026-05-16T13:13:58 1778937238

I'm trying to recreate the entire commercial EDA stack in open source. (RTL simulators, synthesis, formal proof tools, etc.)

Building compilers has a _lot_ of parallel tasks agents can work on.

Wish me luck..

narmiouh · 2026-05-16T13:22:47 1778937767

Good luck!

IshKebab · 2026-05-16T14:42:24 1778942544

Yeah good luck with that. I find SystemVerilog is probably the thing that AI is worst at, presumably because there's not that much training data out there, and pretty much everything about the commercial tools is paywalled.

stikit · 2026-05-16T14:37:32 1778942252

Those costs are not just tokens used for prompting. They also include agent loops, etc.

ianm218 · 2026-05-16T12:28:03 1778934483

What do you do with all those accounts?

arkadiytehgraet · 2026-05-16T14:17:16 1778941036

Probably trying to fix their broken personal website, where half of the links don't work at all.

thomasahle · 2026-05-16T20:39:08 1778963948

Is my website broken?

arkadiytehgraet · 2026-05-16T23:53:35 1778975615

Ask your 15 codex accounts agents, surely they will help you with that.

ianm218 · 2026-05-18T14:04:33 1779113073

Your website seemed fine to me. I didn't try every link, though.

thomasahle · 2026-05-09T06:34:42 1778308482

I'm currently choosing between the right formalization for a big hardware project.

I'm choosing between SVA, TLA+ and Lean, with the first being more domain-specific and the last more general.

Do you think we'll move towards "Lean for everything" or do domain specific formalisms still make sense?

kown7 · 2026-05-09T11:33:36 1778326416

Have you considered P? It feels like a good abstraction for engineers as it's "proper" code.

https://github.com/p-org/P

NooneAtAll3 · 2026-05-09T08:36:55 1778315815

what's SVA?

IshKebab · 2026-05-09T08:52:34 1778316754

SystemVerilog Assertions. Hardware designs (silicon ASICs, and often FPGAs too) are written in a language called SystemVerilog. It has a feature called "concurrent assertions" which is usually just called SVA.

These are sort of temporal regexes, e.g. you can write

  assert property($fell(rst) |-> foo == 1 ##[1:20] foo == 0)

Which means if the rst signal fell (changed to 0) then foo must be 1 and 1-20 cycles later it must be 0.

The nice thing about them is that there are a few commercial tools that can formally verify them. They're super expensive (~$100k/year for one license), but fairly widely used because they work really well.

It's probably the most successful application of formal verification because it doesn't require much expertise to use, unlike software formal verification, which pretty much immediately requires you to become an expert on loop invariants, termination measures, Hoare triples etc. At least that has been my experience.
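
To make the "temporal regex" reading of the assertion above concrete, here is a minimal Python sketch that checks the same property over a finite, cycle-by-cycle trace. It is only an illustration, not how a formal tool works: the function names are hypothetical, signals are plain 0/1 lists with one entry per clock cycle, `$fell` is treated as false on the first cycle, a failure is reported at the cycle where the attempt started, and an attempt whose 1-20 cycle window runs past the end of the trace is treated as still pending rather than failed.

```python
from typing import List, Sequence


def fell(sig: Sequence[int], t: int) -> bool:
    """Approximate $fell: the signal was 1 on the previous cycle and is 0 now."""
    return t > 0 and sig[t - 1] == 1 and sig[t] == 0


def check_property(rst: Sequence[int], foo: Sequence[int],
                   lo: int = 1, hi: int = 20) -> List[int]:
    """Check `$fell(rst) |-> foo == 1 ##[lo:hi] foo == 0` on a finite trace.

    Returns the start cycles of all attempts that definitely fail.
    """
    assert len(rst) == len(foo), "one sample per cycle for both signals"
    n = len(rst)
    failures = []
    for t in range(n):
        if not fell(rst, t):
            continue  # antecedent does not match: nothing to check
        if foo[t] != 1:
            failures.append(t)  # foo must be 1 in the same cycle rst fell
            continue
        # foo must be 0 at some cycle in [t + lo, t + hi].
        if any(u < n and foo[u] == 0 for u in range(t + lo, t + hi + 1)):
            continue
        if t + hi < n:
            failures.append(t)  # whole window observed, foo never went to 0
        # Otherwise the trace ended inside the window: treated as pending.
    return failures


# rst falls at cycle 2, foo is 1 there and drops to 0 two cycles later.
print(check_property([1, 1, 0, 0, 0], [0, 0, 1, 1, 0]))
```

A simulation-style check like this only covers the traces you feed it; the commercial formal tools mentioned above instead try to prove the property for all reachable behaviours of the design.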

thomasahle · 2026-05-05T20:51:32 1778014292

The human savant will remember where they read it and give you credit. It might lead more people to read your work, and ultimately you make money.

The AI won't even know where the page of text it's seeing came from, and people will avoid your book as they can just ask the AI. So you make less money. (Talking about specialized technical books here.)

qarl · 2026-05-05T21:02:03 1778014923

Not necessarily.

thomasahle · 2026-04-20T21:10:40 1776719440

Does it run on Nvidia or Huawei?

thomasahle · 2026-04-20T16:54:45 1776704085

> In 1983 David DeWitt (https://en.wikipedia.org/wiki/David_DeWitt) published benchmarking results showing poor performance for Oracle databases. Larry Ellison wasn't happy with the results and it's said that he tried to have DeWitt fired.

> Given how difficult it is to fire professors when there's actual misconduct, the probability of Ellison successfully getting someone fired for doing legitimate research in their field was pretty much zero. It's also said that, after DeWitt's non-firing, Larry banned Oracle from hiring Wisconsin grads and Oracle added a term to their EULA forbidding the publication of benchmarks. Over the years, many major commercial database vendors added a license clause that made benchmarking their database illegal.

See also: https://web.archive.org/web/20160719145221/http://sqlmag.com...

thomasahle · 2026-04-19T10:11:38 1776593498

This is crazy car-centric legislation.

Now, instead of letting car owners pay for the public space they use (street parking), you are forcing anyone without a car to waste their own private space, in case somebody wants to park there.

sva_ · 2026-04-19T13:48:40 1776606520

I can't imagine that you have to let someone park on your private property anywhere.

jameshart · 2026-04-19T14:45:46 1776609946

No, that is not the point.

The subtle difference is between American parking minimums imposed on property owners - “you must reserve space on your private property for this many cars whether you own them or not” vs Japanese parking requirements imposed on car owners - “you must reserve space on some private property for your car if you want to own it”

thomasahle · 2026-04-16T14:59:06 1776351546

Or, you know, they will have improved the safeguards.

poszlem · 2026-04-16T15:36:46 1776353806

Sure thing.

thomasahle · 2026-04-05T07:05:23 1775372723

Good musicians care about music theory / “first principles” as much as good writers care about language theory / grammar.

thomasahle · 2026-03-31T21:14:44 1774991684

I don't know anyone using these models every day who thinks they are hitting a ceiling.

If anything there's a plateau between each model release.

torben-friis · 2026-03-31T21:40:44 1774993244

I'm seeing diminishing returns, though in fairness we have no idea yet how to integrate these models properly with existing good practices and principles. I suspect improvement is going to come mainly from improved tool usage rather than more impressive models.