Hacker News | dtnewman's comments

1) This article doesn't really establish that this is due to AI. It cites a Reuters article, which in turn cites an internal memo saying that they need to be focused, with AI being an important initiative. So the title is a bit misleading.

2) A lot of comments here are talking about TurboTax. Remember that Intuit also has QuickBooks. Personally, I think the uses for AI in doing my taxes are limited. I don't want AI making judgement calls. However, for something like QuickBooks, I can imagine many uses for AI: for example, categorizing expenses, organizing receipts, noticing odd patterns, etc.


I can think of very few less appropriate places for LLMs than accounting.

They can't do basic arithmetic reliably, and will confidently tell you completely made-up figures when you ask for reports.

Meanwhile, existing accounting software is mature and does a great job at keeping numbers straight, and producing good reports, especially in the hands of a professional.


This company, Intuit, is going to train their internal models on your data and other customers' data too. In order to train useful models in aggregate, they will ingest your very specific PII and you will do "nothing" about it. Almost the entire set of tax-paying citizens is now going to have their financial-legal lives ingested into models you cannot see, cannot use yourself, and do not derive income from, IMHO.

PS: a small caveat to this personal-information doomerism is that very wealthy and capable people will have this happen to them and their companies too, and that will possibly set off some kind of hardball.


> Token-maxxing is a silly idea

Well, I guess you also think that getting a job promotion is a silly idea? /s


If you feel this way, you might like my new CLI tool, Burn, Baby, Burn (those tokens) (https://github.com/dtnewman/burn-baby-burn/tree/main).

Show HN here: https://news.ycombinator.com/item?id=48151287


This is delicious.

Cynic!

This is what inspired me to build my new CLI tool, Burn, Baby, Burn (https://github.com/dtnewman/burn-baby-burn/tree/main).

(If you are a VP at Amazon, yes, I'll consider acquisition offers. I'm also working on an enterprise version of this with additional features.)

Show HN here: https://news.ycombinator.com/item?id=48151287


Just sent it to some developers who could really benefit from this! Please let us know when you have Codex and Gemini versions ready to rumble.

Sorry, it will be a while. We're currently building out enterprise features like SSO/SAML support, role based burn access, and a carbon offset marketplace. As you can imagine, we're burning a lot of tokens to get these out, but actual productivity isn't up as much as you'd think.

How about a built-in AI assistant to answer my burning questions?

I want an in-browser Gemini version. For some reason my company doesn't count Gemini CLI use. I guess I'm supposed to copy code between my browser and my editor.


The only problem with this is that the outcome metrics are still Jira story points. Burning a huge number of tokens while not improving velocity is going to get you fired.

If we had a way of measuring velocity, we'd already be using that instead of tokens.

What do you mean? You get story points for free with Jira. That’s like the one metric every place uses.

Story points are unicorn dust that crumbles under any attempt at serious optimization. The fundamental problem is that SP is not an objectively defined metric. If we come under serious pressure to improve velocity measured by SP, there's nothing to stop that pressure from trickling down into the SP estimation/measurement itself. SP works fine as long as you don't look too closely at it.

Yeah, everything is subjective unicorn dust, but there are ways of making sure story points have some semblance of accuracy. Either way, it’s probably the best metric we have, at least for an established team.

We had a way of measuring velocity, but who cares about estimating stories when we could be spinning up more agents? Burn a bunch of tokens and those stories will be DONE before you could even find your planning poker cards!

I've lived through a bunch of initiatives about improving planning and estimation. None of them turned into a stable process that worked for anyone. I don't know if I can extrapolate from that, but it gives me an inclination that no one really trusts anything that comes out of task estimation. Which would be why we're looking for more objective metrics like token burn rate. No room for argument - tokens are tokens!

A token is approximately one word generated by an LLM; a few dozen tokens gets you a line of code... so measuring token burn rate is the same as counting lines of code. All it took was a change of name, and we're back to the most primitive metric we ever had for measuring programmer productivity.
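To make that equivalence concrete, here is a rough sketch. It assumes a fixed ratio of 30 tokens per line, which is only a stand-in for "a few dozen"; the constant, the function name, and the sample figures are all hypothetical. Under that assumption, ranking people by tokens burned gives exactly the same order as ranking them by estimated lines of code.

```python
# Assumption: a fixed tokens-per-line ratio standing in for "a few dozen tokens".
TOKENS_PER_LINE = 30


def estimated_lines(tokens_burned: int, tokens_per_line: int = TOKENS_PER_LINE) -> float:
    """Convert a token count into an estimated line count at a fixed ratio."""
    return tokens_burned / tokens_per_line


def rank(scores: dict) -> list:
    """Return names ordered from highest score to lowest."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical leaderboard figures, for illustration only.
burn = {"alice": 900_000, "bob": 300_000, "carol": 1_500_000}
lines = {name: estimated_lines(tokens) for name, tokens in burn.items()}

# Dividing by a positive constant preserves order, so both leaderboards agree.
same_ranking = rank(burn) == rank(lines)
```

With a constant ratio the two metrics carry the same information, so anything that games one games the other.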

I don't think I can take anything from management in tech seriously again after tokenmaxxing.


"When a measure becomes a target, it ceases to be a good measure"

https://en.wikipedia.org/wiki/Goodhart%27s_law


This but unironically.

The speed of generating code is now faster than the time it takes to plan and estimate how long it will take to generate the code.


Generating more code faster might be useful, but there have to be some other constraints on it.

Using this paradigm, we can achieve unlimited bugs sooner than ever before.

1. To fix a bug, always add code, never remove.
2. Whenever you fix one bug, always introduce at least two new ones.


This sounds like government software, in my experience.

I was brought on to one particular team to do cleanup and all I was given was band-aids to layer on top.

Odds are good your local or state government is running this software right now for managing its courtrooms.


Next feature is creating stories. Double burn.

Any plans for a distributed deployment via Cloudflare Workers? I'm not sure this thing is powerful enough for my use case.

Yeah, lots of enterprise features in the works, but first I need to raise money at a $1B+ valuation (this might seem high for a project that started 4 hours ago, but it's actually very low for the project that will soon be the #1 consumer of tokens on the planet).

Recommend you extrapolate your value based on the token spend rates of FAANG; if you can spend 10x FAANG, then you should get at least 10x the valuation. Godspeed.

You got four hours of Claude Code usage without hitting a rate limit???

Brilliant

Like attack ships off the shoulder of Orion, the only way to burn!

This is hilarious and utter genius.

Won't the company audit the requests to AI and see you're sending a bunch of BS?

If only Scott Adams were alive to write Dilbert comics about this.

> Won't the company audit the requests to AI and see you're sending a bunch of BS?

Shouldn't be too hard to game. Version 2 uses the M365 MCP server to load up your email and iterate over all the messages, summarizing them over and over.


Do you have an example of any company ever doing this?

No, but if you run this injudiciously and burn 100x what the next guy does with no output, maybe they would want to know how.

And you'll be such a productive engineer!

This article inspired me to build "Burn, baby burn", a CLI tool for burning tokens. See:

- Show HN: https://news.ycombinator.com/item?id=48151287

- github: https://github.com/dtnewman/burn-baby-burn


Creator here.

Are you a startup looking to show investors your hefty AI budget? Are you an engineer trying to top the AI token leaderboard? Wanna show your friends how "AI Native" you are? Look no further, I created this for you.


Hey you invalidated my start-up idea! ;)

https://robintoken.dev/


If the aim is to burn tokens in a plausible manner, why not go one step further and, without getting users' accounts banned, use those tokens as a distributed distillation effort?

Publish all the generated content into a data source, think open crawl, so open sourced models (open training not just weights) could use it as input.


is this satire?


Open Source isn't going anywhere. Open Contribution might be on the way out. I built an open source command line tool (https://github.com/dtnewman/zev) that went very minorly viral for a few days last year.

What I found in the following week is a pattern of:

1) People reaching out with feature requests (useful)
2) People submitting minor patches that take up a few lines of code (useful)
3) People submitting larger PRs that were mostly garbage

#1 above isn't going anywhere. #2 is helpful, especially since these are easy to check over. For #3, MOST of what people submitted wasn't AI slop per se, but just wasn't well thought out, or was of poor quality. Or it was a feature that I just didn't want in the product. In most cases, I'd rather have a #1 and just implement it myself in the way that I want the code organized, rather than someone submitting a PR with poorly written code. What I found is that when I engaged with people in this group, I'd see them post on LinkedIn or X the next day bragging about how they contributed to a cool new open-source project. For me, the maintainer, it was just annoying, and I wasn't putting this project out there to gain the opportunity to mentor junior devs.

In general, I like the SQLite philosophy of "we are open source, not open contribution". They are very explicit about this, but it's important for anyone putting out an open source project to know that you have ZERO obligation to accept any code or feature requests. None.


This comment really hit me - I have a few things I've worked on but never released, and I didn't even realize it was basically because I don't want to deal with all of that extra stuff. Maybe I'll release them with this philosophy.


> "show me how this handles 10K records"

Presumably the issue here is that you have customers with >10k records, but can't show them. Why not take their data and anonymize it, then put it under a fake customer?
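A minimal sketch of what that anonymization step could look like, assuming the records are plain dicts with `customer`, `email`, and `amount` fields (the field names, the salt, and the 10% jitter are all hypothetical, not anything from the product being discussed). Identifiers are swapped for stable salted-hash pseudonyms so the relationships between rows survive, and amounts are nudged so exact figures don't leak.

```python
import hashlib
import random


def pseudonym(value: str, salt: str) -> str:
    """Map a real identifier to a stable fake one using a salted SHA-256 hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "cust_" + digest[:10]


def anonymize(records, salt, seed=0):
    """Return copies of the records with identifiers replaced and amounts jittered."""
    rng = random.Random(seed)  # seeded so the demo data set is reproducible
    anonymized = []
    for rec in records:
        anonymized.append({
            # Same input always maps to the same pseudonym, so grouping still works.
            "customer": pseudonym(rec["customer"], salt),
            "email": pseudonym(rec["email"], salt) + "@example.com",
            # Jitter each amount by up to 10% in either direction.
            "amount": round(rec["amount"] * rng.uniform(0.9, 1.1), 2),
        })
    return anonymized
```

Salted hashing alone is not a privacy guarantee; rare values and free-text fields can still identify a customer, so real data would need a proper review before it goes into a demo.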

> "what does error handling look like with real load?"

I find it hard to believe that anyone is making an investment decision off of this question, but how would you demo this with a real customer anyway? Intentionally introduce a bug so that you can show them how errors are handled? Wouldn't the best course of action here be to just describe the error handling?


many many people have had an idea like Clawdbot.

The difference is that the execution resonates with people + great marketing


Indeed, I think the only "new" thing about clawdbot is that it is using discord/telegram/etc as the interface? Which isn't really new, but seems to be what people really like


I think a big part of it is timing. Claude Opus 4.5 is really good at running agentic loops, and Clawdbot happened to be the easiest thing to install on your own machine to experience that in a semi-convenient interface.

