> We show that the appearance of LLM-based writing assistants has had an unprecedented impact in the scientific literature, surpassing the effect of major world events such as the Covid pandemic.
I'm not quite sure this follows. At the very least, I think they should also consider the possibility of social contagion: if some of your colleagues start using a new word in work-related writing, you usually pick that up. The spread of "delve" was certainly bootstrapped by ChatGPT, but I'm not sure that the use of LLMs is the only possible explanation for its growing popularity.
Even in the pre-ChatGPT days, it was common for a new term to come out of nowhere and then spread like wildfire in formal writing. "Utilize" for "use", etc.
I have friends in academia all around Europe and in different fields (from Spain to Denmark, Slovakia, Sweden, and Germany; in economics, solid-state physics, &c.), and I can 100% guarantee you're wrong: ChatGPT is now the default tool for the vast majority of people writing these papers.
It's not people picking up new words. You have to understand that academia is mostly about shitting out as many papers as you can and making them as verbose as possible; it's the perfect use case for LLMs. This field was rotten before, ChatGPT just makes it more obvious to the non-academia crowd.
For them, not using ChatGPT would be like sticking to sailboats while the world moved to steam engines.
> academia is mostly about shitting out as many papers as you can
This is the classic case of publish-or-perish, since publication metrics are unfortunately ubiquitous in all aspects of academic life. Measuring true impact is the goal, but it is a hard problem to solve.
> and make them as verbose as possible
This is just laughably wrong. The page limits are always too low to fit all the information one wants to include, so padding a text is just not of interest at all.
With that said, I wouldn't be surprised if people use ChatGPT a lot. If for no other reason, most academics are writing in a language (English) that is not their native language, and that is hard to do. Anything that makes the process of communicating one's results easier and more efficient is a good thing. Of course, it can also be used to create incomprehensible word salads, but I've seen a lot of those in pre-LLM times as well.
Making them as verbose as possible? My experience from grad school and with friends who are now faculty is that literally everybody's first draft is above the page limit and content needs to be cut.
It sounds like you have anecdotal evidence from a bad side of academia. I also have many friends in academia across NA and Asia and have an impression closer to parent's.
> chatgpt is now the default tool for the vast majority of people writing these papers.
I'm in a similar sort of circle and I can say that this, though I've not measured rigorously, does strongly square with my own anecdotal experience, and it's especially prevalent with people whose first language is not English but have no choice but to publish (frequently) and apply for grants in English.
So many of the writing assistant tools customarily used for first-pass proofreading have gone straight into full LLM integration and no longer simply check grammar and basic "elements of style" issues.
Also, professional (human) proofreaders are fantastically expensive for the very limited amount of assistance they provide. An extra couple thousand dollars (source for this number: a proofreader recommended by Oxford U Pub) for someone to fix some minor semantic redundancies in the wording of your 35-page book chapter is financially unreasonable for a lot of people in academia.
Tangential, but I remember working in a school computer lab hours before my big Paradise Lost essay was due and putting an extra newline between all paragraphs to beef up the page length.
There are a few language shifts that happened in the past few years. The use of the singular "they", "there's a few" instead of "there are a few", the "like" filler word, etc. It's not that unusual.
> We will note that they has been in consistent use as a singular pronoun since the late 1300s; that the development of singular they mirrors the development of the singular you from the plural you, yet we don’t complain that singular you is ungrammatical; and that regardless of what detractors say, nearly everyone uses the singular they in casual conversation and often in formal writing.
And some of the post-LLM cliches are almost certainly purely human in their memetic spread — "moat" and "stochastic parrot" in particular come to mind.
delves
crucial
potential
these
significant
important
They're not the first to make this observation. Others have picked up that LLMs like the word "delves".
LLMs are trained on texts which contain much marketing material.
So they tend to use some marketing words when generating pseudo-academic content.
No surprise there. I'm surprised it's not worse.
What happens if you use a prompt containing "Write in a style that maximizes marketing impact"?
('You can't always use "Free", but you can always use "New"' - from a book on copywriting.)
I use at least four of those pretty heavily, and I suspect that I'm in the top quartile in terms of using at least three of them. I guess I'm going to be queued for deactivation now.
If you only note your significant thoughts, marking them as such is okay. That filter is what AIs lack.
On the upside, the advent of AI has made both vendor selection and project winning simpler. For the former, I can filter within seconds. For the latter, showing examples of competitors’ AI-derived speech is enough to get them eliminated.
The only one here that actually stands out from usage is "delves". Every other word on this list is common usage among anyone who has a decent vocabulary and a literary mind.
But I guess I'll just get flagged for being a GPT now.
One confounding factor here is the proliferation of autocorrect and grammar “advisors” in popular apps like gmail etc. One algorithm tweak could change a lot of writing at that scale.
While the word frequency stats are damning, there doesn’t seem to be any evidence presented that directly ties the changes to LLMs specifically.
I think they address this to some degree by checking different year pairings (end of page 2), where the only excess usage they found was of words related to current events (ebola, coronavirus, etc.) and even then not to the same degree as the 2022-24 pair.
It would be interesting to analyze how well a language model is able to predict each abstract. In theory if the text was largely written by a model then a similar model might be able to predict it more accurately than it would a human-written abstract. (Of course the variety of models and frequency at which they're updated makes this more difficult.)
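As a toy illustration of the idea (using a simple character-bigram model as a stand-in for an actual LLM; the function names and example texts here are made up for this sketch), one could compare how cheaply a model trained on one style predicts a new text:

```python
from collections import Counter
import math

def train_bigram(text, alpha=1.0):
    """Return an add-alpha-smoothed character-bigram log-probability function."""
    pairs = Counter(zip(text, text[1:]))   # counts of adjacent character pairs
    firsts = Counter(text[:-1])            # counts of first characters of pairs
    vocab_size = len(set(text))
    def logprob(a, b):
        return math.log((pairs[(a, b)] + alpha) /
                        (firsts[a] + alpha * vocab_size))
    return logprob

def cross_entropy(logprob, text):
    """Average negative log-probability per bigram; lower = better predicted."""
    lps = [logprob(a, b) for a, b in zip(text, text[1:])]
    return -sum(lps) / len(lps)

# A model fit to "LLM-flavored" phrasing should predict stylistically
# similar text more cheaply than dissimilar text.
model = train_bigram("we delve into the crucial and significant potential "
                     "of these important findings " * 5)
in_style = cross_entropy(model, "we delve into the significant potential")
off_style = cross_entropy(model, "zqxv jk wq zzv qjx")
```

With a real detector one would swap the bigram model for an LLM's token log-probabilities, but the comparison logic is the same: abstracts an LLM finds unusually easy to predict are candidates for being LLM-written.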
> We study vocabulary changes in 14 million PubMed abstracts from 2010-2024, and show how the appearance of LLMs led to an abrupt increase in the frequency of certain style words.
"Delving".. Sounds like the authors might have used an LLM while writing this paper as well.
I think it's not an extraordinarily rare word in the US, but it does feel to me more at home in a blog post or business strategy-ish document than a PubMed abstract. People are making a big deal of it just because the signal is so strong (25x more common now than in 2022).
Whenever I see a comment that talks about programming with the uppercased "S", as in "TypeScript", "JavaScript", "ClojureScript", my immediate guess is that the person used an LLM, at least for spellchecking and improving the original text. I predict that soon, LLMs will learn to make tiny, insignificant mistakes to sound "more human" in writing.
I still don't know whether there was an actual increase in comments that start with "You're correct". Maybe it's just me noticing it more after Chatgpt came to prominence with its subservient ways.
This just makes me think how now, more than ever, it really pays to develop your own distinctive writing voice. It’s not just a way to make yourself stand out, but given how much the way you write can influence your thought process, I worry that all the students ChatGPTing their way through language arts classes will be ultimately more susceptible to groupthink and online influence campaigns.
Developing such a voice really requires you to read other writing voices minimally lest you be influenced by other voices. I think outside of news, I would be perfectly happy reading only pre-2022 publications. I just don't know how sustainable this is.
I disagree respectfully. Before I write anything substantial, I re-read the few books that I really like to let me subconsciously absorb their elements. If I read modern journalism right before I write, whatever I write will be just like modern journalism: bland and boring. At least for me, there is a very large but temporally proximate influence between what I read and what I write.
Reading selectively makes sense, but I’d argue there can still be value in critically reading content the style of which you’d rather not emulate directly. For one, you may not always have a choice if you need to research something specific and there’s no high quality source available. You can also learn to pick out distinguishing characteristics of modern writing in order to subvert popular convention or adjust (not abandon) your style to be more accessible for the target audience.
I don’t agree with this. Quick assessment: what makes someone more susceptible to groupthink? To propaganda? Is it the way they write? That doesn’t sound right. It is not an LLM/ChatGPT-borne ailment. So I would not paint these tools as such a boogeyman.
I don't see why so many people are complaining about this. Not everyone has mastered English, unfortunately. I am especially weak at writing papers and, to be honest, find it taxing. I love research, but after getting results, turning them into a paper is not fun. I edit almost everything important I write, like emails and papers, with LLMs, because even though the content is good, my writing feels very bland and lacks transitions. I believe many people do this, and it actually helps you learn over time. However, what you learn is to write like LLMs, since we are basically being supervised by the LLM.
It’s possible this is caused by the editors rather than the authors.
An old partner edited papers for a large publisher - largely written by non-native speakers and already heavily machine translated - and would almost always use ChatGPT for the first pass when extensive changes were needed.
She was paid by the word and also had a pretty intense daily minimum quota so it was practically required to get enough done to earn a liveable wage and avoid being replaced by another remote “contractor”.
This seems strange… this is an old partner who had the job long enough to know that the only way to hit the quota was to use ChatGPT, which came out less than two years ago?
It seems strange that something that experimental became essential that quickly.
The company is a bit predatory in that they typically employ editors from poorer countries, so her difficulty was more about earning enough from it than meeting the quota. But the quota was strict enough that you would very quickly lose the position if you kept missing it. I think the situation before ChatGPT was that they employed people who worked very long hours to not earn very much. Apparently turnover was quite high.
No kidding... but it did. I have an aunt who's an editor, and she was basically put out of business by chatgpt. She still gets paid by the same people for the same work, but her rate has been slashed, and since she has no interest in using chatgpt, she'll never make it up in volume.
I am an associate editor of a journal and I have suggested that the journal strongly encourage authors to pass their papers through an LLM before submitting if they think it might need it. A large fraction of the submissions we receive have terrible grammar and unnatural phrasing. The only options for editors are to reject or send out for review. Rejecting because of the grammar seems overly harsh, but I know that it is a lot more laborious to review a paper with bad grammar. The easiest fix seems to be to try to ensure that the grammar is reasonable before papers are submitted.
Great article. One of the papers it cites is https://arxiv.org/abs/2403.07183, which is also great and looks at the issue of LLM usage to write peer reviews.
It’s an issue I’ve noticed personally, as I’m seeing an increasing number of reviews that lack substance and are almost entirely made of filler content. Here’s an excerpt from a particularly egregious recent example I ran into, which had this to say on the subject of meaningful comparison to recent work:
> Additionally, while the bibliography appears to be comprehensive, there could be some minor improvements, such as including more recent or relevant references if applicable.
The whole review was written like this, with no specific suggestions for improvement, just vague “if applicable” filler. Infuriating.
Funny. It used to be if you received that sort of response, you might imagine the author being pressed for time and giving a sort of prewritten/canned copy response.
I guess LLMs have removed some of the tedium from the process while making it more tedious for the recipient. That's annoying.
Nothing wrong with it. Citing each other introduces more bias than ChatGPT anyway. Do you expect me to write the same introduction to the same subject for the 17th time? Large parts of any paper are redundant.
I couldn't agree more! What am I supposed to do with the related work section? Especially after reading many similar works in the field, it is very hard not to be influenced by what you read, but you have to make sure not to say the same thing.
Even pre-LLMs, I’d basically rewrite papers using simpler language when I was a co-author. The baseline was never very good, I’m afraid. For some reason researchers love using ten-dollar words even though their overuse makes the paper read pretentious and obfuscates the meaning. Folks, please read the following essay by Orwell, and then read it again and again until it sinks: https://www.orwellfoundation.com/the-orwell-foundation/orwel...
This apparent language shift is mainly due to LLMs being trained on all English writing, not just modern American writing tied to post-Hemingway ideals of brevity. Much of the language considered odd is standard British English, like the singular they. Also, the use of a wider vocabulary isn't considered bad writing outside the US, where specificity is valued over simplicity.
The figures are about the change rather than the absolute value, so it's not too terrible, but even given that, they could have been normalised by being relative to year 1. A quite warm mess, perhaps?
It's already possible. Upload your text and one you want it to look like, then instruct ChatGPT-4o to rewrite the first text so it follows the word structure, form, and layout of the second one.
Then run the whole lot through Grammarly in academic mode to get rid of flowery words and tortuous structures.
Too many academic papers try to impress with complex language rather than explaining what needs to be explained in a pithy and succinct manner.
Given how many people can't seem to distinguish between LLM output and an actual human, and how many actual humans I've interacted with who I only later realised were actually human, I think it won't be long before humans who can distinguish and write distinctively are a minority.
One thing that is interesting is that they allow you to add an additional prompt, because sometimes you don't want an email to actually sound like you; you want it to sound like the writer you want to be (leveling up, so to speak).
> while brevity and clarity hold significant value in various contexts, they should not be imposed as universal standards at the expense of depth, nuance, and emotional richness. By delving into diverse communication styles and purposes, we enhance our ability to understand and connect with one another, thereby enriching both our personal and intellectual lives.
We live in an era where academics use gpt to spew out papers, and have an audience (not all) who use gpt to summarize and extract the meaningful content from it.
We truly live in the information age.
On a side note, do people still use ChatGPT to fill out their papers? I found Claude to be way better at spitting out more content. In my experience, ChatGPT has been in the middle, while Gemini is the worst; it even cuts off in the middle of a sentence.
LLMs were used in the writing of 1/3 of scientific papers. On the one hand scary; on the other, it seems people find them useful. Maybe this AI thing is not like crypto after all...
In parts of the world, publishing research papers is required for advancement. The scientific value of the papers is irrelevant.
So yes - AI is seen as useful for people gaming a system for personal advancement at the expense of global scientific progress and the betterment of humanity.
While it would never be published, I imagine if they broke it down by author surname you’d see most of this is non English speakers using a juiced up grammar check.
I see no evidence actual findings are changing.
Lots of people have good science to contribute and suck at writing English. Making findings easier to digest isn’t bad per se.
Eh. It's a tool, just like a calculator, a computer, or a drug. Just like everything else with potency, you can use it well or poorly.
I got some help from ChatGPT while writing a recent paper. At one point I couldn't find a straightforward way to express myself. I was stuck, and every subsequent attempt at writing the same paragraph came out worse than the last. Eventually I gave my draft to ChatGPT and asked it to come up with a few suggestions on how to express my ideas more clearly. That really helped. I adapted some of its writing and ended up with a better final result.
I'm not sure if anything ChatGPT wrote is in the final paper. But I really appreciated the LLM's assistance in helping me express myself.
I'm sure there are many more egregious examples where ChatGPT wholesale wrote large sections of papers.
But it's really not the tool that's at fault. It's how the tool is being used. And we simply don't have social norms around that yet, because figuring out norms takes time.
The calculator was the same. When is a calculator acceptable in a classroom? In an exam? Our teachers at the time justified "no calculator" policies by saying "You won't have a calculator in your pocket when you're going through life!". And, well, that turned out to be very wrong.
So you plagiarized for your paper? I mean, people seem to be in agreement that studios using visual AI generated art that draws from existing works is plagiarism, so how did you not commit academic fraud? Did you consider that encountering challenges with expressing yourself and overcoming them is the pedagogical purpose for papers like that?
I am assuming your paper was for a class, given your comparison to calculators in a classroom. If not, then my point does not apply.
>Our teachers at the time justified "no calculator" policies by saying "You won't have a calculator in your pocket when you're going through life!". And, well, that turned out to be very wrong.
The only time we weren't allowed to use calculators was for things like arithmetic and fraction manipulation. It would indeed be absurd to have to consult a calculator for every numerical step you have to take in math work for subsequent years, so I think your teachers were right.
Well for the studios the art is the product. For researchers the ideas and data are the product and the writing is packaging. It's not a perfect analogy but it gets across why I personally don't get up in arms about this. Provided the findings aren't being changed does it really matter? Sometimes chatgpt is a useful editor for flow in the same way you'd ask a colleague or spouse.
And if it levels the playing field between native English speakers and foreigners, that is good in my book.
I don't think I did. And no, it wasn't for a class. Please be more careful before throwing out accusations of academic misconduct. That is a very heavy term.
Anyway, I don't think what I did is any more "academic misconduct" than using Grammarly is. They're both uses of LLMs, sure. But the uses are different, and we need to start differentiating them. If you think AI-generated art is plagiarism, would you therefore conclude that Photoshop's content-aware fill is plagiarism, since it uses the same AI models? Or Apple's smart selection tool?
> The only time we weren't allowed to use calculators was for things like arithmetic and fraction manipulation. [...] so I think your teachers were right.
(Emphasis mine)
Seems pretty strange to assume my teachers were right given you have no idea what country I went to school in or what their policy was on calculators.
I’m certainly not falsifying any findings. I just needed some help finding English words to help make the math easier to understand.
I’m sorry for this, but if some members of this community are ready and willing to make claims of academic misconduct over the idea of using an llm as a writing assistant, I don’t think I trust this community enough to go into more details in this thread.
>I don't think I did. And no, it wasn't for a class. Please be more considered before throwing out accusations of academic misconduct.
Please throw all of my comment into whatever LLM you used to respond, as I pointed out and explained the academic contextual assumption I made. If you are focusing on my phrase "for a class" here, and this paper was actually for an academic conference or journal, then yes! you have committed academic misconduct.
>If you think AI-generated art is plagiarism, would you therefore conclude that Photoshop's content-aware fill is plagiarism, since it uses the same AI models?
Absolutely, if you are talking about some of the new ones that use stable diffusion and other visual generative models. It might be fine if you're doing it in a commercial setting and those models' licenses authorize you to do so, but using that for a situation in which the art is assumed to be yours (an art class project or for a commission in which the contract states you are producing it) would be fraud.
>Seems pretty strange to assume my teachers were right given you have no idea what country I went to school in or what their policy was on calculators.
Given that you didn't really dispute my point and rather attacked the grounds for it, and that you invented a false premise of your own (that calculator policies are a per-country thing ???), I'm gonna assume your Australian teachers' policy was applicable to my point.
> Please throw all of my comment into whatever LLM you used to respond,
Rude. Rude and wrong.
> If [...] this paper was actually for an academic conference or journal, then yes! you have committed academic misconduct.
This is a surprising perspective, given that in many ways I had the same conversation with ChatGPT about my work that I might have had with a colleague or a spouse. The biggest difference is that ChatGPT's suggested improvements weren't very good, and none of its suggestions ended up in the final text. Is it the product of plagiarism?
I absolutely believe that ChatGPT is being overused in academia, just like Stack Overflow is in professional programming. But I don't think we have any consensus as to where the line should be between "acceptable use" and plagiarism. To me, banning the tool outright would be as silly as banning Google search, banning calculators outright in schools, or banning artists from using Photoshop. (And yes, people made the same argument about Photoshop when it was invented: that art made with it wasn't really art because the computer did all the heavy lifting.)
> Given that you didn't really dispute my point and rather attacked the grounds for it, and that you invented a false premise of your own (that calculator policies are a per-country thing ???)
Of course. The grounds for your argument is where it unraveled. You claimed that my teachers had the right calculator policy, but you didn't (and still don't) have enough information to make that claim because you don't know the policy my teachers had.
I would have told you if you asked, but you didn't. You simply went on the attack assuming you had all the facts. And now in this followup message, you have doubled down on your mistake.
For reference, of course the calculator policy at my Australian school in the 90s was different from yours. We generally weren't allowed calculators on the school grounds at all. That started to change right at the end of high school, when they started allowing calculators in class and non-graphing calculators in examinations. (They checked as we walked in the door). But for most of my grade school experience, calculators were banned.
>This is a surprising perspective given that in many ways I had the same conversation with chatgpt about my work that I might have had with a colleague or a spouse.
Yes, it's possible to scoop academic colleagues and steal ideas from spouses. Plagiarism applies there too.
>But I don't think we have any consensus as to where the line should be between "acceptable use" and plagiarism.
Perhaps, but using it to generate ideas you claim are yours, let alone quoting it directly, are all well past that line. Unless you put ChatGPT in the author list of the paper, in which case, sure, fair play.
>For reference, of course the calculator policy at my Australian school in the 90s was different from yours. We generally weren't allowed calculators on the school grounds at all. That started to change right at the end of high school, when they started allowing calculators in class and non-graphing calculators in examinations. (They checked as we walked in the door). But for most of my grade school experience, calculators were banned.
Unsurprisingly, this doesn't change or contradict my point at all. Any math you're doing in high school should not require a calculator, or at the very least can be taught without any calculators used.
>Rude. Rude and wrong.
I'm speaking with a tone appropriate for the nature of my company here (an academic fraud and one or two people who support him)
> Yes, it's possible to scoop academic colleagues and steal ideas from spouses. Plagiarism applies there too.
I like the idea that lots of novel scientific ideas were really invented by someone's spouse who doesn't work in the field. One day they happened to look over their partner's shoulder on a whim and said "Oh, honey, surely you mean this?" - and thus the hard scientific problem of the day was solved! Maybe that's how software gets written, too? Perhaps we should add people's spouses to the contributor lists for software? I'm sure many engineers talk to their husbands and wives about their work over dinner. It is simply unacceptable how few spouses end up in the contributors list!
I suspect what's really going on here is that we have different ideas of what constitutes "plagiarism". What is your working definition of that term? How much support would my spouse have to give me before I'm ethically required to add them as a coauthor? Would talking about the topic over dinner be enough? What if they read over my work and pointed out a spelling or grammar mistake? Should AI based code assistants be banned outright (since they are trained on opensource code)? Should code assistants be listed as contributors?
For context, I've worked at multiple universities in multiple countries. And I've spent several years teaching. That has involved telling my students where my line for plagiarism is. And getting students in trouble when they cross that line. It sounds like by your reckoning I've been doing it all wrong all this time. So please, enlighten us all. What is plagiarism? How much LLM assistance, exactly, is too much in an academic context? And, if you'll indulge my curiosity, how much experience do you have in universities?
(Sadly, I must also acknowledge that I talked to my partner about this comment before posting it. So many obviously stolen ideas - I'm a fraud down to my very bones.)