Hacker News
How to count every language in India (2018) (atlasobscura.com)
100 points by haltingproblem on Nov 1, 2020 | 124 comments


I come from a Sourashtra-speaking community. We’re a small group of people who migrated from Gujarat to the south in different waves during the Middle Ages, mostly because of war and trade.

Our language has been preserved through everyday speech and has no agreed-upon script, so it is mostly taught to children by their parents and relatives. Most of us know Tamil, Kannada, or Telugu, whichever is the main language where we live, and we also use it to publish local magazines.

Though the magazines are published in Hindi, Tamil, and Sourashtra scripts, interest is declining fast. Fewer people want to buy them and even fewer want to write for them. In my city, some are distributed for free, only to end up being used for packing pakodas and samosas. Another factor is that many in our community marry people from outside the community (I'm not against that), and their children can barely understand the language. Our language is not yet in the endangered category, but I see it becoming endangered in the coming decades. More people in our community are getting educated, becoming modern (by western standards), and want their children to communicate in English, and as a result most people have become middle-class to rich. The reason for our endangerment is partly modernisation.

I'm seeing the coming generations take less and less interest in growing the language, as my Grandpa rightly feared, and I worry it will be one of the languages forgotten by the end of this century.

https://en.m.wikipedia.org/wiki/Saurashtra_language


I have many friends who speak Sourashtra. It is always exciting and interesting to see how much they love their language and want to keep it in day-to-day interaction in the household. I think any language that is spoken in the kitchen and at the dining table will survive long. Sourashtra will too!


Your comment warms my heart. I’ll try my best to teach my nieces and nephews at least, hoping they carry the baton in the future.


The notion of an 'old language' is weird. What does that even mean? Why is Bo one of the oldest languages? Is Greek old? Is English old? Is Lithuanian old? Is Tamil old? Korean? Chinese (which one?)? Any language? If your answer is 'yes', then (a) please explain what you mean, and (b) rest assured that it is not the same language today as when it was 'young' (whatever that means) -- 'young X' is probably a different language than 'old X', but what does that mean for 'X'? I don't get it.

Such a weird notion. It seems to me that 'old' for a language is often used to imply that it is 'important' (culturally? presumably yes.), which feels, well, not racist, but maybe 'linguist' -- OK, that word is already occupied... Even fighting exists about which language is 'oldest' -- could you please make up the rules of the game before playing?

Anyway -- this is just me not understanding what people mean.


As the article says, pre-Neolithic. In case you are not aware, Andaman is also home to the sentinelese, a known isolated and uncontacted tribe who have preserved their way of life for 60,000 years of isolation.

Oldest language could be a distinct language that has preserved its uniqueness. Due to the ice age, European languages can not be the oldest.

The isolation of these tribes and the long human habitation in these regions could make the languages some of the oldest as well as primitive.


> the sentinelese, a known isolated and uncontacted tribe who have preserved their way of life for 60,000 years of isolation.

I don't think they have actually been isolated for 60,000 years though. If I understand correctly, they chose to isolate in the 1800's because the British were kidnapping people. Before that they had contact with other Andaman islanders at least. For example, apparently the Sentinelese may speak a language of the Ongan family[1], which would mean that their language most likely diverged sometime within the last millennium or two, I think.

[1] https://en.wikipedia.org/wiki/Ongan_languages


I hate that it is trendy nowadays to claim that everything is racist. It is just ignorance.


People reuse old words for concepts that don't have their own names. Racism is the most common word for "using pseudoscience to give one group higher status than another"


I think what people who talk about "old languages" use to determine a language's age is something like "what's the earliest date an ancestor of this language was used in writing" or "how long has this area been inhabited by populations genetically similar to modern speakers of this language".

Personally, I think most languages are old, except for constructed languages like Toki Pona that have only existed for a short time.


Well very old languages like Tamil have a very ancient person with whom a modern speaker could have a mutually intelligible conversation. An English speaker could not speak English with someone from 1000 years ago but a Tamil speaker could.


Arabic. I can read/write Arabic texts from 1400+ years ago. As could anyone who studied Arabic at school... Now that I think of it, that’s pretty damn amazing! I can’t say the same for French or English.


Well, the Koran had an enormous stabilizing influence, didn't it?


How is that measured?


Well measuring it exactly is impossible but relatively is easy. For example Tamil has been much more static than Greek which has been much more static than English over the last 1000 years.


English was not really English 1000 years ago, was it? I think a Dane would find Old English/Anglo-Saxon easier reading than an untrained native English speaker would.


Static how? Obviously there is no ancient audio. Is ancient Tamil writing readable by modern Tamil readers?


Written scripts have changed over time. It is not possible to read at the get go, but after some practice, yes, any one can read old Tamil. Given that about 55% [1] of written inscriptions are in Tamil, it is one of the most prolific.

How do you measure language change [2]? There are several approaches. The one I remember reading about is this: you take the most basic kernel of a language, about 100 very simple key words (for relations, food, feelings and so on), and then see how many of those words have changed over, say, the last 10 years, the last 50 years, the last 100 years and so on. You go back as far as you can, and you lose a few words from the 100 you started with; that's your delta.

1. https://en.wikipedia.org/wiki/Early_Indian_epigraphy 2. https://www.ling.upenn.edu/courses/Fall_2003/ling001/languag...


> It is not possible to read at the get go, but after some practice, yes, any one can read old Tamil.

The "after some practice" makes this a bit of a squishy claim. For example, Malayalam and Tamil split less than 1000 years ago and are both descendants of Old Tamil. I would be surprised if "after some practice" a Malayalam speaker couldn't also read Old Tamil, or for that matter Modern Tamil (and vice versa for a Modern Tamil speaker learning Malayalam).

So is Tamil somehow older than Malayalam? I think not.


There are many methods to try to determine how an ancient language was spoken, all of them by definition hypothetical, but linguistic change does follow sensible rules and happens in gradual steps (i.e. you can almost be sure that the vowel /u/ won't shift to the vowel /a/ in a single generation, and I bet there isn't a single recorded case of that happening, simply because /u/ and /a/ are very far away from each other in the mouth and acoustically).

For instance, we have very solid evidence (still circumstantial, so this is technically a hypothesis) that the Greek letter η, which now stands for /i/, once had a long eh-like sound, likely /ε:/. Because of this and many other changes, we can say that if a Modern Greek speaker were to talk with a Greek from the 5th century BC, they would understand barely anything, barring certain specific words. From tracing how the language changed, according to the extant evidence, we can probably say that this speaker would, however, have mostly (but with difficulty) understood a speaker from 12th century Constantinople in simple day-to-day conversations. At the very least, the two speakers would have the same sound inventories and similar grammar, but somewhat different vocabularies.

I still agree with you that saying one language is older than another is meaningless. At most we can say that a language is probably more conservative than others.


Fortunately, there are Tamil poems written around 200 CE that are still readable and understandable by native speakers. Tamil poems follow strict grammar and structural rules, so they don't change over time (see: https://en.wikipedia.org/wiki/Venpa).

Here is a poem that speaks about children being joyous, and how they playfully eat food. A simple ordeal in everyday life. It's about 7 lines, and an everyday speaker still understands all of it.

https://365paa.wordpress.com/2011/09/06/063/ (poorly translated: https://translate.google.com/translate?sl=auto&tl=en&u=https...)


Nice poem. thanks for sharing


Malayalam split from Tamil around 1000 CE. When I said you might need some expertise and training to read old Tamil scripts, I mean this: https://www.youtube.com/watch?v=aWIs17rSCg0&list=PLAU5iw78o0.... The video has closed captions if it is difficult to follow. Hope it helps.

Written Tamil has continuously evolved all along, most recently around 1950 when printing became prevalent. The language itself changed very little; however, only written letters survive, and to know what was written 1000 years ago you need some training.

Expert opinions might be different. Hope I am getting some information.


To give a sense of the differences, the numbers one through ten in written Tamil:

ondru, irandu, moondru, nangu, aintu, aru, ezhu (pronounced like eru), ettu, ompattu, pattu

in modern:

oru, rendu, moonu, nalu, intu, aru, elu (or eru), ettu, ompati, pattu
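The word-list comparison described earlier in the thread can be tried on the two lists above. This is only a rough sketch: the function names are made up for illustration, the lists compare written and spoken forms rather than two points in time, and real lexicostatistics would count related (cognate) forms as retained instead of demanding identical spelling. The second measure uses Python's standard `difflib` as a crude stand-in for "how different do the forms look".

```python
from difflib import SequenceMatcher

# Numbers one through ten, transliterated exactly as quoted in the comment above.
WRITTEN = ["ondru", "irandu", "moondru", "nangu", "aintu",
           "aru", "ezhu", "ettu", "ompattu", "pattu"]
SPOKEN = ["oru", "rendu", "moonu", "nalu", "intu",
          "aru", "elu", "ettu", "ompati", "pattu"]


def _check(old, new):
    # Both lists must describe the same, non-empty set of meanings, in order.
    if not old or len(old) != len(new):
        raise ValueError("word lists must be non-empty and of equal length")


def retention(old, new):
    """Fraction of slots whose form is spelled identically in both lists."""
    _check(old, new)
    unchanged = sum(1 for a, b in zip(old, new) if a == b)
    return unchanged / len(old)


def mean_similarity(old, new):
    """Average character-level similarity (0.0 to 1.0) across the list."""
    _check(old, new)
    ratios = [SequenceMatcher(None, a, b).ratio() for a, b in zip(old, new)]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    print("identical forms:", retention(WRITTEN, SPOKEN))
    print("mean similarity:", round(mean_similarity(WRITTEN, SPOKEN), 2))
```

Only three of the ten forms (aru, ettu, pattu) are spelled identically, so the strict count is low even though every pair is obviously related. That gap is why the choice of "what counts as changed" matters so much for any claim about how static a language is.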


"Modern Tamil" is actually classical Tamil. The written language is basically fixed in form to the language of the first millennium CE (the Thirukural, a collection of aphorisms by a Jain monk from the 3-4th century, is relatively understandable to modern readers of Tamil). The spoken language continued to develop and, depending on your dialect, is completely different! It's the equivalent of English speakers speaking modern English and writing like Beowulf. It is possible and common to be a native speaker of Tamil and not be literate. The situation is similar in many other languages, like Arabic. In general Malayalam speakers are not literate in Tamil, because classical Malayalam, though very similar to classical Tamil grammatically, has a heavily Sanskritic vocabulary, while classical Tamil tends not to use Sanskrit words. Incidentally, the earliest Tamil, the Cankam literature, is completely different, and was already incomprehensible by the end of the first millennium, when commentaries were written to effectively translate it.


While we don't have ancient audio, we can learn something about how a language was spoken by analyzing poetry with strict pronunciation rules. There's research in Chinese (I believe the original researcher was named Chen Li) to reconstruct how Chinese words used to be pronounced by analyzing poems that were supposed to rhyme but no longer do.

https://en.wikipedia.org/wiki/Historical_Chinese_phonology


Yes


Thirukural[1] is a classic Tamil text on philosophy and secular ethics, which is considered to be at least 1500 years old. Many verses of this text can be easily read and understood by anyone with intermediate-level proficiency in Tamil. The same goes for most texts from Sangam literature[2], which are even older. Thiruvasagam[3] and other Bakthi literature are widely used as prayer hymns and can be understood with basic language training.

Tamil has two forms. Sentamil and Koduntamil, both survived till date. Sentamil is the formal/literary form and Koduntamil is the colloquial form used by common people. Koduntamil is made of simplified and shortened forms of words from Sentamil, some loan words and slang words. Due to the cultural and political influences in the last few centuries, there are many loan words from European and Persian languages in Koduntamil.

Other Dravidian languages[4] such as Telugu, Malayalam and Kannada have a heavy Sanskrit influence. There were efforts to Sanskritise Tamil as well, by using Sanskrit words in Tamil syntax, which was called Manipravalam. Manipravalam was in use till the early 20th century. At the beginning of the 20th century, Tanittamil Iyakkam[5], aka the Pure Tamil Movement, restored the language, and Manipravalam is no longer in use. The formal Tamil used today, such as in news bulletins and official documents, is very close to the classical form.

The Pure Tamil Movement, the two-language policy[6] adopted by the Tamil Nadu government, and related efforts helped preserve the language. Although the Tamil-speaking population is smaller than that of other languages such as Hindi and Telugu, Tamil has thriving art, literature and media industries which are economically successful too.

Recent archeological research at Keezhadi has revealed a 6th century BCE Sangam era settlement in Vaigai river valley[7]. If more research confirms the findings, Tamil would be one among the oldest surviving classical languages.

[1] https://en.wikipedia.org/wiki/Tirukku%E1%B9%9Fa%E1%B8%B7

[2] https://en.wikipedia.org/wiki/Sangam_literature

[3] https://en.wikipedia.org/wiki/Thiruvasagam

[4] https://en.wikipedia.org/wiki/Dravidian_languages

[5] https://en.wikipedia.org/wiki/Tanittamil_Iyakkam

[6] https://www.thehindu.com/opinion/lead/policy-lessons-in-tami...

[7] https://www.thehindu.com/news/national/tamil-nadu/keeladi-fi...


This politicization and purification drive has made tamil to some extent less usable. Far from preserving tamil, many in the current generation can not read, write or understand tamil.

It is lacking in modern technical literature and comics. Most kids in school memorize the Tirukural without understanding a word. Pushing the burden of preserving an old language onto kids is actually driving them away from the language itself.


> This politicization and purification drive has made tamil to some extent less usable. Far from preserving tamil, many in the current generation can not read, write or understand tamil.

Can you elaborate on how you concluded that Tamil has become less usable? Citing sources will be helpful.


I have to kind of agree with the parent comment. I lived in Madurai for 3 years, where the people take great pride in the language. Your anecdote of little kids being able to comprehend the Thirukural was oft repeated, though I failed to see it in action. There were couplets from the Thirukural painted inside the city bus, and not once could one of my Tamil classmates actually grok the text. As a non-Tamil who learnt the script, I could read the words about as well as they could, transcribed as it is in the modern Tamil script. They clearly understood more than I did (which was close to none), but not enough to comprehend the text in any meaningful way. They could understand a word or two of the couplet, and kind of guess the meaning of the text. This was a far cry from the trope of 3 year olds reciting the Thirukural. For context, these were Masters students, for whom Tamil was the primary language. In my class of 15 people, there were perhaps two or three who could comprehend these texts with any level of competency.


Learning Sanskrit and loving it. Chanting Sanskrit mantras relaxes my facial muscles and makes my forehead feel light. It might sound dramatic to some, but it's true.


It takes a couple of years for the pain to really begin!


What pain are you talking about? Can you elaborate?


Having friends and families chanting it for generations, never heard of any such pain.


There is a saying in Sanskrit that you can either rule a kingdom or learn grammar. Learning Sanskrit is a long road and it's not an easy one.


Long but pleasant journey, I must say.


Any specific mantras ?


Gayatri Mantra would be a good start. Another one is "Om namah Bhagwate vasudevaya."

YouTube has many if you just want to listen a couple of times to remember them.


Check vishnu sahastranam too.


It's a pity there's no (link to the) definition of language they used. With so many, where do you draw the line?


> With so many, where do you draw the line?

The whole issue of language v/s dialect is a complex one; but I have adjusted to the one in this popular article [1]: if you speak a language and can enter into a conversation with a speaker of a variation on that language relatively easily, it qualifies as a dialect.

> A Mandarin-speaker can no more “adjust” to Cantonese than a Swede could “adjust” to German.

Most of the well-known Indian languages actually fail this test. Someone who has grown up speaking Hindi cannot reliably converse with someone in Bangla, Sambalpuri, or Tamil. If anything, the word "dialect" is overused regarding India – Bhojpuri [2] was often mentioned as a dialect of Hindi; but as a fluent Hindi speaker I cannot claim to understand or have a fluid conversation with someone speaking Bhojpuri. I am glad that it is at least sometimes referred to as a language now.

----------------------------------------------------

[1] https://www.theatlantic.com/international/archive/2016/01/di...

[2] https://en.wikipedia.org/wiki/Bhojpuri_language


That would make Hindi and Urdu dialects of each other. I'm not saying this is necessarily right or wrong; it's a landmine I don't want to touch :)


Urdu is definitely a dialect of Hindi. Speakers of Hindi, and of Urdu, can understand most of what is said and can have a conversation for most part. They may not understand 100% of the other language, especially when words that are closer to their Sanskrit or Persian origin are used and have not become part of the colloquial other.


Why is it a landmine? I am an Urdu speaker, and totally fine with realizing that Urdu is the Persianised standard register of the Hindustani language, and Hindi is the standardised and Sanskritised register of the Hindustani language.


There is some issue with priority... modern Shudh Hindi (the standard literary dialect) only dates from the 19th century, when it was essentially invented by a number of Hindu intellectuals, mostly in Benares. Urdu is older, along with the lovely Hindi dialect Braj Bhasha (both dating from the mid 16th century onward). The earliest Hindavi writing is Kabir, from the 13th century. Also, Panjabi is a complex relation, since it's also the same language, filtered through the gilded pen of Guru Nanak.


It's more appropriate to think of Hindi (Rajbhashya/Khariboli) & Urdu as different registers of the same language.

For example, a lawyer & a rapper might both compose their works in English, while adopting a register & jargon geared towards their respective audiences. Their works might very well be largely incomprehensible to each other's target audiences.


It is true Urdu is just a colonial imposition, it is synthetically derived from Hindi by insisting that the script be Persian.

Tarek Fatah makes a point about why Urdu isn't a distinct language from Hindi.


Yiddish is written in both hebrew letters and latin letters, but is one language. Language speakers don't care too much about the script when evolving their language; they use what they have.


I think Hindi/Urdu is quite a bit like Serbian/Croatian. The preferred script followed the religious divide, but (if I understand right) both sort-of standardised quite similar dialects out of a range of possibilities. And then later they were politically divided, and there were some efforts to move the official vocabulary away from each other's.

IIRC Sanskrit has also been written in dozens of scripts, there's a perfect one-to-one transliteration into most (maybe all) the north-indian scripts, although they have extra characters for their own sounds.


Since all indian scripts besides the Urdu Nastaliq were originally developed to write sanskrit, the transliteration is indeed perfect!


Bhojpuri, Maithili, Haryanwi, Braj, Bundelkhandi, dialects from Rajasthan, Himachal and so on and so forth, and I'll say even Khari boli, are all dialects of Hindi. We may not be able to have a fluent conversation, but for the most part we'll understand the context and will be able to speak. In fact, I'd say Hindi dialects change every 100km or so in the Hindi belt.

Punjabi, Gujrati, Marathi and others (picking from areas neighboring the Hindi belt) are not dialects, they are languages, even though from that list Gujrati and Marathi use Devnagri script (or very close to it).


More likely, people speaking different dialects will adopt a standard register to communicate with each other. For example, I have seen Himachali speakers from neighbouring districts communicate with each other exclusively using Khariboli Hindi, rather than their native Himachali dialects.

This is probably true of most large, diverse languages. The dialects have drifted sufficiently far from each other that, while it is possible to communicate across dialects, it's most convenient to use the shared formalized standard register. So while a Malvani speaker can largely comprehend the dialect of somebody from Gadchiroli, more often both speakers will switch to the standard Marathi register they learnt in school.


hmm American here - but I vaguely recall.. Gurmukhi is a variation on Punjabi, used by Sikhs for religious purposes, but Urdu is a 'persianized' version of Hindi (?)

Checking wikipedia, it mentions a "Punjabi diaspora family of languages." These distinctions seem understandable and interesting to me, of European descent and non-political.

thanks yumraj, for the many posts here today on this, it feels important at some basic level.

I enjoyed reading "In Search of the Cradle of Civilization" long ago.. and studied personally with one of the authors, in California.

https://en.wikipedia.org/wiki/Subhash_Kak#In_Search_of_the_C...


Gurmukhi is the script used to write Punjabi, at least in India. It was created by a Sikh Guru. Link: https://en.m.wikipedia.org/wiki/Gurmukhi

Interestingly, Pakistan has a state called Punjab where Punjabi is the main language, but I’m not sure what script they use.

Regarding Hindi and Urdu, I found the wikipedia page to at least be informative. Link: https://en.m.wikipedia.org/wiki/Hindi%E2%80%93Urdu_controver...


Pakistani Punjabi uses the Shahmukhi script, which is based on the Perso-Arabic script.


Thanks. So, Punjabi must be one of those rare languages that is written in two entirely different scripts by roughly equal population sizes.

Are there any other examples of such a language?


There are, with many more examples historically.

Konkani in the Devanagari script is an official language of India & the state language of Goa - while Christian Konkanis (a sizable proportion of the population) prefer to write it in the Latin script.

Kashmiri Pundits used the Sharda script, while Muslim Kashmiris wrote it in the Persian script.

Historically, the picture is more complicated. People were familiar with multiple scripts, and shifted across them based on context and jargon. For example, when the British took Sindh from the Nawab, they found the natives writing the same language in multiple scripts. The administration of the erstwhile Nawab worked in Persian, while merchants used Khudabadi, a landa script related to Gurmukhi. They found that women in the home preferred the Devanagari script, as did the priestly class. Since the British were most concerned with the needs of the Persianate clerks, they mandated that Sindhi only in the Persian script would be taught in schools. You will find similar situations across India, with people adopting scripts geared towards the audience they were writing to, rather than an exclusive formalized script unique to a language. Literate medieval Indians would be fluent in multiple languages and scripts, and combined them in interesting ways.

On a tangent, I recall an argument by Bhagat Singh that the different languages of India should adopt a common script to foster unity - Devanagari in his opinion.


Gujarati has its own script, fairly similar to Devanagari but without the horizontal bar above letters and a few more changes. Marathi uses Devanagari.

I believe both Urdu and Hindi came from Hindustani which was derived from Khari Boli.


On the issue of scripts, the matter was more fluid some generations ago. Different people would adopt different scripts in different circumstances, and it was the British push for standardization that settled the matter.

For example, when the British took Sindh, they found that officials of the erstwhile Nawab administration used the Persian script, while merchants used the Khudabadi landa script. Women and priests, however, largely used the Devanagari script. The British adopted the Persian script for Sindhi since the clerks they absorbed used it.

A similar story played out with Marathi. From the time of the Marathas till as late as the 1920s, Marathi was written formally in the Modi (ie, twisted) script. The British pushed for the adoption of the Devanagari script, since it would simplify logistics at printing presses.


Italians can understand Spanish quite well and vice versa. Similar for Portuguese. That doesn't make them dialects of Latin.


I can understand some Punjabi, and given context might be able to guess Gujrati, Marathi, Bangla due to their origin in Sanskrit, but I did call them separate languages.

The ones I mentioned as dialects are indeed dialects, at least per commonly accepted standards.


That definition does not work with Czech and Slovak.


Why? Aren't they rather close, and more of a language family than two distinct languages?


They are distinct languages from the same language family, which are understood by speakers of the other language, but not spoken. Czechs understand Slovaks and vice versa, but both are talking in their own languages, and speaking the other language can't be done properly without studying it.


At least with Hindi and Tamil, you can be sure because they come from different language families.


True. This may sound obvious to you, but I am not so sure about its obviousness in general in the West.

A lot of western-educated people have asked me “So Hindi is the main language, and there are a lot of different dialects throughout the country, right?” – and I have no idea where to begin explaining that it's nothing like that, so I go with the time-tested “Well, it's slightly different...”


> and I have no idea where to begin explaining that it's nothing like that

(I have studied Sanskrit and Hindi in school for multiple years, and can understand a good amount of Marathi and Tamil.)

You can say this - I think it is true, not quite verified (partly heard from older relatives, partly deduction based on observing similarities in the mentioned languages' vocabularies):

Sanskrit came first - thousands of years ago. It is the language of the Vedas, 4 of India's most ancient scriptures. Many northern (including east and west) Indian languages are similar to Sanskrit and likely are descendants of it. Hindi, Gujarati, Marwari, Punjabi, Bengali, Oriya (not sure, but it sounds some like Bengali), Marathi, are some of these.

Languages of Indian states further east than Bengal, called the North Eastern states, like Assam, Nagaland, Manipur, etc. may not show a lot of Sanskrit influence because they may have Oriental influence instead, such as from SE Asia, China, etc. Except Assamese may have some Sanskrit influence, also not sure of Manipuri.

And of the southern states in the peninsula, Tamil has the least Sanskrit influence and is also said to be very ancient. The other three, Malayalam, Kannada and Telugu, have some Sanskrit influence.

Caveat: I am not a language expert.


Bengali is very much a descendent of Sanskrit. Assamese (and Oriya) is essentially a somewhat distant dialect of Bengali. Most languages of Asia that aren't part of the Chinese world (Chinese, Japanese, Korean, and modern Vietnamese) have tons of Sanskrit or Pali (or both) in them. Tamil has lots of Sanskrit in it (good examples are naka meaning finger nail, singa meaning lion, ampala from ambara meaning sky); it just has less than other Dravidian languages, presumably because of its much older literary tradition. Literary Malayalam looks like literary Tamil, with all the words converted to Sanskrit. It's kind of funny!


> Bengali is very much a descendent of Sanskrit

Yes, that is nearly same as what I said above, except I said "likely", because I was not sure, not being a language expert. And that "likely" applied to the other languages I mentioned too, in that sentence.

Edit: Actually, I am (still anecdotally) pretty sure of it for at least Hindi and Marathi, because both of them have too many words which are the same as, or derived from Sanskrit, for it to be a mistake.


Assamese is NOT a dialect of Bengali. It is older than Bengali and in fact the first non-Sanskrit language into which the Ramayana was translated. The Assamese script itself has been in continuous development since the 5th century.

https://en.m.wikipedia.org/wiki/Umachal_rock_inscription


The earliest extant translation of the Ramayana is the Tamil Kampan Ramayana from the late 12th century. There might have been a Khmer translation earlier. The Assamese Kotha Ramayana is about two hundred years later. The Jain Ardhamagadhi Ramayana is very old, perhaps older than the Valmikiramayana!

I stand corrected about Assamese's relationship with Bengali.


No worries. I meant in the Indic language family, not Dravidian. The Tamil version definitely preceded that.


And Tamil has a long tradition of resisting Sanskrit influence, with elaborate grammar rules dictating whether you may use loan words and when, where and how to use them. This is more true for literary settings and never true for day-to-day speaking. Also, Tamil words have also made it into Sanskrit; I don't know what that means. Languages are basically for exchanging ideas. ¯\_(ツ)_/¯


I mean... Tamilians sort of dominated Sanskrit literature in the 7th-10th centuries. Dandin, Kumarila, Shankara, Dharmakirti, Ramanujan and Madhva. Not to mention the Bhagavatapurana, which very quickly became probably the most important religious text in India! I wouldn't say Tamilians rejected Sanskrit... they just kept the two separate. There aren't very many Tamil words themselves in Sanskrit (mina meaning fish is the only one I can think of), but there are proto-Dravidian loan words in Sanskrit.


Most of the people in that list were likely not Tamilian - Shankara was likely Malayali, Kumarila & Dharmakirti are just as likely to have been from Assam, and Madhva was Tuluva.

What about the Bhagavatapurana is Tamil?


It literally quotes the Azhvars... you're right about Madhva, that was a mistake. I don't agree with a possible Assamese origin of either Kumarila or Dharmakirti, and see no reason to doubt Taranatha, and in the 9th century, the distinction between Malayali and Tamil is academic.


"it's slightly different" confuses the issue.

It seems easy to say, "actually, no, India has multiple unrelated language groups as different from each other as English is from Hindi, and I don't just mean in isolated villages"


I usually explain it as being similar to the differences in languages across Europe. If you take Portuguese at one end and Russian at the other and start tracing across countries, you'll notice certain similarities and patterns but also encounter a lot of unique character, especially in some areas. Spanish and Portuguese people might be able to understand a little of each other, like Tamil and Malayalam speakers, and so on.


"The situation is a lot more nuanced than that."


Attribution unknown, but I've read that a language is a dialect with an army and a navy. Not entirely applicable here though, as there are likely geographic and historical separations at play too.



You can find more information on their website.

http://www.peopleslinguisticsurvey.org/


In 1932, the British regime's Communal Award (https://en.wikipedia.org/wiki/Communal_Award) offered separate electorates/countries to Muslim/Sikh/Christian/Parsi/Buddhist/Jain/SC/ST, where Upper caste/Brahmin/Bania/Kshatriya can live on a visa.


I was hoping an AI would analyze every post in the .IN domain.


Many people who speak some rare languages in India do not have the tech skills to put their language online.

Some languages are already dead, except for academic purposes.


While it's not close to a complete solution, I'm glad that we have Unicode to help.


As mentioned in the article, the real loss is the world view of the people. Culture, folklore and history would be lost.


(2018), please.


[flagged]


> The politics behind promoting Hindi is a religious one. It makes convenient to promote the varna system(modern caste system).

As someone with some knowledge around this subject, this is a simplistic and inflammatory bad-faith reading of a very complex topic – how do you create a common country out of hundreds of splintered cultures, languages, and religions shared by millions of people? The nearest parallel to how many languages India has is the European Union, and I would wager that given the differing scripts and such, India's diversity of languages exceeds the EU's.

And people have been trying to unite this complex mixture as a country since the 1700s! (thinking of the British attempts here; much earlier if you include pre-British empires).

The imposition of Hindi has been one of the (failed) ways to have such a common language. It is quite a vast topic, but one example is that Mahatma Gandhi himself started the South India Hindi Outreach Institute [1] to promote Hindi as a means of integration between (broadly speaking) North and South India.

Also, the issue of caste in India is a pretty complex one as well, but having spent some time in South India, I can confidently say that casteism is way more prevalent there than Hindi, which casts some doubt on the effectiveness of the politics you allude to (even presuming it exists).

----------------------------------------------------

[1] https://en.wikipedia.org/wiki/Dakshina_Bharat_Hindi_Prachar_...


No discussion of India is complete without dragging caste into it, just like no discussion of Germany is complete without the Holocaust.

However, unlike the Holocaust, "caste" and the "caste system" originate in Europe. It is time that the textbooks are updated to reflect reality and not carry on colonial-evangelical propaganda.


Caste matters everywhere: schools, jobs or life at large

https://www.livemint.com/Opinion/wquRqpqgRGfDP58kN9BNOM/cast...


Not every discussion of Germany needs to bring up the Holocaust, just like not every discussion of India needs to mention caste.


It was a joke, because every thread on HN about India has a motivated group bringing up caste.

Just like every Hollywood movie involving Germany has to have a Hitler reference.


> The politics behind promoting Hindi is a religious one. It makes convenient to promote the varna system(modern caste system). This is the reason the religious party and organizations(BJP, RSS) promotes Hindi and forces in to the education of all non-hindi speaking states too.

The rest of your comment is factual, except I'm not sure whether Tamil, Sanskrit or some other language is the oldest.

But the above quoted part of your comment is utter BS.

The reason to promote Hindi is nationalistic rather than religious. Why should the lingua franca of any country be a foreign language, it should be a local language. Hindi fits the bill amongst all other Indian languages because it has the largest footprint and hence the best odds of success, as opposed to, say, Tamil, which is primarily spoken in one state.


> Why should the lingua franca of any country be a foreign language, it should be a local language. Hindi fits the bill

My understanding is that this is (or was) contested in the south. While Hindi is (as you say) large and local, it's also specifically northern. (The southern languages are very different, non-Indo-European.) The counter-argument for English is that, while foreign, at least it's equally foreign all round.


Yes I believe that is correct. The opposition is mostly political in nature.

Also, only Tamil has a non-Sanskrit origin and is completely different. Other South Indian languages do have Sanskrit origin/influence.

Which is possibly why Tamil Nadu is the most anti-Hindi state.


> Also, only Tamil has a non-Sanskrit origin and is completely different. Other South Indian languages do have Sanskrit origin/influence.

Recent scholarship around this paints a very different and complicated picture of interactions between Dravidian and Indo-Iranian languages. Specifically, there is evidence that Indo-Iranian speakers and Dravidian speakers exchanged vocabulary even before Sanskrit and Tamil branched off from their parent languages.

From Rice in Dravidian by Franklin Southworth [0]:

> The reconstructed vocabulary of Proto-Dravidian depicts a society whose main source of food is animals and their products. Though the speakers of Proto-Dravidian had some knowledge of agriculture, the absence of reconstructible names for cereals strongly suggests that they were not sedentary farmers at this stage—rather, they were probably herders of sheep and cattle, supplementing their diet with grains and other agricultural products through gathering, trade, and occasional periods of sedentism. The first association of Dravidian speakers with a specific cereal is signaled by the occurrence of the word gōdhūma “wheat” in Vedic Sanskrit, a word derived from a Proto-Zagrosian word meaning “stand of grain”. This word, along with other Dravidian-derived words relevant to agriculture (words for “fruit” and “field, threshing ground”, “joint of plant”) and animals (words for “sheep”, “ass”, “hoof”) suggests a significant role for Dravidian speakers in Indus Valley agriculture. As the distribution of these words in modern Indo-Aryan languages—with cognate forms in Dardic and/or Nuristani languages—is similar to that of inherited words like Old Indo-Aryan “brother” (4 Nuristani, 16 Dardic cognates) or “mother” (four Dardic cognates), it is reasonable to assume that the initial contact between Indo-Aryan and Dravidian speakers took place outside of the Indus Valley, in parts of Iran, Afghanistan, or Turkmenistan, probably at a time when Dravidian speakers were involved in wheat cultivation. Chronologically, this might have been anywhere “from about 3500 BC onwards”, in Bellwood’s words (2005:334, see also Bellwood 2009). The presence of cognates of some of these loans in Iranian languages suggests that both Indo-Aryan and Iranian-speaking groups may have been involved in early interactions with Dravidian speakers.

> By the mid-third millennium BCE, Dravidian had become the dominant language of the Southern Neolithic; the Dravidian languages had split into North Dravidian and Peninsular Dravidian, and the latter was further divided into four subgroups. The Southern Neolithic was “…a system that combined animal husbandry—including the well-known cattle pens (Allchin 1963)—with a new package of crops, of which the staples were legumes and millets, a description applicable to the diet of most South Indians today” (Southworth and McAlpin 2012, unpublished). Its main cereal crops were two millets, bristley foxtail (Setaria verticillata) and browntop (Brachiaria ramosa), whose names were either taken from some unknown language(s) or derived from existing Dravidian words such as ko rr - “food” DEDR 2171. Since South India is not a suitable climate for wheat cultivation, the old word for “wheat” was no longer even a memory, surviving only as Tamil kūlam, a generic term for “grain”, and a few related words in Gondi-Kui (DEDR 1906). When Dravidian speakers today refer to wheat, they use a form borrowed from Sanskrit godhūma, such as Tamil gōdume.

> Contrary to Southworth’s claim (2009) that the language of the Southern Neolithic was a form of “late Proto-Dravidian” (equivalent to what is now being called “Peninsular Dravidian”), it now appears that the Southern Neolithic involved a coming together of various subgroups of Proto-Peninsular Dravidian which had already separated some time previously, had been in intermittent contact over a long period in the interim as each branch developed its own unique crop package, and ultimately pooled their resources to create the collection of crops identified in the Southern Neolithic.

> Regarding rice: the Proto-Peninsular Dravidian word vari-(n)ci, which probably originally had the meaning “seed, grain”, appears first as a word for “barley” in two Nuristani languages. The contact that produced the Nuristani words is perhaps dateable to the same period as that of the word for “wheat”: any time from about 3500 BCE onwards. Lacking evidence to the contrary, it seems reasonable to assume that Dravidian-speaking farmers became part of the population of the Indus Valley from about that time. That they retained that role for a long period of time is shown by the appearance of the OIA words vrīhi “paddy” and “threshed (husked/unhusked) rice” in about 1200 BCE, and by the later replacement of these words by another Dravidian loanword, cāmala/cāvala, a millennium or so later. Additional evidence may come from other names of cereals: for example, Dravidian *co nn al “millet” is the probable source of the late Sanskrit yavanāla (reshaped under the influence of OIA yava “barley”), the source of modern Indo-Aryan words for “sorghum” such as Marathi

[0] https://link.springer.com/article/10.1007/s12284-011-9076-9


Thank you for a high-quality source! I am surprised to learn that phala “fruit” has Proto-Dravidian origins, it is used quite frequently in Sanskrit shlokas in both the literal and figurative (“reward”) senses.

Even though the link seems to imply that it's more Elamitic than Dravidian, the case of Brahui [1] is very interesting as an instance of a Dravidian-linked language that's very different in origin from its geographical neighbors.

----------------------------------------------------

[1] https://en.wikipedia.org/wiki/Brahui_language


You might be interested in David McAlpin’s 2015 paper about Brahui, Elamite and Proto-Dravidian [0].

Some context: Brahui is a language spoken in Pakistan that’s been a bit of a mystery till David McAlpin solved it with the Proto-Zagrosian hypothesis.

Brahui clearly looks related to Dravidian, but didn’t fit into Proto-Dravidian or any reconstructed daughter languages of Proto-Dravidian. Simultaneously, the ancient language of Elamite looks a lot like Dravidian. (Sidenote: there are cuneiform tablets written in Elamite! Elam shows up prominently in the fall of Nineveh, the capital city of the superpower of that era!)

The problem was chronology: Brahui is clearly older than the later descendants of Proto-Dravidian. Elamite is far older than the descendants of Proto-Dravidian. The question was how these two languages fit into the Dravidian family tree.

McAlpin’s 2015 hypothesis answers this question with a surprising and elegant idea: What if Brahui is related to Elamite instead of Dravidian?

He found that the pieces fit really well when you treat Elamite and Brahui as sibling languages descended from a parent language: Proto-Elamite. Next, he finds that Proto-Elamite and Proto-Dravidian cleanly fit as siblings descended from a mother language that he calls Proto-Zagrosian.

I’m not a linguist and I don’t know how well accepted this idea is among academic circles - but looking at this from outside, it seems to be a fairly well respected idea that’s being seriously considered by some academics.

PS: I did a Google search for Proto-Zagrosian and the second result was my own comment on HN from a long time ago. I really wish there were more people writing/blogging about this.

[0]. https://scholar.google.com/scholar?hl=en&as_sdt=0%2C5&q=prot...


That is interesting indeed! I skimmed over the paper, although most of it is too dense for me to understand.

A proven connection between Dravidian and Elamite would kind of upend most perceived ideas in India about Dravidian versus Indo-European languages (namely, that Dravidian ones are more indigenous than Indo-European).

Also, this is very likely just a coincidence, but I realized that the Tamil word for homeland is...Eelam [1].

-------------------------------------

[1] https://en.wikipedia.org/wiki/Eelam


I thought that Southworth and McAlpin were kind of out there on their own with this hypothesis.


> Hindi fits the bill amongst all other Indian languages because it has the largest footprint

Not true. Even if it were, think about the following analogy given by a national leader (I don't remember his name): there are more crows in India than peacocks, yet the peacock is the national bird, not the crow.

If there must be a national language then it should be English, simply because it is the one that has made the country prosper. Hindi does not help the states where Hindi is the primary language. People from Bihar migrate to South India for jobs and learn Tamil and Telugu, while South Indians leave India to go abroad.

Preaching Hindi to people who are still low class while conveniently learning English yourself, that's evil IMO.


> Why should the lingua franca of any country be a foreign language, it should be a local language

Hear, hear. But good luck convincing Americans to learn the local indigenous languages or go back where they came from...


This is just nonsense. Hindi has nothing to do with the caste system. Hindi was promoted because of the need for a common language for a nation with 300+ official languages (not counting the dialects), as it's the language that is most commonly spoken and understood. The claim about BJP/RSS is again absolute nonsense. In recent times the only party that has worked for the poor/lower castes/classes is the BJP. Most regional parties work for the dominant upper castes, and Congress further discriminated against the dalits/STs during UPA2 with their legislation (and has always been the party of upper castes + Muslims and Christians).


> Hindi was promoted because of the need for a common language for a nation with 300+ official languages (not counting the dialects), as it's the language that is most commonly spoken and understood

I thought that English actually served that purpose even in India (unofficially).


Unofficially, yes, but as you can imagine, it's not an Indian language, and it's the language of our colonizer. It's more acceptable now, but back in the day, it was an absolute no-no. Besides, the common man did not speak English, just the very highly educated (back then; now it's much more common among all classes).


For non-Hindi-speaking states, Hindi and English are equally foreign. At least English gives jobs and enriches lives.


> It makes convenient to promote the varna system(modern caste system)

Not sure how Hindi is related to the caste system. Caste system plagues all of India irrespective of the mother tongue.

> forces in to the education of all non-hindi speaking states too

Education is a state subject in India; the union government cannot force a language on a state.


> Education is a state subject in India

This is not true anymore. Education used to be a state subject but it is now in the Concurrent List [1], which means the union govt. has a say in education. That's how the union govt. was able to mandate the three-language formula in the New Education Policy 2020.

[1] https://en.wikipedia.org/wiki/Concurrent_List


> There is a widespread misconception about India's National language as being Hindi.

The same misconception exists about the national sport (believed to be hockey; there actually is none).


Hockey? Really?

I would surely have guessed Cricket.


Hockey was more popular in India in the last century than it is now. India won Olympic medals in it, so people started calling it the 'National Sport'. Then cricket's popularity blew up (it was already popular, it just increased a lot).

A fun question people ask here in India is 'What is the national sport of India?'. Most people reply cricket, few 'intellectuals' say hockey, but there is no national sport.


I think that since international cricket is more popular than international hockey, non-Indians know a lot more about Indian cricket than hockey.


Hmmm, I'm from South-East Europe, and the only times I remember hearing a lot about cricket are from Lewis Carroll and from Indian news. I'm not sure how many people in Europe, North America, South America, Africa, China, etc. know that cricket exists. They don't care much about hockey either (the grass one), so you might be right, but only because 0.0001% is larger than 0.00001% :)


> I'm not sure how many people in Europe, North America, South America, Africa, China, etc. know that cricket exists.

Lots of people in the places that were part of the British Empire other than in continental North America (where for some reason it had less staying power) which includes all of those places except South America (if you count the West Indies as part of North America.) It's popular some other places, too.


Tamil isn't the oldest language either. That would be one of the Austro-asiatic languages of the Andamans.


> The politics behind promoting Hindi is a religious one. It makes convenient to promote the varna system(modern caste system)

You lost me there. Usual white man's burden speaking. BJP is not a "religious" party in any sense of the word. Hindi is just a ridiculously popular language, mostly because of Bollywood. Since most politicians tend to be Hindi-speaking, they try to make Hindi even more popular as it benefits them, whereas regional parties see that as a threat.

Stop drinking Equality Labs' Kool-Aid.


The article takes as a given that preserving language is a good thing and it is a tragedy when a language dies.

I say that languages divide us and silo us. It is a triumph when a language dies because the people who previously spoke it now think and speak in a different language. Of course we should try to preserve any literature or folklore.

I look forward to the day when everyone on earth will be able to speak and communicate in a single language.


Different languages have different nuances for things so there are ideas and ways of thinking that are lost when a language dies, as you may have accidentally alluded to. I grew up in Colorado and my family normalized many terms like couloir or scree that got me in a bit of temporary trouble with an English instructor in Texas who thought I'd been fishing for exotic words to use in a paper. He was a great guy and accepted my explanation that these were words I used frequently while growing up: they are my language.


I say why not have an environment where people have the opportunity to learn multiple languages, so that the inherent beauty of the variety of languages is preserved and yet we can communicate without being siloed. As a matter of fact, multiple languages could then actually come to bond us together.

Goethe said something along the lines of “He who does not know foreign languages does not know anything about his own”. As for the inherent beauty in a language, most would agree that a certain thing expressed in language X 'tastes' different than in Y. As for uniting, I am neither German nor Persian, but it is my ability to use German or Persian that unites me to both the cultures (and people) in a much deeper way; of course, you can argue that the fact that I drew the boundaries of German/Persian is in itself a problem, and having had a single language would have been the ideal, but again, I doubt that, since language X might be superior in terms of certain expressions over Y (until proven otherwise).


On a brief look, Esperanto seems to be mainly Western-influenced.


> I say that languages divide us and silo us.

So should we just have one programming language? Should we have just one OS? Should we have one computer game? One social media? Should we have one political party? One school? One store? One book?

> I look forward to the day when everyone on earth will be able to speak and communicate in a single language.

We already do. It's called math.


Isn’t Facebook evidence enough that “connecting everyone” isn’t necessarily a good thing?


I don't see how that argument applies to language but not culture.

