So... even if it's English-only, I can't quickly refer to a price in £ or € or ¥? Or that it's 90°F out? I can't talk about my fiancée? I can't even use “proper quotes”?
Yikes.
These days, talking about the "benefits" of ASCII-only is like talking about the "benefits" of HTTP over HTTPS, or the benefits of dial-up as opposed to broadband.
No thanks. It's exclusionary and for zero good technical reason. For every reasonable programming language, there are functions to do everything in UTF-8 that you can do in ASCII, with about the same complexity.
In 2020, why are you going out of your way to prevent people from typing the honest-to-goodness useful characters they want to communicate with? In English?! Where exactly does UTF-8 not work, that this site does?
I don't think it was a technical decision. It sounds like a stylistic choice, I suspect in the sentiment of "everyone on the platform should be able to understand each other, no one should be locked off from interesting conversations due to a language barrier", and from there they selected the lowest common denominator language.
To play devil's advocate, I don't think the limitations you cite are meaningful handicaps. Money can be discussed as GBP or EUR. With context it's easy to understand 90F as a temperature, and losing Unicode quotes is hardly a compelling restriction.
MacBooks sold in the UK have a £ sign on them, where a # is found on a US keyboard.
So in the country the English language originated in, you're not allowed to use a character on the default keyboard. (And that's just one -- the € is on there too, among others.)
We're not talking obscure key combos, we're talking characters in use everywhere you go every day. Would you really appreciate not being able to use the $ character on a forum? Every time you type a comment about a price, a key on your keyboard doesn't work, or the comment is rejected for an invalid character, or the character is silently deleted? Forget about it.
> I can't quickly refer to a price in £ or € or ¥? Or that it's 90°F out? I can't talk about my fiancée? I can't even use “proper quotes”?
Those are all totally argument-for-the-sake-of-argument nit-picks, in my opinion.
To be honest, I rarely enough need to refer to UK pounds, Euro, or Yen - that I'd much more likely just type it out like that instead of try to remember/experiment what the magic key combination to produce them properly is. There is not a single English speaker in the world that wouldn't understand "fiancee" - even with non "proper" quotes around it.
> In 2020, why are you going out of your way to prevent people from ...
That argument seems to me to inevitably end with some combination of 4chan-ish image boards (because "people want to communicate in memes") or Snapchat-ish 100% video content (because "people want to communicate in video"). If you want those things - go use those things.
The OP is trying something different. You call it "exclusionary", and claim it's a decision based on zero good technical reasons. I suspect it's better described as "intentionally constrained" and a social experiment.
(And thinking bigger picture, there are totally "benefits" to using http and dial-up or slower bandwidth in some places. I have a pile of ESP32/LoRaWAN boards sitting here. If I want to maximise the range the LoRa radios can achieve, I'm looking at 27kbps or so throughput. If I want to use these on 433MHz where I live, I can't legally operate them at more than 1% duty cycle (due to .au restriction on the 433MHz ISM band and digital transmissions). I don't have enough CPU to do TLS, I don't have much more than about 255 bits per second of bandwidth. I could _probably_ use the OP's software under those limitations, and build a mesh based messaging network that'd work for anyone within ~ 3-20km of you. Totally niche application, but also possibly useful. Imagine a pocket sized battery powered non cellular non wifi multi mile range comms device that'll mesh with the ones your friends are carrying - at a festival where cellular is crushed or not available. Text messaging style social network gadgets for your camp at Burningman, for less than $40 per user? I'd build that...)
> To be honest, I rarely enough need to refer to UK pounds, Euro, or Yen - that I'd much more likely just type it out like that instead of try to remember/experiment what the magic key combination to produce them properly is.
The description says 'English only', not 'US only'. There are quite a few English-speaking people all over the world. Not to mention that Britain, which I'd consider pretty English, definitely needs the pound sign quite often. And European keyboards produce the €-sign with ease, I even have it explicitly marked.
I can see your point about intentional constraints and social experiments, but I fully disagree with your point that a lack of UTF-8 is a 'nit-pick' or that it is unneeded for English conversation. You can work around it, sure, but that does not make it great.
> That argument seems to me to inevitably end with some combination of 4chan-ish image boards (because "people want to communicate in memes") or Snapchat-ish 100% video content (because "people want to communicate in video"). If you want those things - go use those things.
Including images or videos is a feature with quite some overhead, while romanizing UTF-8 input is putting in extra work just to disallow it. So this goes in quite a different direction.
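To make the "extra work" concrete, here is a rough Python sketch of what romanizing input might involve. I have no idea how the site actually does it; the function name `romanize` and the `FALLBACKS` table are made up for illustration, and the substitutions are just the ones mentioned in this thread.

```python
import unicodedata

# Hypothetical hand-picked substitutions; not the site's actual table.
FALLBACKS = {
    "£": "GBP", "€": "EUR", "¥": "JPY",
    "°": "",                      # "90°F" becomes "90F"
    "“": '"', "”": '"', "‘": "'", "’": "'",
}

def romanize(text: str) -> str:
    """Reduce text to ASCII: keep ASCII, substitute known symbols,
    and strip accents from everything else via NFKD decomposition."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        elif ch in FALLBACKS:
            out.append(FALLBACKS[ch])
        else:
            # Split "é" into "e" + combining accent, then keep only ASCII parts.
            decomposed = unicodedata.normalize("NFKD", ch)
            out.append("".join(c for c in decomposed if ord(c) < 128))
    return "".join(out)

print(romanize("My fiancée paid £5 when it was 90°F"))
```

Anything without an ASCII decomposition or a table entry is silently dropped, so this is strictly more code than just storing the UTF-8 that came in.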
It is, as you and everybody you know seems to have worked out, not really worth the effort, unless you are amused in just the right way by such things.
(I did start the French novel influenced by it, but my decades-rusty high school French comprehension was not up to the task...)
I agree with the overall sentiment of your comment but..
> It's exclusionary and for zero good technical reason. For every reasonable programming language, there are functions to do everything in UTF-8 that you can do in ASCII, with about the same complexity.
At minimum, unicode does come with a lot of complexity in implementation, and a lot of potential vulnerabilities due to that complexity.
Not saying that excuses all websites, modern DBs handle unicode just fine after all, and the browser takes care of the rendering - (although some might crash your computer), and there are usually builtins for serverside web languages to help you sanitise strings... still, things would be way simpler with any fixed word <8bit encoding without glyph manipulation builtin. So yeah, it's complex, but it's usually worth it.
Unicode includes a lot of features that aren't universally desirable in applications that handle text. "Zalgo" text (many stacked combining subscript and superscript characters) can break out of the box it's contained in. Emoji characters render as distracting full-color icons that are easily confused for UI. Directional overrides let you plop an "everything after this should be printed backwards" control character into any string you control, providing endless opportunities to break UI and confuse other users, as in https://blog.malwarebytes.com/cybercrime/2014/01/the-rtlo-me...
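For what it's worth, an application that wants Unicode without those particular features can filter them on input. A minimal Python sketch, assuming a blunt policy of dropping every explicit bidi control and capping stacked combining marks; the name `tame` and the limit of 2 marks are my own arbitrary choices, not anything standard.

```python
import unicodedata

# Explicit bidirectional embedding, override, and isolate controls.
BIDI_CONTROLS = set("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")

def tame(text: str, max_marks: int = 2) -> str:
    """Drop bidi control characters and limit runs of combining marks."""
    out = []
    run = 0  # combining marks seen since the last base character
    for ch in text:
        if ch in BIDI_CONTROLS:
            continue
        if unicodedata.category(ch) in ("Mn", "Me"):
            run += 1
            if run > max_marks:
                continue  # skip marks beyond the cap ("Zalgo" stacks)
        else:
            run = 0
        out.append(ch)
    return "".join(out)

print(tame("invoice\u202etxt.exe"))  # the RTL override is removed
```

This throws away legitimate uses of the bidi controls too, so it is a policy decision rather than a fix.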
IMO websites like this that challenge the assumption that all modern software is obligated to support all of Unicode are a valuable contribution to the world, if only because they might inspire the creation of a better, more well-scoped character set.
> IMO websites like this that challenge the assumption that all modern software is obligated to support all of Unicode are a valuable contribution to the world, if only because they might inspire the creation of a better, more well-scoped character set.
Oh God no.
We do not need any more character sets.
Most of what you mention can be fixed with a font, and the LTR-override stuff is in the realm of proper escaping which all web apps everywhere in the world need to do at all times absolutely goddamned regardless.
It's genuinely pretty tricky. There seems to be a valid use case in filenames if it needs to include both an English and Arabic or Hebrew component, for example. Which, if you work with bilingual documents between countries, is not unusual.
It's part of a broader question of visual strings having multiple unexpected Unicode representations, including Cyrillic letters masquerading as Latin, etc.
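A crude way to flag the Cyrillic-as-Latin case is to check whether a string mixes scripts. Python's standard library does not expose the Unicode Script property, so this sketch leans on the first word of each character's Unicode name as a stand-in, which is an assumption and only a heuristic; `scripts_used` and `looks_mixed` are names I made up.

```python
import unicodedata

def scripts_used(text: str) -> set:
    """Approximate the set of scripts in text from Unicode character names."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                # e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
                scripts.add(name.split(" ")[0])
    return scripts

def looks_mixed(text: str) -> bool:
    """True when letters from more than one script appear together."""
    return len(scripts_used(text)) > 1

print(looks_mixed("p\u0430ypal"))  # second letter is Cyrillic "а"
print(looks_mixed("paypal"))
```

It does nothing about lookalikes within a single script, and a legitimately bilingual filename would trip it as well.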
I've never heard of any solution, except for when a text component is supposed to be machine-interpretable (e.g. a file extension or domain), for the computer to display its interpretation in a special way (via icon, showing the extracted extension below the name, bolding the domain, etc.).
They're not great solutions, but there don't seem to really be any alternatives either.
I don’t know if you use Android based phones. It is common for these devices to practically never get updates. I think there are a lot of users stuck on older versions of Unicode?
> Using the term "extended ASCII" on its own is sometimes criticized, because it can be mistakenly interpreted to mean that the ASCII standard has been updated to include more than 128 characters or that the term unambiguously identifies a single encoding, neither of which is the case.
Totally agree with you. I went back and saw that the author wanted an English-only forum. So yes, the service is English-only and ASCII works well in that case.