> So in this sense, the quality of the C you write is really a reflection of you...

_wugy · on Nov 28, 2016

But don't the security vulnerabilities come from poorly implemented code? These vulnerabilities are not inherent to C.

pcwalton · on Nov 28, 2016

In what commonly used language other than C and C derivatives do you regularly see use-after-free leading to remote code execution?

sidlls · on Nov 28, 2016

C makes it trivial to implement poorly, though.

(Note: I'm playing devil's advocate here to some extent. My view is that safety is important, but lack of provable safety is not some terrible Demogorgon that we should hide in fear from. I think a lot of the concern over safety is valid, but in some contexts it's just overhyped.)

junke · on Nov 29, 2016

My view is that lack of provable safety should be resolved by defensive code (runtime checks). And then, you are safe (if safety is important in your code, which it probably should be by default in a professional setting).

sidlls · on Dec 1, 2016

I agree, it is solvable by defensive code. The vast majority of the time that code is perfectly sufficient. The hundreds of thousands of embedded C programs that run every day without crashing or blowing apart from memory safety bugs, and without killing anyone, demonstrate this. I don't think people understand just how much of our world is run, quite literally, by "not provably safe" code. It's not just C and C++, either.

Which is one reason why I don't buy the "memory safety" argument as a very strong one for adopting Rust. There are other much better reasons to do so for a certain class of programming, in my opinion.

junke · on Nov 28, 2016

Vulnerabilities like buffer overflow do not happen in languages with a string type. Humans are responsible if something bad happens, but without a safety net, the outcome is worse.

ArkyBeagle · on Nov 28, 2016

C has a perfectly useable (null-terminated) string type, and there is no good reason to ever have a buffer overrun in C.

I understand that this is... obscure for some reason and I'm not saying it never happens, but let's be realistic....

Retra · on Nov 29, 2016

C has a char* type, which we call a string, but it is also the type of a pointer to a single char, which is not a string at all, and also something perfectly usable. "Ends with nul" is barely a part of C, it's more like a programmer's agreement. The language doesn't enforce it, require it, or check it. All it does is insert nul characters in literals, which is hardly enough to make a string type.

Thus if you have a to_upper(char*) function, you don't know what it takes or does without looking it up. Does it uppercase a single character or a whole string? How do you even tell what you were passed without potentially reading past the end of a buffer?

If I happen to have a pointer-to-char and pass it to a to_upper function that operates on strings, it will just write on invalid memory, because C can't distinguish between the two.
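To make the ambiguity concrete, here is a minimal sketch of one way to put the contract in the signature by passing an explicit length. The name `upper_n` is hypothetical and not from any library; the only assumption is that the caller knows how many chars it owns.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: uppercases exactly `len` chars starting at `s`.
   It never searches for a terminating NUL, so it is equally valid for
   a single char (len == 1) and for a slice of a larger buffer. */
void upper_n(char *s, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        /* Cast through unsigned char: toupper() is undefined for
           negative values other than EOF. */
        s[i] = (char)toupper((unsigned char)s[i]);
    }
}
```

With this shape, `upper_n(&buf[2], 1)` touches one char and `upper_n(buf, strlen(buf))` touches the whole string, and the difference is visible at the call site rather than hidden in documentation.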

spc476 · on Nov 29, 2016

From the signature, I would say it expects a NUL-terminated sequence of characters (a C-string) and it would modify it in place to upper case each character. C already has a standard function:

    extern int toupper(int);

(via #include <ctype.h>) that will upper case a single character. If, on the other hand, I saw:

    extern char *to_upper(const char *);

I would expect that to_upper() returns a new string (freeable via a call to free()) that is the upper case version of the given string.
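As a rough sketch of the two conventions being described, the functions below use hypothetical names (`to_upper_inplace`, `to_upper_copy`) so they can sit side by side. Neither is a standard function, and both assume the caller really does pass a NUL-terminated string.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Convention 1 (non-const argument): modify the caller's string in place. */
void to_upper_inplace(char *s)
{
    for (; *s != '\0'; s++) {
        *s = (char)toupper((unsigned char)*s);
    }
}

/* Convention 2 (const argument, pointer result): return a newly
   allocated upper-case copy. The caller releases it with free().
   Returns NULL if allocation fails. */
char *to_upper_copy(const char *s)
{
    size_t len = strlen(s);
    char *out = malloc(len + 1);
    if (out == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < len; i++) {
        out[i] = (char)toupper((unsigned char)s[i]);
    }
    out[len] = '\0';
    return out;
}
```

The `const` qualifier is the only part of this convention the compiler checks; the ownership rule for the returned pointer still lives in documentation.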

> If I happen to have a pointer-to-char and pass it to a to_upper function that operates on strings, it will just write on invalid memory, because C can't distinguish between the two.

Um ... how do you "happen" to have a pointer-to-char? And unknowingly call to_upper()? I'm lost as to how this can happen ...

Retra · on Nov 29, 2016

The signature doesn't tell you that. If my API said

    int frobnicate(char*)

and you make that kind of assumption, then your code may or may not work, depending on what the function does internally. You simply do not know whether I am operating on null-terminated char sequences or a single char.

>Um ... how do you "happen" to have a pointer-to-char?

    char* text = "some text";
    char* c = text[2]

There you go.

>And unknowingly call to_upper()?

Who said anything about unknowingly calling a function? It's "toupper", not "string_to_upper" or "char_to_upper". The function signature simply doesn't tell you what the function requires of its input.

PS: char* is also a pointer-to-byte in C.

spc476 · on Nov 30, 2016

Your response to me shows you don't program in C all that much. I ran your code example through a C compiler and got:

    a.c:2: warning: initialization makes pointer from integer without a cast
    a.c:2: error: initializer element is not constant

What you really want is:

    char * text = "some text";
    char * c    = &text[2];

which still doesn't prove your point because c is still pointing to a NUL-terminated string.
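A small sketch of that claim, wrapped in a hypothetical function `tail_length` so it can be compiled on its own: a pointer into the middle of a string literal is still followed by the literal's terminating NUL, so string functions work on it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical demo function: takes the address of the third char of
   a literal and measures the string that starts there. */
size_t tail_length(void)
{
    const char *text = "some text";
    const char *c = &text[2];   /* points at the 'm' */
    return strlen(c);           /* length of "me text" */
}
```

Here `tail_length()` returns 7, the length of "me text". This holds only because the terminator is still in place; nothing in the type of `c` records that fact.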

If frobnicate() really takes a single character, I might ask why the function requires a pointer to char for a single character instead of:

    int frobnicate(char);

but if you are going to really argue that point, so be it. Discard the fact that in idiomatic C, a char * is generally considered a NUL-terminated string (and here I'm talking ANSI C and not pre-ANSI C where char * was used where void * is used today).

You are also shifting the argument, because in your original comment I replied to, the function you gave was to_upper(). toupper() is an existing C function.

P.S. char * is a pointer-to-character, not a "pointer-to-byte", pedantically speaking. All the C standard says is that a 'char' is, at minimum, 8 bits in size. It can be larger. Yes, there are current systems where this is true.
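For anyone who wants to check their own platform, a minimal sketch using the standard `CHAR_BIT` macro from `<limits.h>` (the function name `char_bits` is just for illustration):

```c
#include <limits.h>

/* Hypothetical helper: reports how many bits a char has on this
   implementation. The standard requires CHAR_BIT to be at least 8. */
int char_bits(void)
{
    return CHAR_BIT;
}
```

On mainstream desktop and server targets this is 8. `sizeof(char)` is 1 by definition on every implementation, whatever the bit width.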

Retra · on Dec 2, 2016

A single typo doesn't tell you anything about my programming habits.

>which still doesn't prove your point because c is still pointing to a NUL-terminated string.

No, it's pointing at a char that happens to be part of a nul-terminated string. The semantic intent of that distinction is entirely lost because C fails to make a distinction. I could easily overwrite that nul, and it would no longer be the case. Then it's suddenly an array of chars, and everything pointing at it is now a new type of thing.

    char* s = (char*) rand();

This also will point at a 'nul terminated string' with very high probability. Doesn't mean it is safe to call string functions on it...
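The point about the terminator being overwritable can be sketched as follows. The helper `is_terminated` is hypothetical; it assumes the caller knows the capacity of the buffer, which is exactly the information a bare char* does not carry.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true only if a NUL byte occurs within the
   first `cap` bytes of `buf`. Unlike strlen(), memchr() never reads
   past the stated capacity. */
bool is_terminated(const char *buf, size_t cap)
{
    return memchr(buf, '\0', cap) != NULL;
}
```

For a four-byte buffer holding "abc", the check passes; overwrite the last byte with 'd' and the same pointer, with the same type, now fails it.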

>I might ask why the function requires a pointer to char for a single character instead of int frobnicate(char)

You could say the same about any pointer argument. Obviously pointers are useful for a reason. If frobnicate returned a char, I would just end up dereferencing a pointer to stick it back in the string it came from. Whether that is frobnicate's job or its caller's job is a matter of API design, and should not be determined by C, especially when it makes no preference for any other kind of pointer.

>You are also shifting the argument, because in your original comment I replied to, the function you gave was to_upper

My arbitrary example function name doesn't matter one iota. Get over it, and stop being needlessly dense.

ArkyBeagle · on Nov 29, 2016

This is all true.

So don't do that.

Retra · on Nov 29, 2016

Don't worry about me, I never make any mistakes. I'm a true C programmer: I believe that "implement a good string type" is an unsolved problem and that the last 50 years never happened.

sidlls · on Nov 28, 2016

Your first statement is pretty false, even in Rust (for example). Unless you mean something else by "buffer overflow" than I'm accustomed to.

junke · on Nov 28, 2016

You are right, "do not happen" sounds too much like "will never happen". See also Wikipedia's entry about that example[0]. My point is that if the programmer can't prove accesses are always within appropriate bounds, there should be a runtime check. That is simple. This is not "slow" (and even in the case where you need it fast and are OK with random crashes, avoiding checks should be explicit). And some languages do it by default and make it really hard to mess with memory.

[0] https://en.wikipedia.org/wiki/Buffer_overflow#Choice_of_prog...
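A minimal sketch of the kind of runtime check meant here, written in C with a hypothetical helper `checked_get`. It assumes the caller passes the array length alongside the pointer, since C does not track it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: reads a[i] only when i is within [0, len).
   On success stores the element in *out and returns true; otherwise
   leaves *out untouched and returns false. */
bool checked_get(const int *a, size_t len, size_t i, int *out)
{
    if (i >= len) {
        return false;   /* out of bounds: refuse instead of reading */
    }
    *out = a[i];
    return true;
}
```

The cost is one comparison and a branch per access, which a compiler can often remove when it can prove the index is in range.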

sidlls · on Nov 28, 2016

Well, yes, I agree in general bounds should be checked at runtime when it isn't possible to statically verify access at compile time.

I'm not sure how default access in C or C++ isn't explicitly avoiding checks. By definition "a[b]" is an unchecked dereference. It doesn't get more explicit than "by definition." Of course if by "explicit" you mean "syntax exists that demarcates unchecked access" then C and C++ will never satisfy. I'd argue that's a contrived and artificially narrow use of "explicit" meant, er, explicitly to exclude C and C++ from being acceptable by definition and therefore not terribly fair.

junke · on Nov 28, 2016

I mean something like Ada's pragmas: https://en.wikibooks.org/wiki/Ada_Programming/Pragmas/Suppre...

sidlls · on Nov 28, 2016

Yes (Rust's "unsafe" blocks serve the same purpose), and my point is you're narrowing the definition of "explicit" to exclude C or C++ by definition. And that isn't exactly fair, in my view.

junke · on Nov 29, 2016

There is no doubt that C, by definition, opts out of bounds checking. But if bounds were always checked by default (implicitly), then you would have to opt out explicitly, which is a safer approach: all else being equal, in case of a programming mistake, the code ends up not being vulnerable to that specific kind of attack.
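To illustrate "checked by default, opt out explicitly" in C terms, here is a rough sketch. The `IntSlice` type and both accessor names are hypothetical; the idea is only that the short, obvious name is the checked one and the unchecked path has to be asked for by name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical slice type: a pointer that carries its own length. */
typedef struct {
    int *data;
    size_t len;
} IntSlice;

/* Default accessor: always bounds-checked. An out-of-range index
   stops the program instead of reading stray memory. */
int slice_get(IntSlice s, size_t i)
{
    if (i >= s.len) {
        fprintf(stderr, "index %zu out of range (len %zu)\n", i, s.len);
        abort();
    }
    return s.data[i];
}

/* Explicit opt-out: the name states that the caller has taken
   responsibility for proving i < s.len. */
int slice_get_unchecked(IntSlice s, size_t i)
{
    return s.data[i];
}
```

A mistaken index passed to `slice_get` becomes a crash rather than a potential exploit, and every use of `slice_get_unchecked` is easy to find in review.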