Re: Undefined behaviour [was Re: The D Programming Language]
 
"Andrei Alexandrescu (See Website For Email)"
<SeeWebsiteForEmail@erdani.org> writes:
David Abrahams wrote:
"James Kanze" <james.kanze@gmail.com> writes:
Ian McCulloch wrote:
Right - and there are systems that already do this.  Valgrind
(http://www.valgrind.org/) springs to mind at this point.  In the face of a
programming error, you want as much `undefined' behaviour as possible, to
give the tools that detect such behaviour the most information possible.
Except that you only need such tools because of the undefined
behavior.  
Completely backwards.  You can only _use_ such tools because of the
undefined behavior.  You still need something to detect the incorrect
logic that in C++ would have caused UB and in Java causes
who-knows-what "defined" behavior.  But no such tool exists, or can
exist.
I'm not sure I figure the logic. Java statically disallows a number
of programs that in C++ are allowed and would be correct. It also
disallows statically a number of programs that in C++ are allowed,
and that are not correct. 
And there are plenty of programs with incorrect logic that is allowed
by both languages.  In C++ that incorrect logic may lead to undefined
behavior, which is detectable by tools at runtime.  In Java it cannot,
so there is nothing for such tools to detect.
Now, for a _subset_ of the latter (incorrect) programs, there are
tools to help.
What kind of tools?  You can do static analysis, like Lint does, and
look for "likely problems," but that's available to C++ as well.  Once
you've done all you can do statically, there's nothing you can do to
detect errors at runtime in a system where there are no illegal
operations.
This is my understanding of the situation. From these facts, I fail
to draw more interesting conclusions than (1) there are programs
accessible to C++ that aren't accessible to Java, and (2) C++ is
more dangerous than Java.
That's probably all true.
Limit the cases of undefined behavior to the few that show up in
Java, and you don't need valgrind.  Or rather, it doesn't help you,
That's more on target.  It can't help you.
It can't help you in the same way a kevlar glove won't help you shoot a
gun that won't explode in your hand. 
So to avoid explosions in your hand when you operate the gun
incorrectly, the manufacturer added an exhaust port that allows hot
gases to escape backwards and burn you in the face.  Okay, maybe
that's overstating the analogy: maybe instead the bullet leaves the
gun with such low velocity that it's ineffective against the attacking
polar bear.  The point is that the same kind of incorrect program
logic just leads to different negative effects.  If you want to call
the C++ effects "unsafe" and the Java effects "safe," well I agree
from a type-system point of view but from a practical P.O.V. that
doesn't matter.  Logic errors can still cost lives.
Maybe the safe gun is less powerful, but I don't know how you can
use the glove as an argument to prove anything. 
Oh, wait: tools like Valgrind aren't like a glove; they're more like
nerve endings that allow you to experience pain the moment something
goes wrong.  If all the nerves in your arm are dead and you misuse a
bandsaw, chances are better that you'll slice the entire arm off
before you notice the blood on the workbench.  How's that for an
analogy? :)
[Actually I had a shop teacher in high school who had lost all feeling
in his arm.  He used to freak kids out about tool safety by hitting
his thumb with a hammer, or so I'm told. <shiver> ]
Right.  The question is, does the elimination of UB (which, remember,
is a *response* to programming errors, not a cause) actually in and of
itself make it harder to make programming errors?  I don't see how it
could.
I think I can answer that one.
Fewer programs are allowed. 
I think that's a separate issue from whether there is UB in the
language.
So if the language designer took the right turns, many "wrong"
programs would be statically eliminated, and few "good" programs. So
by simple set theory, we could infer under these assumptions that
fewer programming errors will make it.
Fine, let's start with this language that allows fewer programming
errors as a baseline.  Do you think adding ASSERTs to a program can
help you find and eliminate program bugs?  Keep that question in the
back of your mind.
Now the language designer has a choice, for a category of operations
that are allowed statically but on which -- based on dynamic factors
such as whether two pointers refer to the same object or not -- it's
difficult to pin obvious and useful semantics, whether to choose one
semantic result arbitrarily and avoid UB, or just call that operation
"illegal."  That's essentially what you're doing when you write an
ASSERT: labelling certain operations illegal.
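
As a rough illustration of labelling an operation illegal based on a
dynamic factor, here is a minimal Java sketch.  The class AliasCheckDemo
and the method reverseInto are hypothetical names invented for this
example, and the aliasing rule is an assumption chosen only to show the
idea; it is not taken from any real library.

```java
public class AliasCheckDemo {
    // Hypothetical operation: copy src into dst in reverse order.
    // Its meaning is unclear when both arguments are the same array,
    // so that case is declared illegal rather than given a result.
    public static void reverseInto(int[] src, int[] dst) {
        if (src == dst) {
            // Explicit check instead of the `assert` keyword, which is
            // disabled unless the JVM is started with -ea.
            throw new AssertionError("src and dst must not alias");
        }
        for (int i = 0; i < src.length; ++i) {
            dst[src.length - 1 - i] = src[i];
        }
    }

    public static void main(String[] args) {
        int[] dst = new int[3];
        reverseInto(new int[] {1, 2, 3}, dst);
        System.out.println(dst[0] + " " + dst[1] + " " + dst[2]);
    }
}
```

Because the aliased call is defined as an error instead of being given
some arbitrary result, a failure points straight at the caller that
broke the rule.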
My claim is that avoiding UB by forcing one predictable result in this
case does not help debuggability, because the language (or tools) then
have no license to do that runtime detection automatically and insert
the language's own ASSERTs.  Let's just look at array bounds checking
for example.  In Java, putting a try/catch around a loop over elements
of an array without any termination check is a completely legit thing
to do.  They avoided UB in this case by picking a defined behavior.
A language that calls that operation illegal can do something more
useful for debugging when array bounds are violated.
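
The Java idiom described above might look like the following sketch.
The class BoundsDemo and the method sumUntilException are made-up names
for illustration.  The loop has no termination check and relies on the
exception that Java guarantees for an out-of-range index.

```java
public class BoundsDemo {
    // Sums every element, relying on the out-of-bounds exception
    // (not a length check) to end the loop.
    public static int sumUntilException(int[] data) {
        int sum = 0;
        try {
            for (int i = 0; ; ++i) {   // deliberately no termination check
                sum += data[i];
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            // The language guarantees this exception, so running off
            // the end of the array is legal, defined behaviour here.
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumUntilException(new int[] {1, 2, 3}));
    }
}
```

Since this program is legal, a runtime checker has no grounds to treat
the overrun as a bug, whether it was written on purpose or by accident.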
I think of it this way: let's start from an entirely safe language (no
UB). It allows a set of correct programs and another set of incorrect
programs to be written.
Now, say I start to add features exhibiting UB. Then with each feature I
create the opportunity for more programs to be written. Some of those
programs will be correct (and not expressible in the safe language), and
some of them will be incorrect (and also not expressible in the safe
language).
OK so far.
If by adding a feature with possible UB, you may increase the set of
incorrect programs more than increasing the set of correct programs.
Or vice-versa.  Actually I don't understand the sentence; it starts
with "if" but then fails to give a conclusion.  Anyway, I think this
is where your logic breaks down.  Adding a feature with possible UB,
like array indexing, does not necessarily have an effect on the
balance of possible incorrect-vs-correct programs.
-- 
Dave Abrahams
Boost Consulting
www.boost-consulting.com