[Python-Dev] Re: PEP 285: Adding a bool type

Laura Creighton lac@strakt.com
Tue, 02 Apr 2002 15:40:46 +0200


Dear Guido:

I would first like to thank you for giving us an opportunity to respond. I have spent most of the weekend thinking and writing a reply to this, and I think that this has made me a better teacher. For this I am grateful. I realise that this is rather long, but I have condensed it substantially from my efforts earlier this weekend. Thank you for taking the time to read this.

Laura Creighton


I am opposed to the addition of the new proposed type on the grounds that it will make Python harder to teach both to people who have never programmed before and to people who have. If they have no preconceived idea of Booleans, then I do not propose to need to teach it to them in an introductory lesson. There is a time to learn symbolic logic but while trying to learn how to program for the first time is not it. If, on the other hand, they do have some preconceived idea of Booleans then Python will not have what they want. They will want stricter booleans that 'behave properly'.

Maybe some day we should give these people such a type. People who do symbolic logic and make push-down automatons all day long will love it. If we implement this, not out of objects, but out of bits, and have something really sparse in memory consumption, we will cause jubilation in the NumPy community as well. But I don't want to discuss this here, I only want to discuss the new type which is proposed in this PEP. And I believe that this half-way thing, int-in-hat, that is called bool but which does not implement what a mathematician would call truth values will make it much harder for me to teach the Python language and programming in general. Not only do I not need, but I actively do not want this change which is the integer 0 and the integer 1 that have some odd printing properties.

    1) Should this PEP be accepted at all.

No.

    2) Should str(True) return "True" or "1": "1" might reduce
       backwards compatibility problems, but looks strange to me.
       (repr(True) would always return "True".)

Now we have a teaching problem. If str(True) returns anything but 'True' then I am going to have to explain this to newbies, really early on. I can't see myself claiming that 1 is the string representation of True. I can see myself explaining that int(True) is 1, or that bool(1) is True. If I say that the string representation of True is 1, then I must assert that True is just a fancier, prettier way of writing 1.

But this breaks the more common practice where str is used for the prettier way of writing things and repr is used for the uglier one. And I guarantee my students will notice this, especially if they have heard me explain why their floating point numbers are not printed the way that they expect. I don't want to have to explain that eval(repr(object)) is supposed to generate the object whenever possible, for the last thing I want newbies to be thinking about is eval(). I guess I am going to have to say that it is a wart on the language, and that we have it this way so as not to break too much existing code.

I think this wart is far uglier than the lack of a half-way-but-not-quite Boolean Truth value. But I am all for this wart rather than break the existing code.

    3) Should the constants be called 'True' and 'False'
       (corresponding to None) or 'true' and 'false' (as in C++, Java
       and C99).

I would rather the constants, if we have them, be called anything other than True or False. It is not the things that you don't know that hurt you the most in learning a language -- it is 'the things that you know that ain't so'. Learning C, from PL/1, my first problem was 'how do I write a procedure'? You see, I already knew that by definition functions returned values and procedures didn't, and so, since I didn't want to return a value, I didn't want to write a function. Learning that your basic definitions are wrong generally requires a poke from the outside. The same thing happened to me when learning Python. I knew, by definition, that attributes were not callable. Using getattr() to test whether a class had a method did not occur to me, despite poring over python docs for 2 days looking for such a thing. Finally I went after it out of the class -- and immediately posted something to python-list, saying 'This is ugly as sin. What do real people do?' I never got around to questioning whether 'attributes by definition, may not be a callable' in precisely the same way I never got around to questioning whether 'you could write a function that did not return a value'.

The problem is that everybody has some conceptual understanding of True and False. And whatever that understanding is, it is unlikely to be the one used by Python. Python does not distinguish between True and False -- Python makes the distinction between something and nothing. And I think that this is one of the reasons why well-written Python programs are so elegant. And this is what I am trying to teach people.

So I out-and-out tell people this. {} is a dictionary-shaped nothing. [] is a list-shaped nothing. 0 is an integer-shaped nothing. 0.0 is a float shaped nothing.
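The 'shaped nothing' idea can be checked directly. What follows is only a minimal sketch, written for a present-day Python 3 interpreter (an assumption; the examples in this message use the Python 2 print statement). The helper name is_something is hypothetical, and the extra empty values beyond the four listed above are added for illustration.

```python
# Each "nothing" is an empty or zero value of a different type.
nothings = [{}, [], 0, 0.0, "", (), None]
somethings = [{"a": 1}, [0], 1, 0.5, "x", (0,)]

def is_something(value):
    """Hypothetical helper: report how an if statement treats value."""
    if value:
        return "something"
    return "nothing"

for value in nothings + somethings:
    print(repr(value), "->", is_something(value))
```

Note that [0] counts as something: the list is not empty, even though the only thing in it is a nothing.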

I want to save them from the error of writing

if (bool(myDict) == True):

and if they start out believing that python only distinguishes between Something and Nothing, they mostly are ok. And to rub this point in you can do this:

>>> False = 1
>>> True = 0
>>> if False:
...     print 'Surprise!'
...
Surprise!
>>> if True:
...     print 'I am True'
... else:
...     print 'Surprise Again!'
...
Surprise Again!

This is a very nice eye-opener. It is a true joy. Watch the minds go pop! and the preconceived notions disappear. (You then let them know exactly what you will think of them if they ever do this 'for real', of course.)

    4) Should we strive to eliminate non-Boolean operations on bools
       in the future, through suitable warnings, so that e.g. True+1
       would eventually (e.g. in Python 3000 be illegal).  Personally,
       I think we shouldn't; 28+isleap(y) seems totally reasonable to
       me.

This is not an argument for allowing non-Boolean operations on bools(); this is an argument for not writing functions that return Booleans. Make them return numbers instead, so that you can use them as you did. Last month we discussed why in 1712 February had 30 days in Sweden. (See:

http://groups.google.com/groups?q=leap+year+Sweden+group:comp.lang.python.*&hl=en&selm=Xns91D08815184B2svenaxelssonbokochwe%40212.37.1.234&rnum=1

if you care, and you missed it.)

I live in Sweden. Assigning students the problem of calculating whether or not a given year is a leap year in Sweden appeals to me.

But I know students. I guarantee that I will get an isleap that returns a True or a False under the proposed new regime. And this is precisely what I do not want, which I will try to teach by assigning, next week, a program that calculates how many days are in a given year. I predict that I will get a lot of answers like this:

if year == 1712:
    days = 367
elif isSwedishLeap(year):
    days = 366
else:
    days = 365

There are many things wrong with this code. The bizarre special case of 1712 belongs in the SwedishLeap function, with the rest of the weirdnesses. Thus I will have to convince my students that it is better to write a function that does not return True or False. And this is despite the fact that I originally asked for 'is Y a leap year or not'.

This will be a necessary exercise. If True and False are in the language, I am going to have to work especially hard to teach my students that you (mostly) shouldn't use the fool things. There is almost always some value that you would like to return instead.

Having renamed SwedishLeap and fixed it to return how many leap days, instead of a bool, I have now made a different problem for myself and my students.
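A function that returns the number of leap days might look something like the sketch below. It is written for Python 3 and is deliberately simplified: the names swedish_leap_days and days_in_year are hypothetical, and only the 1712 oddity is modelled. Every other year is assumed to follow the ordinary Gregorian rule, which is not historically accurate for Sweden before its calendar change.

```python
def swedish_leap_days(year):
    """Return the number of leap days in year (0, 1 or 2).

    Assumption: apart from 1712, the Gregorian rule is applied to
    every year, which glosses over the real Swedish calendar history.
    """
    if year == 1712:
        return 2  # February had 30 days that year
    if year % 4 == 0 and (year % 100 != 0 or year % 400 == 0):
        return 1
    return 0

def days_in_year(year):
    # The count is used directly as a number; no special case needed here.
    return 365 + swedish_leap_days(year)

for year in (1711, 1712, 2000, 2002):
    print(year, days_in_year(year))
    if swedish_leap_days(year):  # any non-zero count means a leap year
        print("  leap year")
```

The special case lives in one place, the caller does arithmetic on the result, and the plain if test still answers 'is this a leap year'.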

The new improved solution for 'is this year a leap year' will be:

if bool(SwedishLeap(year)) == True:
    # the better students will say 'is' instead of '=='
    print 'yes.'
else:
    print 'no.'

Aaargh! I already see too much code like this. It's mostly written by people who come from other languages. They define their own True and False so they can do this. (And they mostly have an extra set of () as well). Right now I have the perfect fix for this. I just say 'Python does not care about True and False, only nothing or something'. You have just stolen my great weapon.

What am I going to say?

attempt 1.

Python pretends to have bools, but they are just ints in fancy hats. So you are making more comparisons than are necessary.

smart student:

But you said it is better to be explicit than implicit! And here I am explicitly performing the type coercion rather than let it happen implicitly! (or PyChecker warned me that I was making an implicit conversion!) I put in the cast so that people will know exactly what is happening!

attempt 2.

But they are really ints 'under the hood'. I was not kidding about the fancy hats! if SwedishLeap(year): is precisely what you want. You don't want to test against True at all!

smart student:

Comparisons yield boolean values. Therefore they want a Boolean value. You are just being lazy because it saves typing. In the bad old days before we had Booleans this was ok, but now that we have them we should use them! Otherwise what good are they? What should I be using them for if not for this?

attempt 3.

Got an hour? I'd like to explain signature-based polymorphism to you ...

smart student:

ha! ha! ha!

    5) Should operator.truth(x) return an int or a bool.  Tim Peters
       believes it should return an int because it's been documented
       as such.  I think it should return a bool; most other standard
       predicates (e.g. issubtype()) have also been documented as
       returning 0 or 1, and it's obvious that we want to change those
       to return a bool.

I think that operator.truth(x) should return an int because about the only thing I use it for is operator.truth(myDict). I really want the integer value. How do you propose I get it if you change things? Why make me go through the gyrations of int(bool(myDict))?
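For comparison, here is how the conversion looks on a current Python 3 interpreter, where operator.truth returns a bool. This is a sketch under that assumption and does not describe the interpreter this message was written against; my_dict is just an illustrative name.

```python
import operator

my_dict = {}
print(operator.truth(my_dict))       # truth value of an empty dict
print(int(bool(my_dict)))            # explicit conversion to an integer

my_dict["key"] = "value"
print(int(operator.truth(my_dict)))  # the same conversion via operator.truth
```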

By the way, look at the list of those things that would be changed to return a bool. Most of them are python implementations of ANCIENT C functions. They date from the time before we invented exceptions! The primitive old days when we had to test for every possible time you wouldn't want to run your code before you actually got to run it.

I don't want to go back to the days of

if aflag or (bflag and (cflag or dflag)):

either.

And all Truth testing (unless you are doing symbolic logic) reeks this way to me, of the old style I am trying to stamp out. Truth testing is just another form of type testing, and just as ugly.

Rationale

    Most languages eventually grow a Boolean type; even C99 (the new
    and improved C standard, not yet widely adopted) has one.

    Many programmers apparently feel the need for a Boolean type; most
    Python documentation contains a bit of an apology for the absence
    of a Boolean type.

So fix the docs, don't change the code! I think the fact that in python control flow structures distinguish between Something and Nothing is one of the beauties and glories of the language, and you should delete any documentation that says otherwise.

Under the proposed new scheme you will have to trade apologies for the lack of bools, for apologies for not producing real bools, only this int-in-a-new-hat hack that pretends to be a bool. This is hardly progress.
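What 'int in a new hat' means in practice can be seen in a short sketch. It assumes a Python 3 interpreter, where bool is a subclass of int; nothing here is specific to the draft PEP under discussion.

```python
# Assumes Python 3, where bool is a subclass of int.
print(isinstance(True, int))   # bool values are also ints
print(True + 1)                # arithmetic treats True as the integer 1
print(True == 1, False == 0)   # they compare equal to 1 and 0
print(str(True), repr(True))   # what differs is the printed form
print(["no", "yes"][True])     # a bool even works as a list index
```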

    I've seen lots of modules that defined
    constants "False=0" and "True=1" (or similar) at the top and used
    those.  The problem with this is that everybody does it
    differently.  For example, should you use "FALSE", "false",
    "False", "F" or even "f"?  And should false be the value zero or
    None, or perhaps a truth value of a different type that will print
    as "true" or "false"?  Adding a standard bool type to the language
    resolves those issues.

So would adding True and False to the builtins, and probably operator.truth as well, and then modifying PEP 8 saying to use the things if you actually have a need for True and False. Then you could also get a much needed word in edgewise discouraging

if bool(x) == True:

or actually using True and False much, because there is usually a better more pythonic way to do what people used to other languages are accustomed to doing with booleans. This is precisely what some people have said here: 'When I started using Python, I made True and False, but once I stopped trying to program in some other language using python, I stopped needing these things'. (see the recent post by Don Garrett <garrett@bgb-consulting.com> in this thread for an example.)

This is what I have observed as well. And I fear if you add these new types to the language people will never take this step. The existence of the types in the language will discourage them from thinking that using True and False all the time is not pythonic. It is nice that people are puzzled, wondering how all those python programmers live without a boolean type. Eventually they puzzle it out. This is not a bug, but a feature.

    Some external libraries (like databases and RPC packages) need to
    be able to distinguish between Boolean and integral values, and
    while it's usually possible to craft a solution, it would be
    easier if the language offered a standard Boolean type.

I'm one of the people who build interfaces to databases that need to distinguish this. For what it's worth, can you please not add this feature to the language? Don't do it for me ....

    The standard bool type can also serve as a way to force a value to
    be interpreted as a Boolean, which can be used to normalize
    Boolean values.  Writing bool(x) is much clearer than "not not x"
    and much more concise than

        if x:
            return 1
        else:
            return 0

Conciseness for its own sake is no virtue.

    Here are some arguments derived from teaching Python.  When
    showing people comparison operators etc. in the interactive shell,
    I think this is a bit ugly:

        >>> a = 13
        >>> b = 12
        >>> a > b
        1

    If this was:

        >>> a > b
        True

    it would require one millisecond less thinking each time a 0 or 1
    was printed.

This is the basis of our disagreement. I think that it is very, very important that much more than a millisecond be spent on this. This is a fundamental python concept, which I want to teach. This is precisely where you learn that python distinguishes between Something and Nothing, and if you have a problem seeing why this implies a > b printing as 1, then you probably have a problem with the whole concept. And making it return True is precisely what I never, ever, ever want a python learner to see.

People who come from statically declared languages have a terrible burden to overcome when they meet python. They are not used to the fact that everything is an object. They have rigid barriers in their heads between 'control statements' and 'data'. This is precisely where such conceptual barriers first begin to crumble. This is precisely the experience I want my students to have. It is precisely how I train people to give up their old ideas. You have just made my teaching job harder, not easier.

I've never had any trouble teaching anybody that if 1: means do it, and if 0: means don't. Ever. Who is it that has had such trouble? I fear they may be new to teaching and are confusing 'this is taking the students a while to learn because it is something new that they have never seen before' with 'this is taking the students a while to learn because the language is broken'. Learning that python distinguishes between Something and Nothing is a completely new idea that you are giving these people, outside of their experience. This is going to take a while to get used to. But it is going to take longer to get used to, if there is all this confusing stuff about symbolic logic, truth values, George Boole, and why the math majors in the class are all snickering and saying 'Python is a sucky language. Its implementation of booleans is so, so, lame...' thrown in as well.

If you already know what a boolean is, then chances are Python's bools are not going to behave the way you expect them to. If you don't know what a bool is, then I feel morally obligated to teach you the difference between real truth values and this int-in-a-hat kludge. In either case, I now have to spend a bunch of time teaching about booleans, something I had no desire to do before today. I want to teach that if 1: means do it and if 0: means don't. I want to teach that python makes a distinction between Something and Nothing. And I can teach that a lot faster than An Introduction to Boolean Algebra ...

    There's also the issue (which I've seen puzzling even experienced
    Pythonistas who had been away from the language for a while) 
    that if you see:

        >>> cmp(a, b)
        1
        >>> cmp(a, a)
        0

    you might be tempted to believe that cmp() also returned a truth
    value.  If ints are not (normally) used for Booleans results, this
    would stand out much more clearly as something completely
    different.

I don't think that people are confused about this because they think that cmp is returning True or False. I've had people go on believing that cmp should return one of 2 values even as I was telling them that it returned one of 3. The desire for cmp to be a two-valued comparison runs deep in many souls, and is tied to a passionate belief that comparisons are binary, not that they return True and False.
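The three-valued nature of cmp is easy to demonstrate. The built-in was removed in Python 3, so the sketch below defines a stand-in with the same name; that definition is an assumption for illustration, not the original implementation.

```python
def cmp(a, b):
    """Three-way comparison returning -1, 0 or 1 (stand-in for the old built-in)."""
    return (a > b) - (a < b)

print(cmp(13, 12), cmp(12, 12), cmp(12, 13))

# Used as a truth value, the result only says "different or not";
# the sign is what tells you which side is larger.
if cmp(12, 13):
    print("12 and 13 differ")
```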

    I don't see this as a problem, and I don't want evolve the
    language in this direction either; I don't believe that a stricter
    interpretation of "Booleanness" makes the language any clearer.

I think that a really strict boolean might be nice to have. And if you ever write one, you will curse the day you let this hack be named bool. Replacing this int-with-a-hat with real bools will break so much code.

    Other languages (C99, C++, Java) name the constants "false" and
    "true", in all lowercase.  In Python, I prefer to stick with the
    example set by the existing built-in constants, which all use
    CapitalizedWords: None, Ellipsis, NotImplemented (as well as all
    built-in exceptions).  Python's built-in module uses all lowercase
    for functions and types only.  But I'm willing to consider the
    lowercase alternatives if enough people think it looks better.

I'd really like something other than True and False altogether. Something Nothing or Empty Full or Yin Yang, anything which people will not believe they already understand. It is so much easier to teach something which people know is brand new rather than something which is brand new but looks like something subtly different that people already believe they know.

I have this problem a lot. It is hard to teach people what floating point numbers are because their grade school teachers have taught them what fixed point decimals are so well (except for the name). If floating point numbers were traditionally printed xxx#yyy instead of xxx.yyy I do not think that I would have such difficulties. In any case the people who do not understand floating point would be aware that there is something that they do not understand, rather than blindly going off and using them to represent money.

I don't want to have to teach the pythonic meaning of True and False to people who already believe they know what True and False are. This is going to be hell. It is about as hard a thing as there is in teaching, getting around people's previous conceptions.
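The money problem takes only a few lines to show. This is a sketch for Python 3; using the standard decimal module as the contrast is my own choice of illustration, not something proposed in this thread.

```python
from decimal import Decimal

price = 0.10
total = price + price + price
print(total == 0.30)   # binary floating point cannot hold 0.10 exactly
print(repr(total))

exact = Decimal("0.10") * 3
print(exact == Decimal("0.30"))
```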

    It has been suggested that, in order to satisfy user expectations,
    for every x that is considered true in a Boolean context, the
    expression x == True should be true, and likewise if x is
    considered false, x == False should be true.  This is of course
    impossible; it would mean that e.g. 6 == True and 7 == True, from
    which one could infer 6 == 7.  Similarly, [] == False == None
    would be true, and one could infer [] == None, which is not the
    case.  I'm not sure where this suggestion came from; it was made
    several times during the first review period.  For truth testing
    of a value, one should use "if", e.g. "if x: print 'Yes'", not
    comparison to a truth value; "if x == True: print 'Yes'" is not
    only wrong, it is also strangely redundant.

This suggestion came from somebody's preconceived idea of 'what is True' and 'what does it mean for something to be True'. Now that you have posted this to python-list, you have found a whole lot more. Have you ever had such a response to a PEP? Everybody thinks that they know what True and False means, and that Python should do it his or her own way. This is what everybody's classroom looks like as well. And this is why sane teachers do not want to discuss the meaning of True if they can help it.
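The gap between 'considered true by if' and 'equal to True' can be made concrete. The sketch below assumes a Python 3 interpreter with the built-in True, and the variable names are illustrative only.

```python
values = (1, 6, 7, [], [0], None)

for x in values:
    print(repr(x), "| if x:", bool(x), "| x == True:", x == True)

# Values where the two tests disagree.
mismatches = [x for x in values if bool(x) != (x == True)]
print(mismatches)
```

Only 1 passes both tests; every other 'something' in the list is accepted by if and yet does not compare equal to True.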

I want to keep python out of the True and False business. Python cares about whether a value is Something or Nothing. This is beautiful, and better than what the other languages do.

Once again, Thank you for reading this and giving me a chance to write up my objections. You have made me a better teacher because of this.

Laura Creighton