[Python-Dev] unicode Exception messages in py2.7

Chris Barker chris.barker at noaa.gov
Fri Nov 15 01:41:35 CET 2013


On Thu, Nov 14, 2013 at 3:58 PM, Steven D'Aprano <steve at pearwood.info> wrote:

> It's not a given that the current behaviour is a bug.

I'll concede that it's not a bug unless someone said somewhere that unicode messages should work .. but that's kind of a semantic argument.

I have to say it's a very odd choice to me that it suppresses the message, rather than raising an encoding error, like what happens everywhere else the default encoding is used.

In fact, I noticed that the message can be anything that can be stringified, which makes it particularly wacky that you can't use a unicode object.
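To illustrate the point (a sketch, not part of the original exchange): the snippet below stringifies exceptions built from non-string arguments, then tries the same with a unicode object. It assumes Python 2 is running with its default ASCII encoding, in which case the explicit str() call is expected to raise UnicodeEncodeError; on Python 3 the same call simply succeeds.

```python
# Exception arguments are stringified with str(), so non-string values work.
print(str(Exception(42)))
print(str(Exception([1, 2])))

# A unicode argument with a non-ASCII character is the odd one out on
# Python 2: an explicit str() call is expected to raise UnicodeEncodeError
# there (assuming the default ASCII encoding). On Python 3 it just works.
exc = Exception(u"caf\xe9")
try:
    text = str(exc)
except UnicodeEncodeError:
    text = None  # Python 2 path: the message could not be encoded

print("str() failed" if text is None else "str() succeeded")
```

Note that this shows an explicit str() call only; the interactive traceback display is the place where the message gets dropped silently.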

> Exception messages in 2 are byte-strings, not Unicode.

well, they are anything that you can call str() on anyway...

> Trying to use Unicode instead is not, as far as I can tell, supported behaviour.

clearly not

> If the exception message cannot be converted to a byte-string, suppressing the display of the message seems like perfectly reasonable behaviour to me:

well, yes and no -- the fact is that unicode objects ARE special -- and it wouldn't hurt to treat them that way. And I'm not sure that suppressing the message when you've passed in a weird object that raises an exception when you try to convert it to a string makes sense either -- suppressing an exception is really not a good idea in general -- you really should have a good reason for it. I'm guessing that this was put in to save a lot of crashing from unicode objects, but what do I know?

Actually, when I think about it, Exceptions being raised when you call str() on something are probably pretty rare -- if you define a class with no __str__ method, you get a default string version -- there can't be many use-cases where you want to make sure no one tries to make a string out of your object...
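A small sketch of both cases, using two hypothetical classes (Plain and Unprintable are made up for illustration): one defines no __str__ and gets the default string version, the other deliberately raises from __str__, and that error propagates when the exception wrapping it is stringified.

```python
class Plain(object):
    """No __str__ defined: str() falls back to the default representation."""


class Unprintable(object):
    """Hypothetical class that refuses to be turned into a string."""

    def __str__(self):
        raise TypeError("this object has no string form")


# The default string version looks like <...Plain object at 0x...>.
default_text = str(Plain())
print(default_text)

# Stringifying an exception calls str() on its single argument, so the
# TypeError raised by Unprintable.__str__ propagates to the caller.
try:
    str(Exception(Unprintable()))
    stringified = True
except TypeError:
    stringified = False

print(stringified)
```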

> although it would be nice if a newline was used so the prompt was bumped to the next line.

yup -- that would be good.

> The point is, I'm not convinced that this is a bug at all.

OK -- to clarify the discussion a bit:

I think we all agree that this is not a fatal bug that MUST be fixed.

Is this something that could be improved, or is the current behavior the best we could have, given the limitations of strings and unicode in py2 anyway?

If it's not a desirable change, then we're done -- sorry for the noise.

If it is a desirable change, then is the benefit worth the possible breakage of code? To assess that, you need to trade off the size of the benefit against the amount of breakage.

I think it would be a pretty nice benefit

I can't see that it would cause a lot of breakage.

Any idea how we could assess how much code or how many tests are out there in the world that this would affect?

I contend that it wouldn't be much because:

If I had thought to write a test for this, I would have thought to fix my code so that it would either never use a unicode object for a message, or, like I have done in my code, encode it when passing it in to the Exception.
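The "encode it when passing it in" approach might look roughly like this. This is only a sketch: native_message is a hypothetical helper name, and UTF-8 is an assumed encoding (the right choice depends on where the message will be displayed).

```python
def native_message(msg, encoding="utf-8"):
    """Return msg as the native str type of the running interpreter.

    On Python 2 a unicode object is encoded to a byte-string; on Python 3,
    where str is already the text type, msg is returned unchanged.
    """
    text_type = type(u"")
    if str is not text_type and isinstance(msg, text_type):
        # Python 2 only: str is the byte-string type here.
        return msg.encode(encoding)
    return msg


# Encode at the point where the exception is created.
err = ValueError(native_message(u"caf\xe9 not found"))
shown = str(err)  # no UnicodeEncodeError on either major version
```

The downside is that every raise site has to remember to do this, which is exactly what is easy to forget.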

There is certainly a chance that some doctests would break, if people had not looked carefully at them -- i.e. that wanted to test that the exception was raised, but did not notice that the message didn't get through.

How many are there? who knows?

-Chris

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov


