msg59642 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2008-01-10 01:01 |
After seeing issue #1761, I realized that there's a bug in the Decimal constructor: it accepts newline-terminated strings: >>> from decimal import * >>> s = "2.3\n" >>> Decimal(s) Decimal("2.3") I think this is, strictly speaking, a bug because: (1) The IBM decimal specification explicitly disallows additional whitespace in a numeric string (see http://www2.hursley.ibm.com/decimal/daconvs.html), (2) the operation to-number is supposed only to accept numeric strings, and (3) Decimal.__new__ is currently the method that implements to-number. Is this worth fixing? This buggy behaviour might well be useful (e.g. to someone parsing a file with one Decimal per line). I'll fix it if anyone thinks it's worth it. Even if it should be fixed, I don't think this is worth backporting to Python 2.5, especially since it might break things. |
|
|
msg59643 - (view) |
Author: Guido van Rossum (gvanrossum) *  |
Date: 2008-01-10 01:04 |
I'd propose doing the same thing as int() and float(), which accept and strip leading and trailing whitespace. |
|
|
msg59644 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2008-01-10 01:20 |
I concur with Guido. The to-number operation is part of the spec. The constructor belongs to us and should match the rest of the language. |
|
|
msg59646 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2008-01-10 01:28 |
Okay: in that case, I propose: (1) adding a to_number method to the Decimal and Context class, so that we're still in full compliance with the specification, and (2) altering Decimal.__new__ to allow trailing and leading whitespace. |
|
|
msg59647 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2008-01-10 01:50 |
Aargh! It's only the Context class that needs a to_number method---it makes no sense as a Decimal method. And to_number is almost already there, in the form of Context.create_decimal. I'll work out exactly what the minimum is that needs to be done to comply with the specification, and post a patch |
|
|
msg59654 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2008-01-10 03:10 |
In the spirit of making the Decimal constructor match the rest of the language, can I propose another change: make Decimal("not a decimal") raise a ValueError. Currently, Decimal("not a decimal") either raises InvalidOperation or returns a Decimal NaN, depending on whether the InvalidOperation trap is set in the current context or not. This behaviour presumably again comes directly from the specification, but as pointed out already Decimal.__new__ doesn't need to rigidly follow the specification---we just need to make sure that the functionality of to-number is present somewhere (and create_decimal seems like a good candidate for that). It seems to me that: Decimal.__new__ shouldn't care about the current context. The fact that __new__ takes a context parameter at the moment is potentially confusing: one might reasonably expect that passing a context to __new__ would result in the Decimal being rounded to fit that context (as happens with create_decimal). In fact, the *only* reason the context argument is there because it's needed to raise InvalidOperation; if the first argument to __new__ is valid then the context is entirely unused. |
|
|
msg59655 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2008-01-10 03:29 |
I think the decimal exceptions should continue to be raised as-is. There is a well-defined and thought-out structure for the Decimal exceptions. Also, the constructor's exceptions tend to be raised in an environment where there are other decimal operations taking place -- it would be a shame to have to catch both types of exception. I spent about a month working on this module and put a decent amount of thought into integrating the API. I would prefer to not revisit all of those decisions one at a time. Instead, let's graft on any new magic methods and fix straight-out bugs. Having lots of little microscopic API changes will harm more than it will help. Why introduce incompatabilities for no gain. I'm inclined to dismiss this whole bug report. It doesn't help anything to muck with the existing choices about whitespace handling. Recommend closing this as yagni / don't-care. |
|
|
msg59660 - (view) |
Author: Facundo Batista (facundobatista) *  |
Date: 2008-01-10 12:09 |
I think that the Spec disallows additional whitespace to not allow, for example, "2. 34" as "2.34", or "10 e-12". I don't see any harm in having " 2.34" or "5.73\n" as good input values, as long we remove those characters at both ends. I propose to just make a .strip() at the input string, change the documentation to reflect this, and that's all. The doc says: "If value is a string, it should conform to the decimal numeric string syntax:" We can put: "If value is a string, it should conform to the decimal numeric string syntax after being stripped on both ends:" |
|
|
msg59687 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2008-01-11 01:13 |
Allowing leading and trailing whitespace causes failures in Cowlishaw's test suite. I guess Raymond's right: this bug report should be dismissed. It still bothers me a little that there isn't a strictly conforming implementation of to-number in decimal, but I can live with it. |
|
|
msg59688 - (view) |
Author: Guido van Rossum (gvanrossum) *  |
Date: 2008-01-11 01:16 |
> Allowing leading and trailing whitespace causes failures in Cowlishaw's > test suite. I guess Raymond's right: this bug report should be > dismissed. It still bothers me a little that there isn't a strictly > conforming implementation of to-number in decimal, but I can live with it. Not sure what you mean. Can't you fix the code to reject the trailing \n? Or are we talking about something else now? |
|
|
msg59689 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2008-01-11 01:49 |
I can certainly fix the Decimal constructor to reject trailing newlines, if that's what people want, but that doesn't fit with the proposal to accept and strip leading and trailing whitespace. The other option that maintains full compliance with the specification is to do what we like with Decimal.__new__ (e.g. allowing leading and trailing whitespace), but make sure that there's a fully conforming to-number elsewhere in the Decimal module. I'm happy to make either of the above fixes, or to do nothing (which leaves us with a slightly buggy implementation of the specification---but probably not so that anyone would notice). I'll leave the decision of which of these three options is most desirable to the more seasoned developers. |
|
|
msg59690 - (view) |
Author: Guido van Rossum (gvanrossum) *  |
Date: 2008-01-11 01:55 |
I'd say that accepting a trailing \n is a bug that should be fixed. I think that ideally we'd be allowed to specify whitespace around the value. I am always annoyed at programs that require me to type e.g. an integer and don't let me put spaces before and/or after (which often happen due to copy/paste). See also bug #1779 BTW. |
|
|
msg59703 - (view) |
Author: Facundo Batista (facundobatista) *  |
Date: 2008-01-11 11:26 |
Mark Dickinson: > The other option that maintains full compliance with the > specification is to do what we like with Decimal.__new__ > (e.g. allowing leading and trailing whitespace), but > make sure that there's a fully conforming to-number > elsewhere in the Decimal module. I'm +1 to be more permissive in the __new__ regarding spaces, tabs, and even newlines. I'm -0 to add an special module that does not allow this. I don't see the value of it more than be compliant to the standard in that particular sentence. Guido van Rossum: > I'd say that accepting a trailing \n is a bug that should > be fixed. > > I think that ideally we'd be allowed to specify whitespace > around the value. I am always annoyed at programs that require Do you want to be able to do Decimal("3 ") but not Decimal("3\n")? That confused me, I always read "whitespace" here as all these characters, as in the string module: >>> string.whitespace '\t\n\x0b\x0c\r ' To be more practical about my point, I'm +1 to do a .strip() in __new__ before parsing the number. |
|
|
msg59723 - (view) |
Author: Guido van Rossum (gvanrossum) *  |
Date: 2008-01-11 17:30 |
> Do you want to be able to do Decimal("3 ") but not Decimal("3\n")? I want either both or none, with a slight preference for both but only if it can be done without breaking the spec. The status quo is that "3 " is refused but "3\n" is accepted; that seems wrong. |
|
|
msg59770 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2008-01-12 01:15 |
Here's a patch that alters the Decimal constructor to allow leading and trailing whitespace. The Context.create_decimal method should now be a fully conforming implementation of to-number: it doesn't accept any leading or trailing whitespace. |
|
|
msg59771 - (view) |
Author: Raymond Hettinger (rhettinger) *  |
Date: 2008-01-12 01:30 |
Please apply the patch and close the bug. Thx |
|
|
msg59776 - (view) |
Author: Mark Dickinson (mark.dickinson) *  |
Date: 2008-01-12 01:56 |
Committed, revision 59929. |
|
|