msg50131 - (view) |
Author: John J Lee (jjlee) |
Date: 2006-04-29 16:43 |
The patch takes advantage of the exception hierarchy reorganisation to remove some ugly code in cookielib. It clarifies a couple of exception messages, too. |
|
|
msg50132 - (view) |
Author: Brett Cannon (brett.cannon) *  |
Date: 2006-04-29 21:12 |
Logged In: YES user_id=357491 Overall the patch looks fine (on vacation so not up for applying and handling any possible failures so not going to assign to myself). But a question and a suggestion. Why were the error strings changed to use the repr instead of the string representation? What does it buy you? And if you are going to be changing the function name, you might want to consider using a leading underscore to prevent people from using it or getting exported. Otherwise I would define __all__ for the module. |
|
|
msg50133 - (view) |
Author: John J Lee (jjlee) |
Date: 2006-04-30 00:27 |
Logged In: YES user_id=261020 I changed the exception detail strings to use %r to get quotes around the filename and quoted "bad" line read from the file. That makes it clearer what is part of the explanatory English text and what is part of the filename (or part of the quoted bad line, as the case may be). Filenames can and do contain spaces, commas, etc. Your other point stunned me a bit. I don't think it had ever even really *occurred* to me that stdlib users might consider stdlib module globals that are not documented as public. Ironically, I think that's because the code from which cookielib derives is much stricter about this, all modules starting with '_' and package __init__ exporting a short list of names -- I guess I thought I was following stdlib conventions by *not* adding initial underscores all over the place. Looking at some other stdlib code, I see that underscores would have been more conventional after all. Searching for reassurance, I discovered this from one of your old python-dev summaries that confirms that undocumented stdlib module globals are not considered part of the module public interface: http://www.python.org/dev/summary/2004-07-16_2004-07-31/#use-the-docs-to-know-what-the-public-api-is-people e.g. from Tim Peters: """ As you noted later, it wasn't part of keyword's documented interface, and you *always* act at your own risk when you go beyond the docs. """ However, I don't see that this is explicitly documented, which seems unfortunate to me (even though Tim's statement is true regardless of any convention Python might have). So, I guess I should: 1. Write something explicit about this (along the lines of "Use undocumented module globals at your own risk") for the stdlib library docs -- perhaps starting from Tim's post -- and submit that as a doc patch. 2. Leave all module global names in cookielib unchanged (so people using those functions don't suffer gratuitous breakage, even though any such people are asking for trouble in the long run). However, in the thread above, Michael Hudson disagrees with that, and suggests all such module globals be renamed. So suggestions are welcome here on the best course of action. 3. As you suggest, submit a patch to add an __all__ to cookielib. |
|
|
msg50134 - (view) |
Author: Neal Norwitz (nnorwitz) *  |
Date: 2006-04-30 00:37 |
Logged In: YES user_id=33168 John, at this point (2.x) we should at least do no worse. Don't export unnecessary vars in any new code. We should also start documenting the situation and work towards improving it. For 3k, we should do better and solidify the rules and do massive cleanup (module by module). This will probably involve some arm twisting of Guido. :-) |
|
|
msg50135 - (view) |
Author: John J Lee (jjlee) |
Date: 2006-04-30 01:22 |
Logged In: YES user_id=261020 So, you (Neil) agree with my three numbered action points below? You repeat my suggestion that we document this as if it were a new suggestion; did you read my comment? Sigh, sorry for being a little grumpy about this, but it's hard to "do no worse" for a project if that project doesn't seem itself to be very sure what it considers "worse": While I must say I *agree* with you that such practices are not good, if I as somebody apparently unusually inclined to heavy use of underscores (even in most of my module names, in library code) actually thought, however foolishly, that I was *following stdlib conventions* by using *fewer* underscores (for reasons I'll try to refrain from debating further here), it does indeed seem pretty clear we're in need of explicit documentation on this! So your advice to "do no worse" is a little annoying at this point... :-) OK, so, what should get documented, specifically? And where should documentation for module authors go? a). Stdlib module authors should always use underscores for non-public module globals. b). Don't know about this one: should non-legacy stdlib modules (viz, those that follow rule a)) define __all__? (perhaps a point against doing this is that it may encourage import * ?). c). Stdlib packages should use __init__ to export public names. d). Any discrepancy between __all__ and the API documentation is a bug. e). Stdlib users should assume all non- (I guess this should be go somewhere near the library reference introduction) Finally, how about my point 2? Should I add underscores to cookielib module globals I consider non-public (== all undocumented module globals), or not? Thanks for the feedback, both of you! |
|
|
msg50136 - (view) |
Author: John J Lee (jjlee) |
Date: 2006-04-30 01:26 |
Logged In: YES user_id=261020 Bleh. e). Stdlib users should assume all undocumented module globals are not part of the public API (I guess this should be go somewhere near the library reference introduction) |
|
|
msg50137 - (view) |
Author: John J Lee (jjlee) |
Date: 2006-04-30 01:31 |
Logged In: YES user_id=261020 Hmm, perhaps by "do no worse" you simply meant not to rename the function in this tracker item to a name not beginning with an initial underscore (since that would introduce a new non-public module global that does not begin with an underscore). In which case, sorry for the rant. :-) My questions after the rant still stand. ;-) |
|
|
msg50138 - (view) |
Author: Brett Cannon (brett.cannon) *  |
Date: 2006-04-30 02:23 |
Logged In: YES user_id=357491 You should use an underscore if you don't have an __all__ defined. This is mostly for protection for ``from cookielib import *`` code. But if you define an __all__ I think you are fine. You do not need to document that undocumented globals should not be relied upon. Yes, people should know better (I personally got nailed by the Debian folk for an undocumented function in site.py that I changed the parameters of). But the suggestions Neal and I are making are to protect people who do their doc checking from the command-line and thus just do ``import cookielib; dir(cookielib)``. I know Python is for use by adults, but sometimes going a small step to protect the adolescents is also okay. =) |
|
|
msg50139 - (view) |
Author: John J Lee (jjlee) |
Date: 2006-04-30 14:35 |
Logged In: YES user_id=261020 I still think there should be documented guidelines on this for both stdlib users and contributors (though they should not live in the cookielib docs, of course!). I've added a new patch to this tracker item, cookielib_baseexception_2.patch, that adds an __all__ and uses the name _warn_unhandled_exception (in addition to the original patch contents). I verified that the set of documented module globals is identical to those listed in the new __all__. Let me know if I should also rename all other non-public module globals defined in cookielib to have initial underscores. Just for the record, here is the list of all module globals NOT listed in the new __all__ (as determined by doing dir(cookielib) prior to applying the patch, and removing the items I didn't add to __all__, so includes some noise from things like 'import sys'): ['Absent', 'DAYS', 'DEFAULT_HTTP_PORT', 'EPOCH_YEAR', 'ESCAPED_CHAR_RE', 'HEADER_ESCAPE_RE', 'HEADER_JOIN_ESCAPE_RE', 'HEADER_QUOTED_VALUE_RE', 'HEADER_TOKEN_RE', 'HEADER_VALUE_RE', 'HTTP_PATH_SAFE', 'IPV4_RE', 'ISO_DATE_RE', 'LOOSE_HTTP_DATE_RE', 'MISSING_FILENAME_TEXT', 'MONTHS', 'MONTHS_LOWER', 'STRICT_DATE_RE', 'TIMEZONE_RE', 'UTC_ZONES', 'WEEKDAY_RE', '__builtins__', '__doc__', '__file__', '__name__', '_str2time', '_threading', '_timegm', 'copy', 'cut_port_re', 'debug', 'deepvalues', 'domain_match', 'eff_request_host', 'escape_path', 'http2time', 'httplib', 'is_HDN', 'is_third_party', 'iso2time', 'join_header_words', 'liberal_is_HDN', 'logging', 'lwp_cookie_str', 'month', 'offset_from_tz_string', 'parse_ns_headers', 're', 'reach', 'request_host', 'request_path', 'request_port', 'split_header_words', 'sys', 'time', 'time2isoz', 'time2netscape', 'timegm', 'unmatched', 'uppercase_escaped_char', 'urllib', 'urlparse', 'user_domain_match', 'vals_sorted_by_key', 'warn_unhandled_exception'] |
|
|
msg50140 - (view) |
Author: John J Lee (jjlee) |
Date: 2006-04-30 14:38 |
Logged In: YES user_id=261020 (...and the new patch makes a tiny fix to a slightly-inaccurate statement in the module docstring) |
|
|
msg50141 - (view) |
Author: Georg Brandl (georg.brandl) *  |
Date: 2006-05-08 17:48 |
Logged In: YES user_id=849994 Applied in rev. 45940. |
|
|