msg166247 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-07-23 20:07 |
This issue is to merge the Doc/ACKS.txt and Misc/ACKS files as discussed here: http://mail.python.org/pipermail/python-dev/2012-July/121096.html |
|
|
msg166248 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-07-23 20:32 |
I would be happy to prepare a patch. I can upload a script to this issue that the committer can then run on the latest Misc/ACKS and Doc/ACKS.txt. The script would preserve the ordering of Misc/ACKS. It would iterate through the names in Doc/ACKS.txt and insert them in Misc/ACKS at the appropriate location. Duplicates would not be inserted. |
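A minimal sketch of the insertion step described above, not the attached script itself. The function names `merge_names` and `last_name_key` are hypothetical, and ordering by lower-cased last name is an assumption made only for illustration. A linear scan is used instead of a binary search because the existing list is only roughly ordered.

```python
def last_name_key(name):
    """Hypothetical ordering key: last word first, then the full name."""
    words = name.casefold().split()
    return (words[-1], name.casefold())


def merge_names(misc_names, doc_names, key=last_name_key):
    """Return a copy of misc_names with the names from doc_names inserted.

    Existing entries keep their relative order, and names that are
    already present (exact match) are not inserted again.
    """
    merged = list(misc_names)
    seen = set(merged)
    for name in doc_names:
        if name in seen:
            continue
        # Insert before the first existing entry that sorts after the new name.
        for index, existing in enumerate(merged):
            if key(existing) > key(name):
                break
        else:
            index = len(merged)
        merged.insert(index, name)
        seen.add(name)
    return merged
```

Because only insertions are made, a diff of the result against the original list consists purely of added lines.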
|
|
msg166249 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-07-23 20:41 |
Georg, do you think this is ok for all 3 branches? |
|
|
msg166251 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2012-07-23 22:29 |
This was indeed proposed once or twice before; I can’t search my archive right now but I think I remember Georg saying that he was OK as long as the docs displayed Misc/ACKS. This means checking the rst syntax of Misc/ACKS and using the right include directive. |
|
|
msg166260 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-07-24 02:41 |
Attached is a script that seems to do the job (except for the rst formatting, which can be added later; leaving it out for now lets you see from the diff what has changed). In the process of doing this, I found that Jeff McNeil is far out of order in Misc/ACKS, and possibly also Hugo Lopes Tavares and Xavier de Gaye, depending on what alphabetization rules should be used. The script contains logic to collect the non-ascii characters that appear in people's names, so that non-ascii characters can be approximated by ascii characters for ordering purposes (which seems to be how it is done now in some cases). In a subsequent comment, I will attach a diff that results from running the script, so you can see what effect it has on Misc/ACKS. |
|
|
msg166261 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-07-24 02:44 |
Attaching sample output of running the script. |
|
|
msg166281 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-07-24 12:01 |
I created a new issue 15439 for including the combined Misc/ACKS into the documentation (as Éric mentioned) because the nature of that discussion is different, and because the changes will be easier to observe and understand if committed separately. |
|
|
msg166291 - (view) |
Author: R. David Murray (r.david.murray) *  |
Date: 2012-07-24 13:45 |
I'm not clear if your script is trying to do this, but there is no way to automatically alphabetize the file. That's why it says "rough" alphabetic order. The issue is that different languages alphabetize different letters in different places. We try to respect the alphabetization of the source language as much as practical...which means there is no algorithm that can do the sorting, since the names in the file do not carry explicit language information. |
|
|
msg166294 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-07-24 14:02 |
Well, the script output looks good (apart from a few duplicates which can be resolved by hand, e.g. "Terry Reedy" vs. "Terry J. Reedy"). |
|
|
msg166295 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-07-24 14:15 |
I did think through those issues and made a special effort to address them in the script. For starters, the script does not change the order of any names in Misc/ACKS. This is to preserve the existing rough alphabetical ordering, and to ensure that the diff consists only of insertions (for easier manual checking, if desired).

As for inserting new names in rough alphabetical order, I dealt with different language characters as follows. The script has a translation table to map non-ascii characters to ascii characters for sorting purposes. Currently, that table is as follows (I'm not sure if all of these characters will render on the page):

    NON_ASCII = "ÅÉØáäåæçéëíñóôöùúüćęŁńŽКМСабгекнорш“”"
    ASCII_SUB = 'AEOaaaaceeinooouuuceLnZKMCabrekhopw""'

This mapping can easily be modified if my initial choices are not the best. As an early step, the script collects all non-ascii characters that appear in all names to make sure the translation table is up to date (exiting with a message otherwise).

When I said "Jeff McNeil" is out of order, that was because the name appears after "Jeff Epler" but before "Tom Epperly". The script maintains a list of "out of order" names like this to skip when inserting, to prevent insertions from being out of rough alphabetical order.

If different languages use a different ordering on the word level, the script will not handle that, however. It only orders lexicographically by last name, and then first name(s). Much of this information is spelled out in the script's docstring. |
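The translation table and the "last name, then first name(s)" ordering could be sketched as follows. The two strings are copied from the message above; the helper names `unmapped_characters` and `sort_key`, and the lower-casing of names, are assumptions for illustration rather than the script's actual code. Names are assumed to be non-empty.

```python
NON_ASCII = "ÅÉØáäåæçéëíñóôöùúüćęŁńŽКМСабгекнорш“”"
ASCII_SUB = 'AEOaaaaceeinooouuuceLnZKMCabrekhopw""'

# str.maketrans requires both strings to have the same length.
TRANSLATION = str.maketrans(NON_ASCII, ASCII_SUB)


def unmapped_characters(names):
    """Return the non-ascii characters in names that the table does not cover."""
    return sorted(
        {ch for name in names for ch in name
         if ord(ch) > 127 and ch not in NON_ASCII}
    )


def sort_key(name):
    """Order by ascii-approximated last name, then first name(s)."""
    ascii_name = name.translate(TRANSLATION).lower()
    *first, last = ascii_name.split()
    return (last, " ".join(first))
```

A script built this way could call `unmapped_characters` first and exit with a message if the result is non-empty, which matches the "make sure the translation table is up to date" step described above.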
|
|
msg166296 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-07-24 14:20 |
That is correct, Antoine. Duplicates need to be removed by hand. To assist in this process, the script currently prints "possible duplicates" to stdout after running. However, the script could easily be modified to display an in-line indicator before possible duplicates to make this manual step easier, e.g.:

     John Redford
     Terry Reedy
    +>>> Terry J. Reedy
     Gareth Rees

Currently, possible duplicates are determined based on whether the last name matches an already existing last name. |
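The last-name check and the in-line indicator might look like the sketch below. The function name `flag_possible_duplicates` is hypothetical, and taking the last whitespace-separated word as the last name is a simplifying assumption; every name is assumed to be non-empty.

```python
def flag_possible_duplicates(existing, merged, marker=">>> "):
    """Prefix newly inserted names whose last name already exists.

    existing: names that were in the file before merging
    merged:   names after merging, in file order
    """
    existing_names = set(existing)
    existing_last_names = {name.split()[-1] for name in existing}
    flagged = []
    for name in merged:
        is_new = name not in existing_names
        if is_new and name.split()[-1] in existing_last_names:
            flagged.append(marker + name)
        else:
            flagged.append(name)
    return flagged
```

This heuristic produces false positives for unrelated people who share a last name, which is why the flagged entries still need to be checked by hand.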
|
|
msg166298 - (view) |
Author: Antoine Pitrou (pitrou) *  |
Date: 2012-07-24 14:22 |
> To assist in this process, the script currently prints "possible
> duplicates" to stdout after running. However, the script could easily
> be modified to display an in-line indicator before possible duplicates
> to make this manual step easier, e.g.:
>
> John Redford
> Terry Reedy
> +>>> Terry J. Reedy
> Gareth Rees

Well, no need to be perfectionist IMO. The merging will only be done once (thrice if we count all branches :-)). |
|
|
msg166321 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2012-07-24 19:09 |
Also, if you want to do phonetic translation of non-ASCII, then абгекнор really matches abgeknor, and ш is transliterated to "sh" in English (IIUC) (to "sch" in German). But I agree that this is best done manually. What matters is what the script produces; the script certainly won't make it into Python's source code. I'm sure Chris had fun writing it. |
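For illustration only, the phonetic mapping suggested here can be expressed with a dict-based table, since `str.maketrans` accepts a mapping from single characters to replacement strings of any length. The name `CYRILLIC_TO_LATIN` is hypothetical, and the table covers just the letters mentioned in this message, using the English "sh" for ш.

```python
# One-to-one letters from the message: абгекнор -> abgeknor.
CYRILLIC_TO_LATIN = dict(zip("абгекнор", "abgeknor"))
# ш needs two ASCII letters in English transliteration.
CYRILLIC_TO_LATIN["ш"] = "sh"

TRANSLITERATE = str.maketrans(CYRILLIC_TO_LATIN)


def transliterate(text):
    """Replace the covered Cyrillic letters; leave everything else as-is."""
    return text.translate(TRANSLITERATE)
```

Letters outside the table pass through unchanged, so this is not a complete transliteration scheme.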
|
|
msg166328 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-07-24 19:48 |
Yes, I did. Even though it is throw-away. By the way, I'm taking Antoine's advice to avoid perfectionism on this. Otherwise I'd include your suggestion re: the special characters. :) |
|
|
msg166411 - (view) |
Author: Martin v. Löwis (loewis) *  |
Date: 2012-07-25 16:41 |
I don't think the docs should display Misc/ACKS. Instead, I propose the following wording: "Many people have contributed to the Python language, the Python standard library, and the Python documentation. See Misc/ACKS in the Python source distribution for a partial list of contributors." It might be useful to link "Misc/ACKS" to http://hg.python.org/cpython/file/default/Misc/ACKS (http://hg.python.org/cpython/raw-file/default/Misc/ACKS would be better if hgweb didn't serve it as application/octet-stream). |
|
|
msg166420 - (view) |
Author: Éric Araujo (eric.araujo) *  |
Date: 2012-07-25 18:55 |
We can just use :source:`Misc/ACKS` and it will create a link to hgweb (the colored HTML page, not the raw file). |
|
|
msg167588 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-08-06 22:31 |
Is this issue awaiting feedback from anyone else before it can proceed further? (Just this issue and not issue 15439 to make any adjustments to the docs.) I am attaching an updated diff after generating the script output again against the tip (modified to prefix matching last names with '>>> '). |
|
|
msg167589 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-08-06 22:36 |
For completeness, I am attaching the modified version of the script that was used to generate the latest output. |
|
|
msg170134 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-09-09 20:49 |
I was reminded of this issue by the following e-mail today: http://mail.python.org/pipermail/python-dev/2012-September/121639.html I updated the script I attached earlier to ensure that it can also be run against the names in 2.7 (attaching now as script #3). I also checked that this latest script can still be run against 3.2 and default with the names that have been added since the last time I checked. Let me know if you would like any assistance in how to run the script and what to check for, etc. |
|
|
msg170425 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-09-13 00:12 |
Just an FYI that Ezio asked Georg about this issue on IRC yesterday or the day before, and Georg said +1. |
|
|
msg170462 - (view) |
Author: Roundup Robot (python-dev)  |
Date: 2012-09-13 22:59 |
New changeset 48185b0f7b8a by Ezio Melotti in branch '3.2':
#15437, #15439: merge Doc/ACKS.txt with Misc/ACKS and modify Doc/about.rst accordingly.
http://hg.python.org/cpython/rev/48185b0f7b8a

New changeset 2b4a89f82485 by Ezio Melotti in branch 'default':
#15437, #15439: merge with 3.2.
http://hg.python.org/cpython/rev/2b4a89f82485

New changeset 76dd082d332e by Ezio Melotti in branch '2.7':
#15437, #15439: merge Doc/ACKS.txt with Misc/ACKS and modify Doc/about.rst accordingly.
http://hg.python.org/cpython/rev/76dd082d332e |
|
|
msg170465 - (view) |
Author: Ezio Melotti (ezio.melotti) *  |
Date: 2012-09-13 23:11 |
Fixed, thanks for the script! |
|
|
msg170466 - (view) |
Author: Chris Jerdonek (chris.jerdonek) *  |
Date: 2012-09-13 23:16 |
Thanks for committing, Ezio! |
|
|