Issue 16669: Docstrings for namedtuple

Created on 2012-12-12 17:31 by serhiy.storchaka, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Messages (21)

msg177381 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2012-12-12 17:31

Here are two patches which implement two different interfaces for the same feature.

In the first patch you can use the doc and field_docs arguments to specify the namedtuple class docstring and the field docstrings. For example:

Point = namedtuple('Point', 'x y',
                   doc='Point: 2-dimensional coordinate',
                   field_docs=['abscissa', 'ordinate'])

In the second patch you can use the doc argument to specify the namedtuple class docstring, and the field_names argument can be a sequence of pairs: field name and field docstring. For example:

Point = namedtuple('Point', [('x', 'abscissa'), ('y', 'ordinate')],
                   doc='Point: 2-dimensional coordinate')

Which approach is better?

Feel free to correct the documentation. I know that it needs correction.

msg177393 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2012-12-12 23:51

I don't think it is worth complicating the API for this. There have been zero requests for this functionality. Even the doc field of property() is rarely used.

msg177418 - (view)

Author: Eric Snow (eric.snow) * (Python committer)

Date: 2012-12-13 16:59

What is wrong with the following?

class Point(namedtuple('Point', 'x y')):
    """A 2-dimensional coordinate

    x - the abscissa
    y - the ordinate

    """

This seems clearer to me. namedtuple is in some ways a quick-and-dirty type, essentially a truer implementation of the intended purpose of tuple. The temptation is to keep adding functionality, but we should resist until the need is compelling. I don't see that here. While I don't have a gauge of how often people use (or would use) docstrings with namedtuple, I expect that it's relatively low given the intended simplicity of namedtuple.

msg177434 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2012-12-13 19:28

Yes, we can use the inheritance trick/idiom to specify a class docstring. But there is no way to specify attribute docstrings.

I encountered this when rewriting some C-implemented code in Python. PyStructSequence allows you to specify docstrings for a class and its attributes, but namedtuple does not.
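The gap described here can be seen in a short sketch, assuming a recent Python 3. The subclass idiom sets the class docstring, but the field keeps its generated docstring, while a struct sequence type such as time.struct_time (chosen here only as a convenient example) carries a docstring on each field:

```python
import time
from collections import namedtuple


class Point(namedtuple('Point', 'x y')):
    """A 2-dimensional coordinate."""
    __slots__ = ()  # keep instances as lightweight as the base namedtuple


# The subclass idiom controls the class docstring...
class_doc = Point.__doc__

# ...but the field descriptor still carries the generated default text.
field_doc = Point.x.__doc__

# A C-level struct sequence exposes a docstring per field.
struct_field_doc = time.struct_time.tm_year.__doc__

print(class_doc)
print(field_doc)
print(struct_field_doc)
```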

msg177470 - (view)

Author: Giampaolo Rodola' (giampaolo.rodola) * (Python committer)

Date: 2012-12-14 17:08

I don't think it is worth complicating the API for this. There have been zero requests for this functionality. Even the doc field of property() is rarely used.

+1

msg177560 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2012-12-15 20:32

I think this should be rejected and closed since the 'enhancement' looks worse to me than what we can do now.

  1. Most data attributes cannot have individual docstrings, so I expect the class docstring to list and possibly explain the data attributes.

  2. In the process of responding to #16670, I finally read the namedtuple doc. I notice that it already generates default one-line .__doc__ attributes for both the class and properties. For Point, the class docstring is 'Point(x, y)', which will often be good enough.

  3. If the person creating the class does not think this sufficient, the replacement is likely to be multiple lines. This is awkward for a constructor argument. There is a reason we put docstrings after the header, not in the header.

  4. The class docstring is easily replaced by assignment. So I would write Eric's example as

Point = namedtuple('Point', 'x y')
Point.__doc__ = '''
A 2-dimensional coordinate

x - the abscissa
y - the ordinate'''

This does not create a second new class and is not a 'trick'.

  5. The property docstrings have the form 'Alias for field number 0'. I do not consider replacing them an issue. If a true data attribute is replaced by a property, the act of replacement should be transparent. That is the point of properties. So there is no expectation that the attribute should suddenly grow a docstring; I presume that is why property docstrings are not used much. The default for named tuples gives information that is peculiarly relevant to named tuples and that should be generated automatically. As I said before, I think the prose explanation of field names belongs in the class doc.
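A runnable version of the assignment approach from point 4, as a minimal sketch assuming Python 3 (where the class __doc__ is writable). It records the generated default first, then replaces it in place without creating a second class:

```python
from collections import namedtuple

Point = namedtuple('Point', 'x y')

# The factory already generates a one-line class docstring.
default_class_doc = Point.__doc__

# Replace it by plain assignment; Point is still the same class object.
same_class = Point
Point.__doc__ = '''A 2-dimensional coordinate

x - the abscissa
y - the ordinate'''

print(default_class_doc)
print(Point.__doc__)
```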

msg177577 - (view)

Author: Eric Snow (eric.snow) * (Python committer)

Date: 2012-12-16 02:32

+1, Terry

msg177592 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2012-12-16 12:08

  1. Most data attributes cannot have individual docstrings, so I expect the class docstring to list and possibly explain the data attributes.

But almost all PyStructSequence fields have individual docstrings.

This does not create a second new class and is not a 'trick'.

Thanks for the tip.

I presume that is why property docstrings are not used much.

Indeed, only 84 of 336 Python-implemented properties have docstrings. However, this is an even larger percentage than for methods (about 8K of 43K). And 100 of 115 PyStructSequence fields have docstrings.

I think Python should have more docstrings, not fewer.

msg205249 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2013-12-04 21:41

I don't know if it's worth reopening this, but I had a need for generating docs including attribute docstrings for a namedtuple class using Sphinx, and I noticed a few things...

(1) Regarding there not being demand: There's a StackOverflow question for this with 17 "ups" on the question and 22 on the best answer: http://stackoverflow.com/questions/1606436/adding-docstrings-to-namedtuples-in-python

(2) The default autodocs produced by sphinx look dreadful (e.g. https://www.dropbox.com/s/nakxsslhb588tu1/Screenshot%202013-12-04%2013.29.13.png) -- note the duplication of the class name, the line break before the signature, and the listing of attributes in alphabetical order with useless boilerplate. Here's what I would like to produce: (though there's probably too much whitespace :-): https://www.dropbox.com/s/j11uismbeo6rrzx/Screenshot%202013-12-04%2013.31.44.png

(3) In Python 2.7 you can't assign to the __doc__ class attribute.

I would really appreciate some way to set the docstring for the class as a whole as well as for each property, so they come out correct in Sphinx (and help()), preferably without having to manually assign doc strings or write the class by hand without using namedtuple at all. (The latter will become very verbose; each property has to look like this:

@property
def handle(self):
    """The datastore handle (a string)."""
    return self[1]

)
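To make the verbosity concrete, here is a sketch of what such a hand-written class might look like for just two fields. The class body, the two-field layout and the wording of the id docstring are illustrative assumptions, not the actual Dropbox code:

```python
class DatastoreInfo(tuple):
    """Hypothetical hand-written replacement for a namedtuple class."""

    __slots__ = ()

    def __new__(cls, id, handle):
        # Store the fields positionally, as namedtuple would.
        return tuple.__new__(cls, (id, handle))

    @property
    def id(self):
        """The datastore ID (a string)."""
        return self[0]

    @property
    def handle(self):
        """The datastore handle (a string)."""
        return self[1]


info = DatastoreInfo('some-id', 'some-handle')
print(info.handle)
print(DatastoreInfo.handle.__doc__)
```

Each field costs four lines plus a blank line, against a single name in the namedtuple field list.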

msg205269 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2013-12-05 01:40

Serhiy: I am not familiar with C PyStructSequence and how an instance of one appears in Python code. I agree that more methods should have docstrings.

Guido:

  1. I posted on SO the simple Py 3 solution that replaces the previously posted wrapper solutions needed for Py 2.

  2. Much of what you do not like is standard Sphinx/help behavior that would be unchanged by Serhiy's patch. The first line for a class is always "class ()". The first line is followed by the docstring, so the class name is repeated if and only if it is repeated in the docstring (as for list, see below). The __new__/__init__ signature is given here if and only if it is in the docstring. Otherwise, one has to look down for the method. The method signatures are never on the first line. Examples:

>>> help(list)
Help on class list in module builtins:

class list(object)
 |  list() -> new empty list
 |  list(iterable) -> new list initialized from iterable's items
 ...

>>> class C:
        "doc string"
        def __init__(self, a, b): pass

>>> help(C)
Help on class C in module __main__:

class C(builtins.object)
 |  doc string
 |
 |  Methods defined here:
 |
 |  __init__(self, a, b)
 ...

  3. ?? Python 3 has many improvements and we will add more.

I am still of the opinion that property usage should be a mostly transparent implementation detail. Point classes could have 4 instance attributes: x, y, r, and theta, with a particular implementation using 0 to 4 properties. All attributes should be documented regardless of the number of properties, which currently means listing them in the class docstring. A library could have more than one implementation.

As for named tuples, I believe (without trying) that the name to index mapping could be done with __getattr__ and a separate dict. If so, there would be no property docstrings and hence no field docstrings to worry about ;-).

There have been requests for data attribute docstrings (without the bother and inefficiency of replacing a simple attribute with a property). Since such a docstring would have to be attached to the fixed attribute name, rather than the variable attribute value, I believe a string subclass would suffice, to be used as needed. The main problem is a decent syntax to add a docstring to a simple (assignment) statement.

If the general problem were solved, I would choose Serhiy's option B for namedtuple.

msg205271 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2013-12-05 04:05

On Wed, Dec 4, 2013 at 5:40 PM, Terry J. Reedy <report@bugs.python.org> wrote:

  1. I posted on SO the simple Py 3 solution that replaces the previously posted wrapper solutions needed for Py 2.

Thanks, that will give people some pointers for Python 3. We need folks to upvote it. :-)

  2. Much of what you do not like is standard Sphinx/help behavior that would be unchanged by Serhiy's patch. The first line for a class is always "class ()".

Maybe for help(), but the Sphinx docs look better for most classes. Compare my screen capture with the first class on this page: https://www.dropbox.com/static/developers/dropbox-python-sdk-1.6-docs/index.html The screen capture looks roughly like this (note this is two lines and the word DatastoreInfo is repeated -- that wasn't line folding):

class dropbox.datastore.DatastoreInfo DatastoreInfo(id, handle, rev, title, mtime)

whereas for non-namedtuple classes it looks like this:

class dropbox.client.DropboxClient(oauth2_access_token, locale=None, rest_client=None)

I understand that part of this is due to the latter class having an __init__ with a reasonable docstring, but the fact remains that namedtuple's default docstring produces poor-looking documentation.

The first line is followed by the docstring, so the class name is repeated if and only if it is repeated in the docstring (as for list, see below). The __new__/__init__ signature is given here if and only if it is in the docstring. Otherwise, one has to look down for the method. The method signatures are never on the first line. Examples:

>>> help(list)
Help on class list in module builtins:

class list(object)
 |  list() -> new empty list
 |  list(iterable) -> new list initialized from iterable's items
 ...

>>> class C:
        "doc string"
        def __init__(self, a, b): pass

>>> help(C)
Help on class C in module __main__:

class C(builtins.object)
 |  doc string
 |
 |  Methods defined here:
 |
 |  __init__(self, a, b)
 ...

Yeah, help() is different than Sphinx. (As a general remark I find the help() output way too verbose with its endless listing of all the built-in behaviors.)

  3. ?? Python 3 has many improvements and we will add more.

I am still of the opinion that property usage should be a mostly transparent implementation detail.

What does that mean?

Point classes could have 4 instance attributes: x, y, r, and theta, with a particular implementation using 0 to 4 properties. All attributes should be documented regardless of the number of properties, which currently means listing them in the class docstring. A library could have more than one implementation.

For various reasons (like consistency with other classes) I really want the property docstrings on the individual properties, not in the class docstring. Here's a screenshot of what I want:

https://www.dropbox.com/s/70zfapz8pcz9476/Screenshot%202013-12-04%2019.57.36.png

I obtained this by abandoning the namedtuple and hand-coding properties -- the resulting class uses 4 lines (+ 1 blank) of boilerplate per property instead of just one line of docstring per property.

As for named tuples, I believe (without trying) that the name to index mapping could be done with __getattr__ and a separate dict. If so, there would be no property docstrings and hence no field docstrings to worry about ;-).

I'm not sure what you are proposing here -- a patch to namedtuple or a work-around? I think namedtuple is too valuable to abandon. It not only saves a lot of code, it captures the regularity of the code. (If I have a class with 5 similar-looking methods it's easy to overlook a subtle difference in one of them.)


There have been requests for data attribute docstrings (without the bother and inefficiency of replacing a simple attribute with a property). Since such a docstring would have to be attached to the fixed attribute name, rather than the variable attribute value, I believe a string subclass would suffice, to be used as needed. The main problem is a decent syntax to add a docstring to a simple (assignment) statement.

Sphinx actually has a syntax for this already. In fact, it has three: it allows a comment before or on the class variable starting with "#:", or a docstring immediately following. Check out this documentation for the autodoc extension: http://sphinx-doc.org/ext/autodoc.html#directive-autoattribute
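The three forms look roughly like this. It is a sketch modeled on the pattern in the autodoc documentation; the class and attribute names are placeholders. Note that these are read by Sphinx from the source only: at runtime the comments are ignored and the bare string after an assignment is discarded, so help() does not show them.

```python
class Foo:
    """Docstring for class Foo."""

    #: Doc comment for class attribute Foo.bar (comment on the line before).
    bar = 1

    flox = 1.5  #: Doc comment for Foo.flox (comment on the same line).

    baz = 2
    """Docstring for class attribute Foo.baz (string right after)."""


# None of the three forms changes runtime behavior.
print(Foo.bar, Foo.flox, Foo.baz)
```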

If the general problem were solved, I would choose Serhiy's option B for namedtuple.

If you're referring to this:

Point = namedtuple('Point', [('x', 'abscissa'), ('y', 'ordinate')],
                   doc='Point: 2-dimensional coordinate')

I'd love it!

msg205277 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2013-12-05 06:25

I find the help() output way too verbose with its endless listing of all the built-in behaviors.

Then you might agree to a patch, on a separate issue. Let's set help aside for the moment.

I am familiar with running Sphinx on .rst files, but not on docstrings. It looks like the docstrings use .rst markup. (Is this allowed in the stdlib?) (The output looks good enough for a first draft of a tkinter class/method reference, which I would like to work on.)

I understand that part of this [signature after class name] is due to the latter class having an __init__ with a reasonable docstring

If dropbox.client is written in Python, as I presume, then I strongly suspect that the signature part of class dropbox.client.DropboxClient( oauth2_access_token, locale=None, rest_client=None) comes from an inspect module method that examines the function attributes other than .__doc__. If so, the DropboxClient.__init__ docstring is irrelevant to the above. You could test by commenting it out and rerunning the doc build.

The inspect methods do not work on C-coded functions (unless Argument Clinic has fixed this for 3.4), which is why signatures are put in the docstrings for C-coded objects. For C-coded classes, it is put in the class docstring rather than the class.__init__ docstring.

but the fact remains that namedtuple's default docstring produces poor-looking documentation.

'x.__init__(...) initializes x; see help(type(x)) for signature'

This is standard boilerplate for C-coded .__init__.__doc__. Raymond just copied it.

>>> int.__init__.__doc__
'x.__init__(...) initializes x; see help(type(x)) for signature'
>>> list.__init__.__doc__
'x.__init__(...) initializes x; see help(type(x)) for signature'

I will try to explain 'property transparency/equivalence' in another post, when I am fresher, and after reading the autodoc reference, so you can understand enough to agree or not. My reference to a possible alternate implementation of named tuple was part of the failed explanation of 'property transparency'. I am not proposing a change now.

msg205317 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2013-12-05 18:36

On Wed, Dec 4, 2013 at 10:25 PM, Terry J. Reedy <report@bugs.python.org> wrote:

I am familiar with running Sphinx on .rst files, but not on docstrings. It looks like the docstrings use .rst markup. (Is this allowed in the stdlib?)

I'm not sure if it is allowed, but it is certainly used plenty in some modules (perhaps those that started life as 3rd party packages).

(The output looks good enough for a first draft of a tkinter class/method reference, which I would like to work on.)

I won't stop you -- having any kind of docs for Tkinter sounds good to me!

I understand that part of this [signature after class name] is due to the latter class having an __init__ with a reasonable docstring

If dropbox.client is written in Python, as I presume,

It is.

then I strongly suspect that the signature part of class dropbox.client.DropboxClient( oauth2_access_token, locale=None, rest_client=None) comes from an inspect module method that examines the function attributes other than .__doc__.

Indeed.

If so, the DropboxClient.__init__ docstring is irrelevant to the above. You could test by commenting it out and rerunning the doc build.

Yes.

The inspect methods do not work on C-coded functions (unless Argument Clinic has fixed this for 3.4), which is why signatures are put in the docstrings for C-coded objects. For C-coded classes, it is put in the class docstring rather than the class.__init__ docstring.

Perhaps it doesn't understand __new__? namedtuple actually generates Python code for a class definition using a template and then uses exec() on the filled-in template; the template defines only __new__ though.

but the fact remains that namedtuple's default docstring produces poor-looking documentation.

'x.__init__(...) initializes x; see help(type(x)) for signature'

This is standard boilerplate for C-coded .__init__.__doc__. Raymond just copied it.

He didn't (it's not in the template). It is the dummy __init__ that tuple inherits from object (the docstring is in the __init__ wrapper in typeobject.c).

>>> int.__init__.__doc__
'x.__init__(...) initializes x; see help(type(x)) for signature'
>>> list.__init__.__doc__
'x.__init__(...) initializes x; see help(type(x)) for signature'

msg205340 - (view)

Author: Terry J. Reedy (terry.reedy) * (Python committer)

Date: 2013-12-06 00:27

I think we can now agree that docstrings other than the class docstring (used as a fallback) are not relevant to signature detection. And Raymond gave namedtuple classes the docstring needed as a fallback.

We are off-issue here, but idlelib.CallTips.get_argspec() is also ignorant that it may need to look at .__new__. An object with a C-coded .__init__ and Python-coded .__new__ is new to new-style classes. The new inspect.signature function handles such cases properly. Starting with a namedtuple Point (without the default docstring):

>>> from inspect import signature
>>> str(signature(Point.__new__))
'(_cls, x, y)'
>>> str(signature(Point))
'(x, y)'

The second is what autodoc should use. I just opened #19903 to update Idle to use signature.

msg205341 - (view)

Author: Guido van Rossum (gvanrossum) * (Python committer)

Date: 2013-12-06 00:31

It was never about signature detection for me -- what gave you that idea? I simply want to have the option to put individual docstrings on the properties generated by namedtuple.

msg205582 - (view)

Author: Ned Batchelder (nedbat) * (Python triager)

Date: 2013-12-08 16:41

I'll add my voice to those asking for a way to put docstrings on namedtuples. As it is, namedtuples get automatic docstrings that seem to me to be almost worse than none. Sphinx produces this:

class Key

    Key(scope, user_id, block_scope_id, field_name)

    __getnewargs__()

        Return self as a plain tuple. Used by copy and pickle.

    __repr__()

        Return a nicely formatted representation string

    block_scope_id None

        Alias for field number 2

    field_name None

        Alias for field number 3

    scope None

        Alias for field number 0

    user_id None

        Alias for field number 1

Why are __getnewargs__ and __repr__ included at all? They aren't useful for API documentation. The individual property docstrings offer no new information over the summary at the top. I'd like namedtuple not to be so verbose where it has no useful information to offer. The one-line summary is all the information namedtuple has, so that is all it should include in the docstring:

class Key

    Key(scope, user_id, block_scope_id, field_name)

msg205583 - (view)

Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer)

Date: 2013-12-08 16:46

Unhide this discussion.

msg205978 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2013-12-12 20:29

A few quick thoughts:

msg242096 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2015-04-27 04:04

FWIW, here's a proposed new classmethod that makes it possible to easily customize the field docstrings but without cluttering the API of the factory function:

@classmethod
def _set_docstrings(cls, **docstrings):
    '''Customize the field docstrings

       >>> Point = namedtuple('Point', ['x', 'y'])
       >>> Point._set_docstrings(x = 'abscissa', y = 'ordinate')
               
    '''
    for fieldname, docstring in docstrings.items():
        if fieldname not in cls._fields:
            raise ValueError('Fieldname %r does not exist' % fieldname)
        new_property = _property(getattr(cls, fieldname), doc=docstring)
        setattr(cls, fieldname, new_property)

Note, nothing is needed for the main docstring since it is already writeable:

 Point.__doc__ = '2-D Coordinate'

msg242106 - (view)

Author: Peter Otten (peter.otten) *

Date: 2015-04-27 10:00

Here's a variant that builds on your code, but makes for a nicer API. Single-line docstrings can be passed along with the attribute name, and with namedtuple.with_docstrings(... all info required to build the class ...) from a user perspective the factory looks like a class method:

from functools import partial
from collections import namedtuple


def _with_docstrings(cls, typename, field_names_with_doc,
                     *, verbose=False, rename=False, doc=None):
    field_names = []
    field_docs = []
    if isinstance(field_names_with_doc, str):
        field_names_with_doc = [
            line for line in field_names_with_doc.splitlines()
            if line.strip()]
    for item in field_names_with_doc:
        if isinstance(item, str):
            item = item.split(None, 1)
        if len(item) == 1:
            [fieldname] = item
            fielddoc = None
        else:
            fieldname, fielddoc = item
        field_names.append(fieldname)
        field_docs.append(fielddoc)

    nt = cls(typename, field_names, verbose=verbose, rename=rename)

    for fieldname, fielddoc in zip(field_names, field_docs):
        if fielddoc is not None:
            new_property = property(getattr(nt, fieldname), doc=fielddoc)
            setattr(nt, fieldname, new_property)

    if doc is not None:
        nt.__doc__ = doc
    return nt

namedtuple.with_docstrings = partial(_with_docstrings, namedtuple)

if __name__ == "__main__":
    Point = namedtuple.with_docstrings("Point", "x abscissa\ny ordinate")
    Address = namedtuple.with_docstrings(
        "Address",
        """
        name Surname
        first_name First name

        city
        email Email address
        """)
    Whatever = namedtuple.with_docstrings(
        "Whatever",
        [("foo", "doc for\n foo"),
         ("bar", "doc for bar"),
         "baz"],
        doc="""The Whatever class.

Example for a namedtuple with multiline docstrings for its attributes.""")

msg242121 - (view)

Author: Raymond Hettinger (rhettinger) * (Python committer)

Date: 2015-04-27 15:09

The need for this may be eliminated by issue 24064. Then we change the docstrings just like any other object with no special rules or methods.
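With writable property docstrings (Python 3.5 and later), the "no special rules or methods" approach reduces to plain assignment. A minimal sketch, reusing the Point example from earlier in the thread:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

# Class docstring and field docstrings are set the same way.
Point.__doc__ = '2-D Coordinate'
Point.x.__doc__ = 'abscissa'
Point.y.__doc__ = 'ordinate'

p = Point(3, 4)
print(Point.x.__doc__, Point.y.__doc__)
```

Field access is unaffected; only the text shown by help() and documentation tools changes.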

History

Date                 User              Action  Args
2022-04-11 14:57:39  admin             set     github: 60873
2015-05-13 08:12:42  rhettinger        set     status: open -> closed; resolution: rejected -> fixed
2015-04-27 15:09:41  rhettinger        set     messages: +
2015-04-27 15:08:21  rhettinger        set     messages: -
2015-04-27 15:00:39  rhettinger        set     messages: +
2015-04-27 14:58:04  rhettinger        set     messages: -
2015-04-27 14:56:38  rhettinger        set     messages: +
2015-04-27 10:00:17  peter.otten       set     nosy: + peter.otten; messages: +
2015-04-27 04:04:04  rhettinger        set     messages: +
2013-12-12 20:29:58  rhettinger        set     messages: +
2013-12-11 21:57:34  rhettinger        set     versions: + Python 3.5, - Python 3.4
2013-12-08 16:46:16  serhiy.storchaka  set     messages: +
2013-12-08 16:44:41  serhiy.storchaka  set     status: closed -> open
2013-12-08 16:41:49  nedbat            set     nosy: + nedbat; messages: +
2013-12-06 00:31:17  gvanrossum        set     messages: +
2013-12-06 00:27:32  terry.reedy       set     messages: +
2013-12-05 22:06:50  pconnell          set     nosy: + pconnell
2013-12-05 18:36:59  gvanrossum        set     messages: +
2013-12-05 06:25:10  terry.reedy       set     messages: +
2013-12-05 04:05:52  gvanrossum        set     messages: +
2013-12-05 01:40:20  terry.reedy       set     messages: +
2013-12-04 21:41:32  gvanrossum        set     nosy: + gvanrossum; messages: +
2013-03-08 07:26:08  rhettinger        set     status: open -> closed; resolution: rejected
2013-02-27 16:48:28  Ankur.Ankan       set     nosy: + Ankur.Ankan
2012-12-16 12:08:03  serhiy.storchaka  set     messages: +
2012-12-16 02:32:00  eric.snow         set     messages: +
2012-12-15 20:32:36  terry.reedy       set     nosy: + terry.reedy; messages: +
2012-12-14 17:08:07  giampaolo.rodola  set     nosy: + giampaolo.rodola; messages: +
2012-12-13 19:28:49  serhiy.storchaka  set     messages: +
2012-12-13 16:59:08  eric.snow         set     nosy: + eric.snow; messages: +
2012-12-12 23:51:45  rhettinger        set     priority: normal -> low; assignee: rhettinger; messages: +
2012-12-12 17:33:40  serhiy.storchaka  set     files: + namedtuple_docstrings_tuples_seq.patch
2012-12-12 17:31:35  serhiy.storchaka  create