Issue 39170: Sqlite3 row_factory for attribute access: NamedRow (original) (raw)

Currently, sqlite3 returns rows by tuple or sqlite3.Row for dict-style, index access. I constantly find myself wanting attribute access like namedtuple for rows. I find attribute access cleaner without the brackets and quoting field names. However, unlike previous discussions (https://bugs.python.org/issue13299), I don't want to use the namedtuple object. I appreciate the simple API and minimal memory consumption of sqlite3.Row and used it as my guide in creating sqlite3.NamedRow to allow access by index and attribute.

A pull request is ready

Why a new object instead of adding attribute access to the existing sqlite3.Row? There is an existing member method keys and any table with the field "keys" would cause a hard to debug, easily avoidable, collision.

Features:

Optimized in C, so it will be faster than any python implementation.
Access columns by attribute for all valid names and by index for all names.
Iterate over fields by name/value pairs.
Works with standard functions len and contains.
Identical memory consumption to sqlite3.Row with two references: the data tuple and the cursor description.
Identical speed to sqlite3.Row if not faster. Timing usually has it slightly faster for index by name or attribute, but it is almost identical.

Examples:

>>> import sqlite3
>>> c = sqlite3.Connection(":memory:").cursor()
>>> c.row_factory = sqlite3.NamedRow
>>> named_row = c.execute("SELECT 'A' AS letter, '.-' AS morse, 65 AS ord").fetchone()

>>> len(named_row)
3
>>> 'letter' in named_row
true
>>> named_row == named_row
true
>>> hash(named_row)
5512444875192833987

Index by number and range. >>> named_row[0] 'A' >>> named_row[1:] ('.-', 65)

Index by column name. >>> named_row["ord"] 65

Access by attribute. >>> named_row.morse '.-'

Iterate row for name/value pairs. >>> dict(named_row) {'letter': 'A', 'morse': '.-', 'ord': 65} >>> tuple(named_row) (('letter', 'A'), ('morse', '.-'), ('ord', 65))

How sqlite3.NamedRow differs from sqlite3.Row

The class only has class dunder methods to allow any valid field name. When the field name would be an invalid attribute name, you have two options: either use the SQL AS in the select statement or index by name. To get the field names use the iterator [x[0] for x in row] or do the same from the cursor.description.

    titles = [x[0] for x in row]
    titles = [x[0] for x in cursor.description]
    titles = dict(row).keys()

Attribute and dict access are no longer case-insensitive. There are three reasons for this. 1. Case-insensitive comparison only works well for ASCII characters. In a Unicode world, case-insensitive edge cases create unnecessary errors. Looking at a several existing codebases, this feature of Row is almost never used and I believe is not needed in NamedRow. 2. Case-insensitivity is not allowed for attribute access. This "feature" would treat attribute access differently from the rest of Python and "special cases aren't special enough to break the rules". Where row.name, row.Name, and row.NAME are all the same it gives off the faint code smell of something wrong. When case-insensitively is needed and the query SELECT can not be modified, sqlite3.Row is still there. 3. Code is simpler and easier to maintain. 4. It is faster.

Timing Results

NamedRow is faster than sqlite3.Row for index-by-name access. I have published a graph and the methodology of my testing. In the worst-case scenario, it is just as fast as sqlite3.Row without any extra memory. In most cases, it is faster. For more information, see the post at http://jidn.com/2019/10/namedrow-better-python-sqlite3-row-factory/