Type aliases without implicit conversion ("NewType") · Issue #1284 · python/mypy
Several users (@wittekm among them) have asked for a feature where you can define something that's like a type alias, in that it refers to an existing underlying type at runtime, but which isn't considered equivalent to that type in the type-checker -- like what Hack and Haskell call newtype.
One classic application of this would be aliases/newtypes of str (or unicode or bytes) to distinguish fragments of HTML, JavaScript, SQL, etc., from arbitrary text and enforce that conversions are only done with proper escaping, to prevent classes of vulnerabilities like XSS and SQL injection. A definition might look like HtmlType = NewType("HtmlType", str).
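As a rough sketch of how such a definition could be used, assuming a NewType helper with the shape that the standard typing module offers today (the function names escape_text and render_paragraph are hypothetical, chosen only for illustration):

```python
import html
from typing import NewType

# Distinct from str for the type checker, but a plain str at runtime.
HtmlType = NewType("HtmlType", str)


def escape_text(text: str) -> HtmlType:
    """Hypothetical conversion function: the one audited place that escapes."""
    return HtmlType(html.escape(text))


def render_paragraph(body: HtmlType) -> HtmlType:
    """Hypothetical consumer that only accepts already-escaped HTML."""
    return HtmlType("<p>" + body + "</p>")


user_input = "<script>alert(1)</script>"
safe = render_paragraph(escape_text(user_input))

# render_paragraph(user_input) would still run, but a type checker
# should reject it because a bare str is not an HtmlType.
```

Nothing here is enforced at runtime; the protection comes entirely from running a type checker over the code.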
Other classic uses include distinguishing identifiers of different things (users, machines, etc.) that are all just integers, so they don't get mixed up by accident.
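A minimal sketch of the identifier case, again assuming a NewType helper shaped like the one in the standard typing module; UserId, MachineId, and reboot are hypothetical names:

```python
from typing import NewType

# Two kinds of identifier that are both just ints at runtime.
UserId = NewType("UserId", int)
MachineId = NewType("MachineId", int)


def reboot(machine: MachineId) -> str:
    """Hypothetical operation that must only ever receive a machine ID."""
    return f"rebooting machine {machine}"


user = UserId(42)
machine = MachineId(7)
message = reboot(machine)

# reboot(user) would execute without error, since both are ints,
# but a type checker should flag the UserId/MachineId mix-up.
```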
A user can always just define a class, say an empty subclass of the underlying type, but if an application is handling a lot of IDs or text fragments or the like, it costs a lot at runtime for them to be some other class rather than actual str or int, so that isn't a good solution.
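To make the subclass alternative concrete, here is a small sketch with a hypothetical HtmlStr class. It does not measure the cost; it only shows that the values stop being plain str objects, that each instance carries its own attribute dictionary, and that the marker is silently lost on ordinary string operations:

```python
class HtmlStr(str):
    """Hypothetical empty subclass used only to mark HTML fragments."""


fragment = HtmlStr("<b>hi</b>")

# The value is a real instance of another class, not an actual str.
is_plain_str = type(fragment) is str

# Unlike a plain str, a subclass instance has a per-object __dict__.
has_instance_dict = hasattr(fragment, "__dict__")
plain_has_dict = hasattr("<b>hi</b>", "__dict__")

# Concatenation returns a plain str, so the marker does not survive.
combined = fragment + "<i>there</i>"
combined_is_plain = type(combined) is str
```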
The main open question I see in how this feature might work is how to provide for converting these values to the underlying type. For a feature like this to be useful there has to be some way to do that -- so it's possible to write the conversion functions that take appropriate care, like escaping text into HTML -- but preferably one that's private to a limited stretch of code, or failing that is at least easy to audit with grep. In Hack the types are implicitly equivalent just within the source file where the newtype is defined; in Haskell the newtype comes with a data constructor which is the only way to convert, and typically one just doesn't export that from the module where it's defined.
The Hack solution could work, and has the advantage that it means no run-time overhead at all, other than invoking the intended conversion functions that live in the newtype's file. It feels odd to me, though, because Python generally doesn't treat specially whether a thing is defined in a given module vs. imported into it. Another solution could be something like html = HtmlType.make(text) and text = HtmlType.unmake(html), which would follow perfectly normal Python scoping and would be reasonably auditable.
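One way the make/unmake spelling could be emulated by hand is sketched below. This is only an illustration under stated assumptions, not a proposed implementation: the HtmlType class here is a hypothetical marker that is never instantiated, and typing.cast is used so that both conversions return their argument unchanged at runtime.

```python
from typing import cast


class HtmlType:
    """Hypothetical marker type: opaque to the checker, never instantiated."""

    @staticmethod
    def make(text: str) -> "HtmlType":
        # cast() does nothing at runtime; the value stays a plain str.
        return cast("HtmlType", text)

    @staticmethod
    def unmake(html: "HtmlType") -> str:
        # The only sanctioned way back to str, easy to find with grep.
        return cast(str, html)


html_value = HtmlType.make("<b>hi</b>")
text_value = HtmlType.unmake(html_value)
```

Every conversion in either direction then appears in the source as HtmlType.make or HtmlType.unmake, which is what makes the approach auditable. The trade-off in this particular emulation is that the checker sees no str methods on an HtmlType value until it is unmade.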