The textwrap module goes to great lengths to "do the right thing" when it finds the ASCII simulation of an em-dash (two or more consecutive hyphens), but it does nothing to recognize and similarly treat true (Unicode) em-dashes (aka '\N{EM DASH}', '\u2014', or U+2014). Real em-dashes should get at least as good a treatment as simulated em-dashes.
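To make the difference concrete, here is a minimal sketch using only the public `textwrap.wrap` function. The sample strings and the width of 6 are arbitrary choices for illustration and are not taken from the report. `break_long_words=False` is set so that any chunk textwrap considers unbreakable stays in one piece.

```python
import textwrap

ascii_text = "wait--what"         # simulated em-dash: two hyphens
unicode_text = "wait\u2014what"   # true em-dash, U+2014

# The double hyphen is recognized as a break opportunity, so the
# text can be split after the dash.
ascii_lines = textwrap.wrap(ascii_text, width=6, break_long_words=False)
print(ascii_lines)    # ['wait--', 'what']

# If the interpreter has no special handling for U+2014, the whole
# string is treated as one word and cannot be split at the dash.
unicode_lines = textwrap.wrap(unicode_text, width=6, break_long_words=False)
print(unicode_lines)
```

With the default `break_long_words=True`, a string like the second one would instead be cut wherever the width limit happens to fall, rather than at the dash.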
This seems sensible to me (I haven't looked at the PR; I'm only talking about adding the support). When textwrap was written, Python was pretty ASCII-oriented, so it is not much of a surprise that Unicode em-dashes were not supported.
Agreed. It makes great sense that textwrap started as highly ASCII-centric. But in the Python 3, Unicode-friendly era, ASCII-biased isn't where we should leave things.
> Agreed. It makes great sense that textwrap started as highly ASCII-centric. But in the Python 3, Unicode-friendly era, ASCII-biased isn't where we should leave things.

It needs Unicode experts. If we support Unicode, we should implement UAX #14: http://www.unicode.org/reports/tr14/tr14-45.html

But I am not sure any core developer loves textwrap and Unicode enough to implement it. It can be implemented in a third-party package before it is added to the stdlib.

So is U+2014 really important to implement even though we cannot implement UAX #14 in the foreseeable future? It doesn't make sense to me.
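As a rough sketch of what the third-party route could look like for U+2014 alone (this is not UAX #14, and not the approach taken in the PR), a subclass can post-process the chunks that `TextWrapper` produces. The class name `EmDashTextWrapper` and the regex are hypothetical, and the sketch relies on overriding `_split`, an underscore-prefixed method that is not part of the documented API and could change between versions.

```python
import re
import textwrap

# Hypothetical pattern: one or more U+2014 characters between word-like
# characters, loosely mirroring how textwrap treats "--" between words.
_EM_DASH_RE = re.compile(r'(?<=[\w!"\'&.,?])(\u2014+)(?=\w)')


class EmDashTextWrapper(textwrap.TextWrapper):
    """Illustrative subclass that treats U+2014 as a break opportunity."""

    def _split(self, text):
        # Let the stock splitter run first, then split each chunk
        # again around true em-dashes, keeping the dash as its own chunk.
        chunks = []
        for chunk in super()._split(text):
            chunks.extend(p for p in _EM_DASH_RE.split(chunk) if p)
        return chunks


wrapper = EmDashTextWrapper(width=6, break_long_words=False)
em_lines = wrapper.wrap("wait\u2014what")
hyphen_lines = wrapper.wrap("wait--what")
print(em_lines)
print(hyphen_lines)
```

Under these assumptions the true em-dash wraps the same way as the double hyphen, with the dash staying at the end of the first line. It covers only this one character and says nothing about the rest of the line-breaking rules in UAX #14.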