[Python-3000] Set literals - another try

Talin talin at acm.org
Tue Aug 8 18:49:08 CEST 2006


Part 1: The concrete proposal part.

I noticed that a lot of folks seemed to like the idea of making the empty set resemble the Greek letter Phi, using a combination of parentheses and the vertical bar or forward slash character.

So let's expand on this: slice Phi in half and say that (| and |) are delimiters for a set literal, as follows:

(|)     # Empty set

(|a|)   # Set with 1 item

(|a,b|) # Set with 2 items

The advantage of this proposal is that it maintains visual consistency between the 0, 1, and N element cases.
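
For comparison, here is a rough sketch of how the same three cases are spelled with the existing set constructor. The values bound to a and b are placeholders chosen only for illustration, and the mapping to the (| |) forms in the comments is just my reading of the proposal above.

```python
# Placeholder values; any hashable objects would do.
a, b = 1, 2

empty = set()        # would be written (|)
one = set([a])       # would be written (|a|)
two = set([a, b])    # would be written (|a,b|)

print(len(empty), len(one), len(two))
```

Note that the constructor form switches shape between the empty case (no argument) and the others (a list argument), which is the kind of inconsistency the delimiter pair is meant to avoid.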

Part 2: The idle speculation part, not to be considered as an actual proposal.

I've often said that "whenever a programmer has the urge to invent a new programming language, they should lie down on the couch until the feeling passes".

One of the reasons for this is that, many times, a programmer's motivation in creating a new language is not that they actually need a new language, but rather that they want a means of criticising an existing language. Inventing their own language gives them the opportunity to show how they would have done it.

I think that kind of criticism can be valid, and that languages invented for this purpose can be useful, as long as you don't actually sit down and try to implement the thing.

As a thought experiment, I decided to apply this idea to the Python set literal case - i.e. if we were going to do a massive "do over" of Python, how would we approach the problem of set literals?

The syntax that comes to mind is something like this:

a = b|c

Where the vertical bar character means "forms a set with". Larger sets could be made using the same syntax:

a = b|c|c|d

You can also wrap parens around the set if you want:

a = (b|c)

Like tuples, a set with a single member still requires at least one delimiter:

a = (b|)

And for the empty set, we're back to phi again:

a = (|)

However, the parens aren't generally required - the rules are pretty much the same as for tuples and the comma operator. Thus, passing sets as arguments:

index = s.find_first_of( 'a'|'b'|'c'|'d' )
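
As a point of reference, something like this call can be approximated in today's Python. This is only a sketch: find_first_of is a hypothetical free function (strings have no such method), and the sample string is invented for the example.

```python
def find_first_of(s, alternatives):
    """Return the index of the first character of s found in alternatives, or -1."""
    for i, ch in enumerate(s):
        if ch in alternatives:
            return i
    return -1

# The set argument plays the role of 'a'|'b'|'c'|'d' in the speculative syntax.
index = find_first_of("xyzcab", set(['a', 'b', 'c', 'd']))
print(index)  # 3, the position of 'c'
```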

Of course, by doing this, we're re-assigning the meaning of the '|' operator from 'bitwise or' to 'set construction'. This only makes sense if you assume that either (a) set construction is more common than bitwise-or operations or (b) you provide some reasonable alternative way to express bitwise-or operations. Let's assume that we create some reasonable replacement and move on.
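
To make the conflict concrete, the "forms a set with" reading of '|' can be imitated today with operator overloading. This is a toy under stated assumptions, not a design: the class name S and its of() helper are made up for this sketch, and the first operand has to be wrapped by hand because plain values keep their usual '|' behaviour.

```python
class S(frozenset):
    """Hypothetical set-under-construction: S.of(b) | c | d adds one element at a time."""

    @classmethod
    def of(cls, item):
        # Start a one-element set from a single value.
        return cls([item])

    def __or__(self, item):
        # Treat the right operand as a single element, not as a set to union with.
        return S(frozenset(self).union([item]))

letters = S.of('a') | 'b' | 'c' | 'd'
print(sorted(letters))

# On plain integers, '|' is still bitwise or, which is exactly the clash described above.
print(1 | 2)
```

Without the wrapper, an expression such as 1 | 2 already has a meaning, so the speculative syntax really would need bitwise-or to move somewhere else.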

Another thing to note is that the set construction operator resembles in some ways the "alternative" operator of BNF notation. In the previous example, 'find_first_of' looks for the first of the given alternatives.

Since dictionaries are similar to sets, we can represent a dictionary as a set of keys and associated values. Dictionary literals already use the ':' operator to indicate a key - we can continue that with:

a = ('Monday':1 | 'Tuesday':2 | 'Wednesday':3)

Unlike the current language, however, you can omit the parens:

a = 'Monday':1 | 'Tuesday':2 | 'Wednesday':3

(This creates a syntax ambiguity with colon, but let's move on :)

One of the fun things about this line of speculation is watching how such a tiny change ripples outward, affecting the entire language definition. In this case, the change to set construction has much farther-reaching effects than what I have described here, assuming that you take each effect to its logical conclusion. I find it an enjoyable mental exercise :)

-- Talin


