[Python-3000] Draft pre-PEP: function annotations

Talin talin at acm.org
Fri Aug 11 15:10:31 CEST 2006


Collin Winter wrote:

The idea is that each developer can pick the notation/semantics that's most natural to them. I'll go even further: say one library offers a semantics you find handy for task A, while another library's ideas about type annotations are best suited for task B. Without a single standard, you're free to mix and match these libraries to give you a combination that allows you to best express the ideas you're going for.

Let me tell you a story.

Once upon a time, there was a little standard called Midi (Musical Instrument Digital Interface). The Midi standard was small and lightweight, containing fewer than a dozen commands of 2-3 bytes each. However, its designers realized that they needed a way to allow hardware vendors to add their own custom message types, so they created a special message type called "System Exclusive Message" or SysEx for short. The idea is that you would send a manufacturer ID (1 byte, or 3 bytes in the extended form), and then any subsequent bytes would be considered to be in a vendor-specific format. The MMA (Midi Manufacturers Association) did not provide any guidelines or suggestions as to what the format of those bytes should be - it would be completely up to the vendors to decide what the format of their system exclusive message would be.

Since the Midi standard did not define a way to save and load the instrument's memory, vendors typically would use the SysEx message to allow a "bulk dump" of patch information - essentially it was a way to access the instrument's internal state of sounds, programs, sequences, and so on.

This would have worked fine, except for the fact that the vendors and the MMA were not the only stakeholders. Just about this time (mid-80s) there began to rise a new type of music company: companies like Mark of the Unicorn, Steinberg Audio and Blue Ribbon Soundworks that created professional music software for personal computers. Some companies made sequencer programs that would allow you to enter musical scores on the computer screen and play them back through your Midi instrument. Other companies worked on a different type of product - a "Universal Librarian", essentially a computer program which would store all of your patches and sound programs for all your different instruments.

In 1987 I created a program for the Amiga called Music-X, which was a combination of sequencer and Universal Librarian. In order to create the librarian module, I needed to get information about all of the various vendor-specific protocols

Interrupt - as I was typing this last sentence, I knocked over my
glass of ice water onto my Powerbook G4, completely toasting the
motherboard and damaging the display. 24 hours, and $2700 later, I
have completed my "forced upgrade" and can now continue this posting.
Lesson to be learned: Internet rants and prescription pain meds do
not mix! Be warned!

...which was not that difficult, since most of the vendors would include an appendix in the back of the user's manual (generally written in very bad English) describing the SysEx protocol for that device. I was also able to get my hands on "The Big Midi Book of SysEx protocols", which was essentially a photocopy of all of these various appendices, bound up in book form and sold commercially.

At the time there were approximately 150 registered vendor IDs, but my idea was that I wouldn't have to implement every protocol - I figured, since all I wanted to do was load and store the resulting information, I didn't really need to interpret the data, I just needed to store it. Of course, I would need to interpret any transport-layer instructions (commands, block headers, checksums and so on), since a lot of instruments sent their "data dumps" as multiple SysEx messages which would need to be stored together.

But I figured, since I was only supporting two vendor-specific commands for each vendor - bulk dump and bulk load - how different can they all be? Sure, there were likely to be individual variations on how things were done, but I could solve that by creating a per-instrument "personality file" - essentially a set of parameters which would tweak the behavior of my transport module. So for example, one parameter would indicate the type of checksum algorithm to be used, the second would indicate the number of checksum bytes, and so on.
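As a rough illustration of the "personality file" idea, here is a minimal Python sketch of such a parameter set. The field names, the block-size parameter, and the example instrument are all hypothetical; they are not taken from Music-X or from any vendor's documentation.

```python
from dataclasses import dataclass


# Hypothetical parameter set: every field name here is illustrative only.
@dataclass(frozen=True)
class Personality:
    name: str
    checksum_kind: str   # which checksum algorithm to use, e.g. "sum" or "xor"
    checksum_bytes: int  # how many checksum bytes follow each block
    block_size: int      # payload bytes carried by each SysEx message


def split_into_blocks(payload: bytes, p: Personality) -> list[bytes]:
    """Cut a bulk dump into the block size this instrument expects."""
    return [payload[i:i + p.block_size]
            for i in range(0, len(payload), p.block_size)]


# An invented instrument, used only to exercise the sketch.
synth_a = Personality("Synth A", checksum_kind="sum",
                      checksum_bytes=1, block_size=4)
blocks = split_into_blocks(bytes(range(10)), synth_a)
```

A model like this only works if every instrument differs in the value of a parameter and never in the shape of the protocol, which is exactly the assumption the rest of the story takes apart.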

For instruments that I couldn't borrow to test, I would rely on my users to fill in the holes (Ah, the heady optimism of the early days of the computer revolution!) and I would then add the user-contributed parameters to each update of the product.

I think by now you can start to see where this all goes wrong.

I started with a small set of 3 instruments, each from a different manufacturer. I analyzed their bulk data protocols, and came up with an abstract model that encompassed all of them as a superset. Then I added a 4th synth, only to discover that its bulk dump protocol was completely different from the previous three, and so my model had to be rebuilt from scratch. No problem, I thought, 3 is too small a sample size anyway. Then I added a 5th synth, and the same thing happened. And a 6th. And so on.

For example, every vendor I investigated used a completely different algorithm for computing checksums. Some used CRCs, some did simple addition, others used XOR - and some had odd ideas of which bytes should be checksummed. Some of the algorithms were really bad too.
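To give a feel for the variation, here is a small Python sketch of two of the simpler checksum styles, plus the "which bytes get checksummed" wrinkle. The function names and the `skip_header` parameter are assumptions made for illustration, and masking the result to 7 bits is also an assumption; no real vendor's algorithm is reproduced here.

```python
from functools import reduce


def sum_checksum(data: bytes) -> int:
    """Add every byte and keep the low 7 bits so the result fits a Midi data byte."""
    return sum(data) & 0x7F


def xor_checksum(data: bytes) -> int:
    """XOR every byte together, again masked to 7 bits."""
    return reduce(lambda acc, b: acc ^ b, data, 0) & 0x7F


# Hypothetical registry keyed by the name a personality file might use.
CHECKSUMS = {"sum": sum_checksum, "xor": xor_checksum}


def checksum_for(kind: str, message: bytes, skip_header: int = 0) -> int:
    """Checksum a message, optionally ignoring some leading header bytes."""
    return CHECKSUMS[kind](message[skip_header:])
```

Even in this toy form, two parameters (the algorithm and the span of bytes covered) are already needed, and a CRC-based vendor would add polynomial and initial-value parameters on top.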

Different vendors also used different byte encodings. Because Midi is designed to work in an environment where cables can be unplugged at any moment, and because all other Midi messages (other than SysEx) were at most 3 bytes long, the Midi standard required that only 7 bits of each byte could be used to carry data; the 8th bit was reserved for a "start of new message" flag.

Different vendors adapted to this challenge with surprising creativity. Some would simply slice the whole dump into units of 7 bits each, crossing the normal byte boundaries. Some would only send 4 bits per Midi Byte. Some did things like: For each 7 bytes of input data, send the bottom 7 bits of each input byte as the first 7 bytes, and then send an 8th byte containing the missing top-bits from the first seven. And then there were those clever manufacturers who simply decided to design their instruments so that no control parameter could have a magnitude greater than 127.
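Two of these schemes can be sketched in Python as follows. The bit order inside the extra "top bits" byte and the nibble order (high nibble first) are assumptions; the description above does not pin them down, and real instruments differed on exactly such details.

```python
def encode_7in8(data: bytes) -> bytes:
    """For each group of up to 7 bytes, send the low 7 bits of each byte,
    then one extra byte holding the stripped top bits (bit j = byte j)."""
    out = bytearray()
    for i in range(0, len(data), 7):
        group = data[i:i + 7]
        out.extend(b & 0x7F for b in group)
        top = 0
        for j, b in enumerate(group):
            top |= (b >> 7) << j
        out.append(top)
    return bytes(out)


def decode_7in8(encoded: bytes) -> bytes:
    """Inverse of encode_7in8: reattach each top bit to its byte."""
    out = bytearray()
    for i in range(0, len(encoded), 8):
        *low, top = encoded[i:i + 8]
        for j, b in enumerate(low):
            out.append(b | (((top >> j) & 1) << 7))
    return bytes(out)


def encode_nibbles(data: bytes) -> bytes:
    """Send only 4 bits per Midi byte, high nibble first (assumed order)."""
    out = bytearray()
    for b in data:
        out.append(b >> 4)
        out.append(b & 0x0F)
    return bytes(out)


def decode_nibbles(encoded: bytes) -> bytes:
    """Inverse of encode_nibbles."""
    return bytes((encoded[i] << 4) | encoded[i + 1]
                 for i in range(0, len(encoded), 2))
```

Both encoders produce only bytes below 0x80, as the standard requires, but they disagree on everything else: the first costs 8 bytes on the wire per 7 bytes of data, the second costs 2 per 1.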

Another example of variation was in timing. Roland machines (of certain models) were notorious for rejecting messages if they were sent too fast - you had to wait at least 20 ms from the time you received a message to the time you sent the response. Others would "time out" if you waited too long.

There were half-duplex and full-duplex, stateless and stateful protocols, and I could go on. The point is that there was no way for me to come up with some sort of algorithmic way to describe all of these protocols - the only way to do it was in code, with a separate implementation for each and every protocol. Nowadays, I'd simply embed Python into the program and make each personality file a Python script, but I didn't have that option back then. I toyed around with the idea of inventing a custom scripting language specifically for representing dump protocols, but the idea was infeasible at the time.
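A minimal sketch of what "each personality file is a Python script" could look like is shown below. Everything in it is hypothetical: the script contents, the names `frame`, `checksum` and `INTER_MESSAGE_DELAY_MS`, and the loader are invented for illustration, and the script is held in a string only to keep the example self-contained.

```python
# Contents of a hypothetical per-instrument personality script.
PERSONALITY_SOURCE = '''
INTER_MESSAGE_DELAY_MS = 20  # pacing the host should respect for this device

def checksum(block):
    return sum(block) & 0x7F

def frame(block):
    """Append this instrument's checksum to a block of 7-bit data."""
    return bytes(block) + bytes([checksum(block)])
'''


def load_personality(source: str) -> dict:
    """Run a personality script and return the names it defines."""
    namespace: dict = {}
    exec(source, namespace)
    return namespace


personality = load_personality(PERSONALITY_SOURCE)
framed = personality["frame"](b"\x01\x02\x03")
delay_ms = personality["INTER_MESSAGE_DELAY_MS"]
```

The host only needs to agree with each script on a handful of entry points, and the script is free to implement its protocol however it must. Note that `exec` runs arbitrary code, so a host accepting user-contributed scripts would have to decide how far to trust them.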

So, if you have had the patience to read through this long-winded anecdote and are wondering how in the hell this relates to Collin's question, I can sum it up in a very short motto (and potential QOTW):

"Never question the creative power of an infinite number of monkeys."

Or to put it another way: If you create a tool, and you assume that tool will only be used in certain specific ways, but you fail to enforce that limitation, then your assumption will be dead wrong. The idea that there will only be a few type annotation providers who will all nicely cooperate with one another is just as naive as I was in the SysEx debacle.

I'll have more focused things to say about this later, but I need to rest. (Had to get that out before all the rant energy dissipated.)

-- Talin
