Dropbox dives into CoffeeScript

During July's Hackweek, the three of us rewrote Dropbox's full browser-side codebase to use CoffeeScript instead of JavaScript, and we've been really happy with how it's been going so far. This is a controversial subject, so we thought we'd start by explaining why.

CoffeeScript:

By the way, the JavaScript has a scoping bug, did you catch it??

We've heard many arguments against CoffeeScript. Before diving in, we were most concerned about these two:

That it adds extra bloat to iterative development, because each tweak requires recompilation. In our case, we avoided this problem entirely by instrumenting our server code: whenever someone reloads a Dropbox page running on their development server, it compares mtimes between .coffee files and their compiled .js equivalents. Anything needing an update gets compiled. Compilation is imperceptibly fast thanks to jashkenas and team. This means we didn't need to change our workflow whatsoever, learn a new tool, or run any new background process (no coffee --watch). We just write CoffeeScript, reload the page, loop.
That debugging compiled JavaScript is annoying. It's not, and the main reason is that CoffeeScript is just JavaScript: it's designed to be easy to debug, in part by leaving JavaScript semantics alone. We've heard many arguments for and against debuggability, and in the end, we convinced ourselves that it's easy only after jumping in and trying it. We converted and debugged about 23,000 lines of JavaScript into CoffeeScript in one week without many issues. We took time to test the change carefully, then slowly rolled it out to users. One week after Hackweek had ended, it was fully launched.
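The reload-time check described in the first point can be sketched roughly as follows. This is a minimal illustration in Node.js-style JavaScript, not Dropbox's server code (which the post doesn't show); `needsRecompile`, `recompileStale`, and the `compile` callback are hypothetical names, and the sketch assumes each `.coffee` file compiles to a `.js` file of the same name in the same directory.

```javascript
const fs = require("fs");
const path = require("path");

// True when the compiled .js file is missing or older than its .coffee source.
function needsRecompile(coffeePath, jsPath) {
  if (!fs.existsSync(jsPath)) return true;
  return fs.statSync(coffeePath).mtimeMs > fs.statSync(jsPath).mtimeMs;
}

// Scans one directory (non-recursively) and calls `compile` for each stale file.
// `compile(coffeePath, jsPath)` is a hypothetical callback: plug in whatever
// invokes the CoffeeScript compiler in your setup.
function recompileStale(dir, compile) {
  const recompiled = [];
  for (const name of fs.readdirSync(dir)) {
    if (path.extname(name) !== ".coffee") continue;
    const coffeePath = path.join(dir, name);
    const jsPath = path.join(dir, path.basename(name, ".coffee") + ".js");
    if (needsRecompile(coffeePath, jsPath)) {
      compile(coffeePath, jsPath);
      recompiled.push(name);
    }
  }
  return recompiled;
}
```

Calling something like `recompileStale` at the start of each page request keeps the write, reload, loop workflow intact: files whose compiled output is up to date are skipped, so the cost is a handful of `stat` calls.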

Probably the most misleading argument we hear against CoffeeScript goes something like this: If you like Python or Ruby, go for CoffeeScript — it's really just a matter of syntactic preference. This argument frustrates us, because it doesn't consider history. Stick with us for a minute:

April 1995: Brendan Eich, a SICP enthusiast, joins Netscape with the promise of bringing Scheme to the browser.
He's assigned to other projects in the first few months that he joins. Java launches in the meantime and explodes in popularity.
Later in '95: Scheme is off the table. Upper management tasks Eich with creating a language that is to Java as VBScript is to C++, meant for amateurs doing simple tasks, the idea being that self-respecting pros would be busy cranking out Java applets. In Eich's words: _JS had to "look like Java" only less so, be Java's dumb kid brother or boy-hostage sidekick. Plus, I had to be done in ten days or something worse than JS would have happened._ Imagine ~~Bruce Campbell~~ Brendan Eich as he battled sleep deprivation to get a prototype out in 10 days, all the while baking his favorite concepts from Scheme and Self into a language that, on the surface, looked completely unrelated. LiveScript is born. It launches with Netscape Navigator 2.0 in September '95.
December '95: For reasons that are probably marketing-related and definitely ill-conceived, Netscape changes the name from LiveScript to JavaScript in version 2.0B3.
August '96: Microsoft launches IE 3.0, the first version to include JavaScript support. Microsoft calls their version "JScript" (presumably for legal reasons).
November '96: ECMA (Now Ecma) begins standardization. Netscape and Microsoft argue over the name. The result is an even worse name. Quoting Eich, ECMAScript "was always an unwanted trade name that sounds like a skin disease."

Especially considering the strange, difficult and rushed circumstances of its origin, JavaScript did many things well: first-class functions and objects, prototypes, dynamic typing, object literal syntax, closures, and more. But is it any surprise that it got a bunch of things wrong too? Just considering syntax, things like: obscuring prototypal OOP through confusingly classical syntax, the var keyword (forgot var? congrats, you've got a global!), automatic type coercion and == vs ===, automatic semicolon insertion woes, the arguments object (which acts like an array except when it doesn't), and so on. Before any of these problems could be fixed, JavaScript was already built into competing browsers and solidified by an international standards committee. The really bad news is, because browsers evolve slowly, browser-interpreted languages evolve slowly. Introducing new iteration constructs, adding default arguments, slices, splats, multiline strings, and so on is really difficult. Such efforts take years, and require cooperation among large corporations and standards bodies.
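A few of the pitfalls listed above are easy to reproduce. The snippet below is our own illustration, not code from the Dropbox codebase; the names `inspectArguments` and `leakedCounter` are made up for the example.

```javascript
// Loose equality coerces types, and the results are not even transitive.
const looseResults = [0 == "", 0 == "0", "" == "0"];     // [true, true, false]
const strictResults = [0 === "", 0 === "0", "" === "0"]; // [false, false, false]

// `arguments` has indices and a length, but it is not a real Array,
// so array methods such as map are missing until you convert it.
function inspectArguments() {
  return {
    length: arguments.length,
    isArray: Array.isArray(arguments),
    hasMap: typeof arguments.map === "function",
    asArray: Array.prototype.slice.call(arguments),
  };
}
const info = inspectArguments("a", "b");

// Forgetting `var` in non-strict code silently creates a global.
// Bodies passed to `new Function` are non-strict unless they opt in,
// so the leak shows up here however the surrounding file is loaded.
const forgetVar = new Function("leakedCounter = 1;");
forgetVar();
const leakedType = typeof globalThis.leakedCounter; // "number"
```

CoffeeScript sidesteps these particular cases by compiling `==` to `===`, declaring variables for you, and offering splats in place of `arguments`.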

Our point is to forget CoffeeScript's influences for a minute, because it fixes so many of these syntactic problems and at least partially breaks free of JavaScript's slow evolution; even if you don't care for significant whitespace, we recommend CoffeeScript for so many other reasons. Disclaimer: we love Python, and it's Dropbox's primary language, so we're probably biased.

An interesting argument against CoffeeScript from Ryan Florence, that seemed plausible to us on first impression but didn't hold up after we thought more about it, is the idea that (a) human beings process images and symbols faster than words, so (b) verbally readable code isn't necessarily quicker to comprehend. Florence uses this to argue that (c) while CoffeeScript may be faster to read, JavaScript is probably faster to comprehend. We'd expect cognitive science provides plenty of evidence in support of (a), including the excellent circle example cited by Florence. (b) is easily proven by counterexample. Making the leap to (c) is where we ended up disagreeing:

For the most part CoffeeScript isn’t trading symbols for words — it’s dropping symbols. Highly repetitive symbols like , ; {} (). We believe such symbols mostly add syntactic noise that makes code harder to read.
CoffeeScript introduces new symbols! For example, (a,b,c) -> ... instead of function (a,b,c) {...}. Along with being shorter to type, we think this extra notation makes code easier to comprehend, similar to how math is often better explained through notation instead of words.
Consider one example where CoffeeScript does in fact swap a symbol for a word: || vs or. Is || really analogous to the circle in Florence’s example, with or being the verbal description of that circle? This needs the attention of a cognitive scientist, but our hunch is || functions more linguistically than it does symbolically to most readers, acting as a stand-in for the word or. So in this case we expect something more like the reverse of the circle example: we think || and or are about equally readable, but would give slight benefit to CoffeeScript’s or, as it replaces a stand-in for or with or itself. Humans are good at mapping meanings to symbols, but there’s nothing particularly _or_-esque about ||, so we suspect it adds a small amount of extra work to comprehend.

On to some code samples.

We'll let this comparison speak for itself. We consider it our strongest argument in favor of CoffeeScript.

| | JavaScript | CoffeeScript |
| --- | --- | --- |
| Lines of code | 23437 | 18417 |
| Tokens | 75334 | 66058 |
| Characters | 865613 | 65993 |

In the process of converting, we shaved off more than 5000 lines of code, a 21% reduction. Granted, many of those lines looked like this:

Regardless, fewer lines is beneficial for simple reasons — being able to fit more code into a single editor screen, for example.

Measuring reduction in code complexity is of course much harder, but we think the stats above, especially token count, are a good first-order approximation. Much more to say on that subject.
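For reference, the relative reductions in line and token counts can be worked out from the figures above with a small helper. `percentReduction` is a name we made up for this sketch.

```javascript
// Percentage by which `after` is smaller than `before`.
function percentReduction(before, after) {
  return ((before - after) / before) * 100;
}

const lineReduction = percentReduction(23437, 18417);  // about 21.4
const tokenReduction = percentReduction(75334, 66058); // about 12.3
```

The token count shrinks by a smaller fraction than the line count, which fits the observation that many of the removed lines held little more than closing punctuation.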

In production, we compile and concatenate all of our CoffeeScript source into a single JavaScript file, minify it, and serve it to browsers with gzip compression. The size of the compressed bundle didn’t change significantly pre- and post-coffee transformation, so our users shouldn’t notice anything different. The site performs and behaves as before.

Rewriting over 23,000 lines of code in one (hack)week was a big undertaking. To significantly hasten the process and avoid bugs, we used js2coffee, a JavaScript to CoffeeScript compiler, to do all of the repetitive conversion tasks for us (things like converting JS blocks to CS blocks, or JS functions to CS functions). We'd start converting a new JS file by first compiling it individually to CS, then manually editing each line as we saw fit, improving style along the way, and making it more idiomatic. One example: the compiler isn't smart enough to convert a JS three-clause for into a CS for/in. Instead it outputs a CS while with i++ at the end. We switched each of those to simpler loops. Another example: using string interpolation instead of concatenation in places where it made sense.

To make sure we didn't break the site, we used a few different approaches to test:

Jasmine for unit testing.
We built a fuzz tester with Selenium. It takes a random walk across the website looking for exceptions. Give it enough time, and it theoretically should catch 'em all 😉
Tons of manual testing.
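The random-walk idea behind the fuzz tester can be sketched without Selenium. The code below is an assumption-laden toy, not our actual tester: the site is modeled as a plain object of hypothetical pages, each with a list of `links` and a `visit` function that may throw, standing in for a real browser loading a page and reporting JavaScript exceptions.

```javascript
// Walks `steps` pages starting from `start`, following a random link each time,
// and collects every exception thrown while visiting a page.
// `random` defaults to Math.random; pass a deterministic function to replay a walk.
function randomWalk(pages, start, steps, random = Math.random) {
  const errors = [];
  let current = start;
  for (let i = 0; i < steps; i++) {
    const page = pages[current];
    try {
      page.visit();
    } catch (err) {
      errors.push({ page: current, message: err.message });
    }
    if (page.links.length === 0) {
      current = start; // dead end: restart the walk from the beginning
    } else {
      current = page.links[Math.floor(random() * page.links.length)];
    }
  }
  return errors;
}

// Hypothetical three-page site where one page has a bug.
const demoSite = {
  home: { links: ["files", "help"], visit() {} },
  files: { links: ["home"], visit() { throw new Error("undefined is not a function"); } },
  help: { links: [], visit() {} },
};
const found = randomWalk(demoSite, "home", 50);
```

A real version would replace `visit` with a browser-driver call and read links from the live page, but the loop is the same: pick a link at random, load it, record any exception, repeat.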

Dropbox now writes all new browser-side code in CoffeeScript, and we've been loving it. We've already written several thousand new lines of coffee since launching in July. Some of the things we're looking to improve in the future:

Browser support for CoffeeScript source maps, so we can link JavaScript exceptions directly to the source code, and debug CoffeeScript live.
Native CoffeeScript support in browsers, so that during development, we can avoid the compilation to JavaScript altogether.

Thanks to Brendan Eich and Jeremy Ashkenas for creating two fantastic languages.