npm 3’s Flatter Module Tree Makes Babel Much, Much Faster

Recently, I became frustrated with the slow speed of my JavaScript build. After running some experiments, I discovered that the bottleneck was transpiling ES2015 code to ES5 code via Babel.

To illustrate this, consider an example project consisting of:

  • node_modules/, containing babel-cli@6.2.0 and babel-preset-es2015@6.1.18
  • index.js, containing console.log('hello');

What happens if we transpile this 1-line script?

$ time babel --presets es2015 index.js 
'use strict';

console.log('hello');


real        0m3.425s
user        0m3.194s
sys         0m0.272s

Three and a half seconds is a long time to do nothing. In fact, in both this fake project and in my real projects, transpiling was taking well over an order of magnitude longer than bundling and minifying. Why is the transpiling step so slow?

Babel has a handy debug mode that prints out each parsing step and the associated time. Maybe Babel is spinning on some parsing step or something?

$ DEBUG=babel babel --presets es2015 index.js 
babel [BABEL] index.js: Parse start +0ms
babel [BABEL] index.js: Parse stop +7ms
babel [BABEL] index.js: Start set AST +2ms
babel program.body[0] ExpressionStatement: enter +3ms
babel program.body[0] ExpressionStatement: Recursing into... +0ms
babel program.body[0].expression CallExpression: enter +1ms
... (snip) ...
babel program.directives[1].directives[0] Directive: Recursing into... +0ms
babel program.directives[1].directives[0] Directive: exit +0ms
babel [BABEL] index.js: End transform traverse +0ms
babel [BABEL] index.js: Generation start +0ms
babel [BABEL] index.js: Generation end +4ms

Nope, Babel is Babelifying reasonably fast. However, I also noticed that the three-second delay was occurring between typing the command and seeing the first line of debug output appear. This is what we refer to as “being hit with the clue bat.”

So I turned to dtrace and started looking at file access activity, which was an eye-opening experience. Instead of going into the gory details, I’ll illustrate the problem more succinctly by counting the files under ‘node_modules’:

$ find node_modules/ -type f | wc
    42929   42929 6736163

If Node has to open some appreciable fraction of these 43K files at startup… well, I ain’t no fancy Full Stack Engineer or nothin’, but that seems like a lot of file I/O.

Now for me at least, Babel is an indispensable development tool. I would give up minification before Babel. I would give up source maps before Babel. I might even give up vim before Babel. So how to make Babel faster?

One way would be to open fewer files.

I had been using trusty old npm 2, but npm 3 has a rewritten dependency management system, which is designed to produce an “as flat as possible” dependency tree. Which potentially means less file duplication.
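To get a rough feel for how much duplication a nested tree contains, one option is to tally how many copies of each package are installed. The following is a minimal sketch, not anything npm or Babel ships: the function names countPackageCopies and duplicatedPackages are made up for illustration, and the sketch assumes a conventional node_modules layout on disk.

```javascript
const fs = require('fs');
const path = require('path');

// True only for paths that exist and are directories (broken symlinks count as missing).
const isDir = (p) => fs.existsSync(p) && fs.statSync(p).isDirectory();

// Hypothetical helper: walk every node_modules directory under `root`
// and count how many copies of each package name are installed.
function countPackageCopies(root, counts = new Map()) {
  const modulesDir = path.join(root, 'node_modules');
  if (!isDir(modulesDir)) return counts;

  for (const entry of fs.readdirSync(modulesDir)) {
    if (entry.startsWith('.')) continue; // skip .bin and other dot-directories
    const entryPath = path.join(modulesDir, entry);
    if (!isDir(entryPath)) continue;

    // Scoped packages live one level down, e.g. node_modules/@scope/name.
    const packages = entry.startsWith('@')
      ? fs.readdirSync(entryPath)
          .map((name) => [`${entry}/${name}`, path.join(entryPath, name)])
          .filter(([, dir]) => isDir(dir))
      : [[entry, entryPath]];

    for (const [name, dir] of packages) {
      counts.set(name, (counts.get(name) || 0) + 1);
      countPackageCopies(dir, counts); // recurse into nested node_modules
    }
  }
  return counts;
}

// Hypothetical helper: list [name, copies] pairs for packages installed
// more than once, most-duplicated first.
function duplicatedPackages(root) {
  return [...countPackageCopies(root)]
    .filter(([, copies]) => copies > 1)
    .sort((a, b) => b[1] - a[1]);
}

// Example usage, run from a project directory:
//   console.log(duplicatedPackages('.'));
```

In a deeply nested tree, the same small utility package can show up many times over. Each copy is a separate set of files on disk.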

So let’s throw away the npm 2 tree and install an npm 3 tree:

$ npm --version
2.14.3
$ sudo npm install -g npm 
Password:
/opt/local/bin/npm -> /opt/local/lib/node_modules/npm/bin/npm-cli.js
npm@3.5.0 /opt/local/lib/node_modules/npm
$ rm -rf node_modules/
$ npm install babel-cli babel-preset-es2015
... (snip) ...
$ find node_modules/ -type f | wc
    4495    4495  225968

And now for the moment of truth:

$ time babel --presets es2015 index.js 
'use strict';

console.log('hello');


real        0m0.683s
user        0m0.621s
sys         0m0.079s

Switching to npm 3 yields a roughly 5x speedup (3.4 seconds down to 0.7). In my real projects, the total speedup for the entire build (including bundling, minification, and source map generation) is more like 3x.
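For anyone who wants to check the arithmetic on the two runs above, here is a small sketch that turns the "real" values reported by time into seconds and divides them. The helper names parseTime and speedup are invented for this example. The inputs are the timings and file counts shown in the transcripts.

```javascript
// Convert a value from `time` output (e.g. "0m3.425s") into seconds.
function parseTime(text) {
  const match = /^(\d+)m([\d.]+)s$/.exec(text.trim());
  if (!match) throw new Error(`unrecognized time format: ${text}`);
  return Number(match[1]) * 60 + Number(match[2]);
}

// Ratio of the slower run to the faster run.
function speedup(before, after) {
  return parseTime(before) / parseTime(after);
}

// Wall-clock time, npm 2 tree vs. npm 3 tree.
console.log(speedup('0m3.425s', '0m0.683s').toFixed(1)); // prints "5.0"

// File count under node_modules/, npm 2 tree vs. npm 3 tree.
console.log((42929 / 4495).toFixed(1)); // prints "9.6"
```

The file count drops by nearly 10x while the wall-clock time drops by about 5x. That is consistent with file access being a large part of the startup cost, though not all of it.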

Lessons learned: Computers are pretty fast. Opening enormous numbers of files, not so much.

Songs for Git Commands

git push

“Salt-N-Pepa — Push It”

Okay, this one was kind of a gimme.

git commit

“Beyonce — Single Ladies (Put A Ring On It)”

If you wanted it, you shoulda put a hash on it!

git log

“Ren & Stimpy — Log”

It’s better than bad, it’s good.

git config

“Eminem — My Name Is”

Got other suggestions? Leave ’em in comments. (Extra points awarded for whatever mad genius comes up with something for git reflog.)

git reflog [new]

“Britney Spears — Baby One More Time”

Thank you, Allen!

git fetch [new]

“Queen — I Want It All”

Thank you, Josh Adell!

git rebase [new]

“Bad-CRC — All Your Base Are Belong To Us”

Thank you, Petey!

User Comments in Documentation: Stop It. Stop. It!

Today I took a look at WebPlatform.org, which (among other things) reminded me that man oh man, do I hate user comments in documentation.

If you want an example of how user comments degrade perfectly good documentation — well, there are many, but the most obvious one is PHP.net. PHP is a monster of a language, but thankfully PHP also has been blessed with a longtime dedicated documentation team that has built up a substantial and highly useful set of user guides and API reference documentation. The core documentation is solid, and the site’s use of redirects (typing php.net/functionname “just works”) is absolutely brilliant.

But if you scroll down to comments on any given page, you’ll see bad practices that should have died years ago, code that doesn’t do what the user thinks it does, people complaining about (imagined) bugs, and worse.

Wait, but people really like documentation comments.

It’s true that when you ask developers what they like about the PHP documentation, they often cite the comments. Engineers often decide that comments are a need-to-have feature for documentation sites. After all, it’s the community contributing to the documentation. Who doesn’t like community? Yay community!

Comments make people feel good. But comments are also just about the worst way possible to collect and incorporate feedback.

The key problem is that for documentation, comments are part of the product. It’s as if you built a web app by building the datastore, adding application logic, tuning the user interface… and then allowing any random user to modify the navigation bar any way they like for all other users.

Oh, I see. So you’re saying you hate the community and kittens?

No. What I am saying is that contributing feedback is a solved problem in software development. Make a goddamned pull request! Or send a patch. Or whatever you do normally. Treat your documentation source the same way you treat source code.

That’s setting the bar too high. We want feedback from a wide range of users.

Respectfully, if someone is writing technical documentation for software developers, and they haven’t grasped the basics of version control, they should not be allowed to directly contribute changes to your documentation. Period.

That said, feedback from less experienced users is still valuable. If someone notices something is wrong, or wants to add an example, or needs to point out a section that is confusing, there are plenty of ways to collect that feedback. These include:

  • Issue trackers
  • Email
  • Wikis
  • IRC
  • Social media
  • Meetups

Again, collecting feedback from users of all knowledge levels is a solved problem. Tools exist for this. Some of them are even pretty good!

In any case, please stop inventing new commenting systems and gluing them into your documentation sites. You are taking what should be a relatively straightforward generated/static site and turning it into a dynamic CRUD site for no good reason. A more colossal waste of engineering time I cannot imagine.

Well, okay, I can imagine quite a bit.

JSFABulous

After yesterday’s post, I heard from Jan Lehnardt. Jan, Tiffany Conroy, and Marijn Haverbeke are piloting a workshop named JSFAB (JS for Absolute Beginners) that, well, does what it says on the tin. The initial curriculum (source code) was written by Conroy and Haverbeke, the latter of whom, not coincidentally, wrote the highly recommended Eloquent JavaScript.

JSFAB has a clever take on this problem: they’re using an in-browser sandbox and editor, and within that environment, they’ve made available some high level drawing functions. This means that students just learn programming principles and JS syntax by jumping in and drawing shapes and graphs right away. This approach neatly sidesteps many of the problems I mentioned in the last post, and seems perfect for “absolute beginners” who don’t know HTML or CSS. Later on, of course, you can go back and teach them how to rip apart and remake web pages; the key thing is that they’re on their way.

On a personal note, I believe that courses like JSFAB have great potential. We are starting to live in a world that is soaking in ambient computing power. Knowing how to program (even a little) means eliminating drudge work. It means taking more control over your world. So with that in mind, I’m really pleased to hear that the JSFAB crew have already had great success with their first workshop in Europe, and are iterating further. If you have feedback for them, I know for a fact they’d be happy to hear from you!

Backscatter Spam Attack from orange.fr

I’m currently suffering a backscatter spam attack, 95% of which is coming from orange.fr’s misconfigured email servers. 500 emails and counting. I’ve certainly seen worse (much worse), but not anything on this scale for many years. It feels kind of… retro.

Presumably the reason backscatter spam has grown rarer is that most mail servers are now configured properly. And yet, it seems that at the end of 2011, there are still major service providers that employ systems engineers and product managers who do not understand the basic principles of mail server configuration.

So to help educate the fine people who work for orange.fr, here is an article that explains why you should never bounce spam and viruses. And here is the French translation of the same article.

You’re welcome, orange.fr systems engineers and product managers! I’m glad we were able to have this positive cultural exchange.

Well and Truly

  • I have just updated my phone to iOS 5 and said “yes” to iCloud.
  • My main home machine is still on Snow Leopard.
  • My wife’s home machine is also still on Snow Leopard, and can’t be upgraded until a critical piece of software she uses becomes Lion-compatible. Estimated time: six months.
  • My work machine is also still on Snow Leopard and won’t be upgraded until the IT department approves Lion. Estimated time: right around when 10.8 is out. Alternatively, I could follow the lead of many of my co-workers and just give up and buy a personal MacBook Air for all work computing needs. But that would require Brass Reproductive Organs, which clearly I lack.
  • Oh, my wife and I also use MobileMe for syncing data and sharing calendars.
  • Not to mention that my life completely revolves around OmniFocus, and I use MobileMe for syncing there as well.

When there is a big decision to be done, to be done, a policeman’s lot is not a happy one (happy one).

What Steve Yegge’s Platform Rant Tells Us

  1. Steve Yegge is a smart, interesting writer.

  2. The entire tech press ecosystem is utterly worthless.

Over the course of the day, the story has grown ever more encrusted with links. Each one just summarizes the original post (sometimes badly) without adding any useful analysis or commentary.

It’s baffling. I mean, if you look hard enough, you can find good crime reporting, good science reporting, and so on. So I genuinely do not understand why tech journalism rides the short bus. Particularly since there are people in my industry who are making decisions that involve large amounts of money and who would presumably benefit from having access to trenchant analysis. This seems like a market failure.

Perhaps all the good stuff is locked behind expensive, elite paywalls. Or perhaps the real action is all in the backchannel. My money’s on the latter. In any case, neither theory explains why the tech press is able to exist and perpetuate itself. Parasitism and AdSense can only take you so far.

Prescience

The scene: a demonstration in the mid-70s of an early multitasking OS at Xerox PARC:

To illustrate the flexibility of the system, the Xerox presenter clicked from a window in which he had been composing software code to another window that displayed a newly arrived e-mail message. He quickly read and replied to the message, then hopped back to the programming window and continued coding. Some in the audience applauded the new system. They saw that it would enable people to use their computers much more efficiently. Others recoiled from it. “Why in the world would you want to be interrupted — and distracted — by e-mail while programming?” one of the attending scientists angrily demanded.

— Excerpted from Nicholas Carr’s The Shallows. Highly recommended.