Spikes, Logs and the POWER CLAW

Maciej Ceglowski, Pinboard, 2011-10

Source: http://blog.pinboard.in/2011/10/spikes_logs_and_the_power_claw/

Before launching Pinboard I created the usual ritual series of spreadsheets to try to anticipate traffic, data storage requirements, and revenue given a variety of scenarios. And like everybody else I have found these spreadsheets to bear little relationship to reality.

The problem with trying to model this stuff is that we find ourselves in a domain where a small number of rare events can completely dominate the data. Here's an all-time graph of new Pinboard users per day, for example:

![img1]

This chart shows user signups over time, but you'd see the same graph for every metric of interest - traffic, Twitter mentions, cups of coffee consumed by the developer. Just seven days account for half of all Pinboard revenue.

The mechanics of this are familiar. Someone writes an article about you, or you're featured on a Top 10 list, or a meteor hits your competitor, and very briefly the full attention of bored people on the Internet is yours.

Somewhat counterintuitively, while the timing and size of the events are unpredictable, the overall pattern is regular. Here's what the same graph looks like if you rank all the days by number of new users and plot them on a log-log scale:

![img2]

This kind of plot is the hallmark of the POWER CLAW. If your stats look like this, you can know with some confidence that your day-to-day experience will not prepare you for the few extraordinary days that will matter most for your project. You can also expect to spend most of your time grinding away, waiting for those extraordinary days to arrive. Rare events are rare - it says so right there in the name!

Of course, many other people have noticed this phenomenon, and there's a terrific book devoted entirely to it. But there's something about human psychology that makes it very hard to internalize the idea. I'm still making the damned spreadsheets.
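
For anyone who wants to try the ranking step on their own numbers, here is a minimal sketch of one way to do it. It is not the code behind the Pinboard graphs. The function names (`rank_size`, `days_for_share`, `loglog_slope`) and the sample counts are made up for illustration, and the straight-line fit is just a rough check on how steep the rank plot is, not a rigorous test for a power law.

```python
import math


def rank_size(daily_counts):
    """Return (rank, count) pairs, biggest day first, skipping zero days
    (zero cannot be drawn on a log axis)."""
    ranked = sorted((c for c in daily_counts if c > 0), reverse=True)
    return list(enumerate(ranked, start=1))


def days_for_share(daily_counts, share=0.5):
    """Smallest number of top days whose sum reaches `share` of the total."""
    ranked = sorted(daily_counts, reverse=True)
    target = share * sum(ranked)
    running = 0
    for n, count in enumerate(ranked, start=1):
        running += count
        if running >= target:
            return n
    return len(ranked)


def loglog_slope(pairs):
    """Least-squares slope of log(count) against log(rank)."""
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(count) for _, count in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical daily signup counts, invented for this example (NOT Pinboard data).
signups = [3, 5, 2, 400, 4, 6, 3, 90, 5, 2, 4, 40, 3, 7, 20, 4, 3, 12, 5, 2]

pairs = rank_size(signups)
print(days_for_share(signups))  # 1: the single 400-signup day is over half of 620
print(loglog_slope(pairs))      # negative: counts fall off as rank increases
```

The `pairs` list is what you would hand to a plotting tool with both axes set to logarithmic, for example matplotlib's `plt.loglog`. A roughly straight line sloping down to the right is the pattern described above.
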
If you run a web thing, please consider sharing your own log-log plot with the world, obfuscating whatever you need to in order to feel comfortable. I'm very curious about how many young sites see a similar traffic pattern, and whether it diminishes as you grow.

---

* Back around 2003 the blog world became fascinated with power laws and exponential distributions due to a Clay Shirky essay. POWER CLAW has been my mental shorthand for this kind of beard-stroking ever since.