Archive for September 2008

 
 

You’re Awesome! Have my Pocket Protector!

My Rockstar Gift


As a kid playing my Gibson SG in a rock band, I had dreams of being a rockstar. I used to practice for hours with a friend, Keith Howland. Keith was far better than me, his fingers moving across his Les Paul like liquid lightning, playing “Freebird” to the swaying masses. I’d strum along and do my best to not lose the beat. That was then. Today Keith is a rockstar, playing for Chicago. I’m a CTO/VPE for early stage companies, like Photobucket and Heavy.

Way too many years later, I had my first rockstar experience. It’s not what I had in mind. VC Mike from Polaris introduced me as a “rockstar CTO” at Boston’s Amazon Web Services meetup. That’s quite humbling, especially from Mike, whom I deeply respect. As the Rolling Stones say so poetically, you can’t always get what you want, but you get what you need.

I was scheduled to talk on migrating from managed hosting to Amazon’s new web services. A few minutes before I was called to the podium, I was sitting next to an engineer who specialized in SEO. We chatted for a bit, exchanging backgrounds, talking shop. He handed me his card and I stuffed it in my shirt pocket. He looked at me, puzzled.

“If you’re going to give a talk, you need to do it right. I’ve been doing some research on membranes to optimally protect the pens and cards you collect, without causing your Listerine strips to melt from body heat. I have some rather interesting results with different composite materials.”

What the… this guy reaches into his bag, shuffles around. His hand emerges, holding a flimsy piece of transparent plastic. “Ah, this is the one! Here, you need this.”

He hands me a pocket protector. I gave my talk. He came to see me afterwards. “Loved your talk, Scott! Please keep the membrane as my gift.” I was floored.

Note to VC Mike: Rockstars have screaming girls throw underwear on stage and paper airplanes with phone numbers written in lipstick. What do I get? A limp pocket protector. I can only hope Keith is faring better with his fans.

Amazon AWS vs. Rackspace and Akamai

[Ed: This post is a paraphrased transcription of my talks at the Amazon Web Services meetups in New York and Boston last week.]

Last summer I met David Carson, the colorful and creative cofounder of Heavy. We shared a burger at a nondescript hotel in Westchester. While I was munching on french fries, David began to describe his model for making money with video. Instead of selling ads directly in the video stream, his team was selling an enormous ad that surrounded the video. He called this a “video skin.”



 

I asked David how many of his videos used this skin.

“Oh, about 20,000.”

That was nothing. We’d upload that before breakfast at Photobucket. Yet I was curious. “How much money are you making with these skins?”

David paused, looked at me. “I’d say somewhere between $10 and $20 million this year.”

I nearly choked on my cheeseburger.

That’s incredible! Here was this guy making millions out of a fraction of the videos we were producing at Photobucket. If this was true, and indeed it was, he had a real shot at making big bucks. Polaris wanted to turn David’s ad model into a network, enabling other video producers to profit, helping advertisers reach tens of millions of viewers… not just the kids watching skateboard stunts and girls on Heavy. A few weeks later I signed on as CTO.

Within a month I saw some real challenges under the hood. Yes, they were indeed making tons of cash off these videos, but the revenue was very lumpy. David would make a killing during the holiday season, when retailers would reach out to customers seeking the latest gizmo for their kids. During the dark winter, he’d suck wind. Revenue would dry up.

This wouldn’t be so bad, except as CTO I was faced with six-figure monthly bills from Rackspace and Akamai. That was fine when revenue flowed. It was a disaster for the books when revenue dried up. Heavy’s newly hired CFO looked across his desk at me, frowning, saying I had to “do something” about these costs.

My team was excited about the new Amazon web services. Having worked with our own machines for years, I too was curious, but had no real experience with them. It seemed like the natural evolution of hosting, something we had dreamed about at IBM in the late ’90s.

The chart you see here tells the whole story. The Y axis shows the amount of bandwidth being generated. The X axis is time. The red color represents bandwidth from our Rackspace account, feeding external pipes. The yellow color represents bandwidth from our S3 buckets into the web.

The first thing we did was fix our process for building software. Heavy was replete with producers, the kinds of people who love to see an entire script, to read something end-to-end, to ensure that everything has been covered. Those complete, detailed, hundred-page “specs” work great for television and movies. They’re horrible for software.

We switched to an agile process, akin to what some new friends at Pivotal Labs are espousing. That allowed us to be more flexible with what we build, when, tuning and adjusting every week, committing to deliverables in 4-week sprints.

Next, we called our providers. We told Rackspace that we’d likely adjust our monthly commitments as we were “going into the cloud.” I think they were snickering at us in Texas. The cloud was a lot of hype when we started.

Then we got started. I didn’t have a lot of extra budget. In fact, my budget was cut, and I had to cut further. We had to squeeze what we were doing into the current spend. We looked at our architecture, pored over traffic logs, and realized that a lot of traffic was being served from our origin servers that should be served from the edge.

After a bit of wrangling and reswizzling our backend to be more REST-like, we chopped up our search results, browse results, channel results, and more into bite-sized pieces with finely tuned caching rules. We deployed this technology in late February, which resulted in the first drop (1) in cost.
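To make the idea of per-resource caching rules concrete, here is a minimal sketch that picks a Cache-Control header by longest matching path prefix. The path prefixes and lifetimes are hypothetical placeholders, not the rules we actually shipped.

```python
# Hypothetical path prefixes and lifetimes (in seconds); not Heavy's real rules.
CACHE_RULES = {
    "/search/": 60,
    "/browse/": 300,
    "/channel/": 900,
    "/video/": 86400,
}


def cache_control(path, rules=CACHE_RULES):
    """Return a Cache-Control header value for the longest matching prefix."""
    matches = [prefix for prefix in rules if path.startswith(prefix)]
    if not matches:
        # Unknown resources are revalidated with the origin on every request.
        return "no-cache"
    best = max(matches, key=len)
    return "public, max-age=%d" % rules[best]
```

The point of splitting results into small pieces is that each piece can carry its own lifetime, so slow-changing content stays at the edge while fast-changing content expires quickly.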


We used the savings from this architectural work to fund our Amazon experiments. Note how the chart starts to show a sliver of yellow at (2). That’s the early experiment with S3, where we were writing Bash scripts, PHP classes, and test clients in Flash and JavaScript to pull videos from S3. After a month or so of testing, we ran a pilot. Here we took our videos that were created in 2006 and served them directly from S3 instead of Rackspace.

The pilot worked. We built up confidence, held a meeting, and looked each other in the eye.

“Are we ready guys?”

“Sure, why not.”


We threw down the hammer (3). We fired up our cron scripts and started migrating our videos en masse to S3. The result was dramatic. Within days the bandwidth dropped by over 90% from Rackspace, where transfers were costing us $1.00 per Gigabyte, and storage was eating us alive at $8.00 per Gigabyte per month on a high-end SAN. Our new cost structure was $0.45 per Gig to store the videos, and a measly $0.17 to deliver a Gig into the Internet. The $0.45 was “high” as we kept three different copies of our videos in three availability zones.
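The rates above can be plugged into a quick back-of-the-envelope comparison. This is only a sketch: the storage and transfer volumes are made up for illustration, and it assumes the $0.45 storage rate is billed per month like the Rackspace rate.

```python
# Per-gigabyte rates quoted in the post (US dollars).
RACKSPACE = {"storage": 8.00, "transfer": 1.00}
S3 = {"storage": 0.45, "transfer": 0.17}  # storage rate already covers three copies


def monthly_cost(rates, stored_gb, transferred_gb):
    """Monthly bill for a given storage footprint and outbound transfer."""
    return rates["storage"] * stored_gb + rates["transfer"] * transferred_gb


# Hypothetical volumes, chosen only to make the arithmetic easy to follow.
stored_gb, transferred_gb = 1000, 10000
old_bill = monthly_cost(RACKSPACE, stored_gb, transferred_gb)
new_bill = monthly_cost(S3, stored_gb, transferred_gb)
savings = 1 - new_bill / old_bill
print("old $%.0f, new $%.0f, savings %.0f%%" % (old_bill, new_bill, savings * 100))
```

With these illustrative volumes the bill falls from $18,000 to $2,150, a saving of roughly 88%. The exact figure depends on the mix of storage and transfer.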

This is when I received my first unsolicited call from Rackspace. Their IPO was imminent. An executive called me and wanted to hear more about what we were doing. He said they’d be releasing cloud services soon. The account team later called me and offered to drop the storage price by more than 75%, down to $1.50 per Gig, “if I’d just keep what storage we had at Rackspace.”

Heavy’s CFO was delighted (well, as delighted as a bean counter can be, which is actually a half-smirk, almost a smile). Heavy’s CEO couldn’t believe the savings. We actually ripped 90% of our cost out of the monthly bill!

I’ll admit it. We were getting a little ahead of ourselves, feeling a tad overconfident. We decided our next target should be Akamai. Let’s attack that bill! Let’s knock a six-figure bill down to the low thousands!

We fired up more cron scripts and turned our “suck-o-meter” all the way up. The “suck-o-meter” was a device that moved a percentage of our traffic from high-end CDN delivery at Akamai to storage “that sucks” in the S3 cloud. We assumed the hardware was terrible, probably SATA drives in aging 1U PCs.
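A dial like this can be sketched in a few lines. This is not the code we ran: the hostnames are hypothetical, and it splits traffic per video by hashing the video ID, which is one simple way to keep each video pinned to a single source as the dial moves.

```python
import hashlib

# Hypothetical hostnames; the post does not give the real ones.
CDN_HOST = "cdn.example.com"
S3_HOST = "videos.s3.example.com"


def pick_host(video_id, s3_percent):
    """Send roughly s3_percent of videos to S3 and the rest to the CDN."""
    digest = hashlib.md5(video_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99 for each video
    return S3_HOST if bucket < s3_percent else CDN_HOST
```

Setting `s3_percent` to 0 sends everything to the CDN and 100 sends everything to S3. Because the bucket is derived from the ID, turning the dial up only ever moves videos toward S3, never back and forth.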

Initially this worked, which gave us the big spike (4). My Amex bill from Amazon was over $8,000 that month! Within days, however, we were getting calls from advertisers. Our head of product came to see me, as did our CEO. All were complaining about choppy videos.

Sure enough, we ran tests and discovered that S3 delivery was choppy. The chart here shows the bandwidth throughput of a connection to S3. Our videos were encoded at 500 kilobits per second. That’s about midway through the choppy part of the graph. We had to be in solid green for the videos to deliver smoothly. In fact, we saw that some S3 connections were slower than a 1990s dial-up modem at 56 kilobits per second! Other times we’d get Akamai quality at 1000 kilobits and more.
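The test boils down to comparing measured throughput against the encoding bitrate. Here is a minimal sketch of that check. The sample values are invented to span the range described above, and a real player would also want some headroom above the bitrate.

```python
def choppy_fraction(samples_kbps, bitrate_kbps=500):
    """Fraction of throughput samples too slow to sustain the video bitrate."""
    if not samples_kbps:
        raise ValueError("need at least one throughput sample")
    slow = sum(1 for kbps in samples_kbps if kbps < bitrate_kbps)
    return slow / len(samples_kbps)


# Made-up samples, in kilobits per second.
samples = [56, 300, 480, 520, 750, 1000]
print(choppy_fraction(samples))
```

For these six samples, three fall below 500 kilobits per second, so half the connections would stutter.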

We turned the suck-o-meter back to zero, leveling out our costs. We’d live to tackle CDN costs another day. That’s a future post. Amazon just released the CDN private beta, and we were invited to join!

Wicked-fast geographic targeting


I’ve long been an admirer of Maxmind, a company that provides a free database for mapping IP addresses to geographic locations. It’s more than sufficient for most applications, and I use it frequently in side projects as well as my day job at companies like Heavy and Photobucket.

Yesterday I paid $15 to buy a new version of their software that maps IP addresses to company names and known “proxies” used by warez providers. We need this to start tracking click fraud at work. What surprised me most was the release of an Apache 2.0 plugin using apxs, where the geographic lookups are now handled in C vs. a higher level scripting language. The performance improvement is dramatic. Further, it makes lookups dirt simple. The plugin stores the geographic IP, city, state, company and more in environment variables and internal Apache tables. It took me about 30 minutes to set everything up, including a compile and install on a CentOS box running on Amazon’s EC2.
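Once the plugin has populated the environment, any script behind Apache can read the lookup results without touching the database itself. The sketch below assumes the variables follow a GEOIP_* naming convention (GEOIP_COUNTRY_CODE, GEOIP_REGION, GEOIP_CITY). Treat those names as assumptions and confirm them against the module version you install.

```python
import os

# Assumed variable names, following a GEOIP_* convention; verify for your install.
GEO_VARS = {
    "country": "GEOIP_COUNTRY_CODE",
    "region": "GEOIP_REGION",
    "city": "GEOIP_CITY",
}


def geo_from_env(environ=None):
    """Collect whichever geographic fields the web server has set."""
    environ = os.environ if environ is None else environ
    return {field: environ[name] for field, name in GEO_VARS.items() if name in environ}
```

In a CGI script you would call `geo_from_env()` with no arguments. Passing a plain dictionary instead makes the function easy to test away from the server.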

The end result is a set of C libraries in /usr/local/lib and data files in /usr/local/share that can be easily integrated into your C application:

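  #include <stdio.h>
  #include <GeoIP.h>
  int main (int argc, char *argv[]) {
    /* Open the default country database. */
    GeoIP *gi = GeoIP_new(GEOIP_STANDARD);
    if (gi == NULL) return 1;
    /* Resolve the hostname and look up its two-letter country code. */
    const char *code = GeoIP_country_code_by_name(gi, "yahoo.com");
    printf("code %s\n", code ? code : "unknown");
    GeoIP_delete(gi);
    return 0;
  }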

Animated GIFs and video thumbnails from YouTube videos

A few weeks ago I was in a meeting, and my CEO kept saying “just need more cowbell.” People would laugh. What the hell? I had no idea. I missed the memo.

Googling around I found the original skit. It made me want to clip out the little phrase, and keep it handy for sending around the office. One thing led to another and I wrote a minisite while commuting, Tube Chopper. It takes snapshots, avatars, and MP3 clips from YouTube videos.

I whipped out my Mac and showed friends (in their 40’s) the original skit over wine and appetizers. Cheese almost came out of their noses.

Make your own YouTube clips. Any loss of productivity is totally intentional.

Perez: All ads, no content?


Perez Hilton. Content free edition.

I’ve admired Perez Hilton’s amazing rise from rags to riches. He entertains millions of people, makes them feel giddy or warm for a few seconds, providing enormous value (in aggregate) for people during tough times. He deserves every penny.

But now he’s jumped the shark. I went to his website today. It looks like he cut a sponsorship deal with a mecca of young readers, MTV. Yet the sponsorship obliterated all content.

Now THAT’s an idea. Turn a famous blog into a single ad. It’s like a thick Vogue magazine, with no articles whatsoever.

OK, maybe it’s a bug.

Pinch effect in ImageMagick


Rihanna with a headache

Here’s another ImageMagick hack inspired by PhotoBooth on my Mac. Create a file named “pinch.fx” with the following content, without line numbers:

  1. kk=w*0.5;
  2. ll=h*0.5;
  3. dx=(i-kk);
  4. dy=(j-ll);
  5. aa=atan2(dy,dx);
  6. rr=hypot(dy,dx);
  7. rs=sqrt(rr*200);
  8. px=kk+rs*cos(aa);
  9. py=ll+rs*sin(aa);
  10. p{px,py}

As with the fisheye effect, we first compute the polar coordinates of the current pixel in lines 1-6. Next, in line 7, I compute a scaled radius as the square root of the current radius times 200. This forms a cone. Now, I project the pixel at the scaled position (px,py) onto the current pixel (i,j) in lines 8-10.
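To see what the square root does, it helps to tabulate the scaled radius for a few distances from the center. This is a small Python sketch of line 7 only, not a replacement for the FX script.

```python
import math


def pinch_radius(rr):
    """Source radius sampled for a destination pixel at radius rr (line 7)."""
    return math.sqrt(rr * 200)


for rr in (0, 50, 200, 400):
    print(rr, round(pinch_radius(rr), 1))
```

Pixels closer than 200 to the center sample from farther out, pixels beyond 200 sample from closer in, and a radius of exactly 200 stays put. That is what squeezes the image toward the middle.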

Let’s have fun with Rihanna. First I grabbed a normal looking image:


Rihanna, as we know her

Now apply the effect:

  convert -fx @pinch.fx rihanna.jpg pinch.jpg

Fisheye effect in ImageMagick


A fisheye view of Obama

ImageMagick is my favorite tool for messing around with images. At Photobucket, we dedicated dozens of machines to “mogrify” uploaded images into thumbnails and appropriate sizes for user accounts. I use it on my pet projects like an image zoomer and a YouTube clipper.

Recently I discovered the “-fx” or special effects options of the ImageMagick toolset. This was written in 1996 as probably a neat hack. The code interprets a calculator-like FX language for each channel of every pixel. This is painfully slow and screams for optimization. Yet it’s so much fun. Take the following snippet of FX:

  1. kk=w*0.5;
  2. ll=h*0.5;
  3. dx=(i-kk);
  4. dy=(j-ll);
  5. aa=atan2(dy,dx);
  6. rr=hypot(dy,dx);
  7. rs=rr*rr/hypot(kk,ll);
  8. px=kk+rs*cos(aa);
  9. py=ll+rs*sin(aa);
  10. p{px,py}

Variable names are tricky. Several characters are reserved to represent pixels (p), various channels (r, g, b), locations (i,j), sizes (w,h) and more.

Lines 1 and 2 of the FX code calculate the center of an image in x,y coordinates as (kk,ll).

Lines 3 through 6 compute the polar coordinates of the current pixel i,j as a radius rr and angle aa. The internal function hypot() is a shortcut for hypotenuse of a triangle, which is the Euclidean distance between points (kk,ll) and the current pixel (i,j).

Now comes the silly part. I take the radius of the polar coordinates and square it, then divide by the maximum radius of the image, for a new radius RS for “radius scaled” in Line 7. From there, I calculate the cartesian coordinate that corresponds to a pixel at radius RS from the center, at the original angle AA, in lines 8 and 9. This creates a 2d bell-shaped surface that is flat at the edges and curves up toward infinity at the center. Finally, Line 10 stores the pixel located at (px,py) in the current pixel (i,j) using the “p” operator.
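The same mapping can be written as an ordinary function, which makes it easy to check where a given pixel samples from. This is a Python sketch of the coordinate math in lines 1-9 only. It does no image I/O, and the function name is mine, not part of ImageMagick.

```python
import math


def fisheye_source(i, j, w, h):
    """Return the source coordinate (px, py) sampled for destination pixel (i, j)."""
    kk, ll = w * 0.5, h * 0.5              # image center (lines 1-2)
    dx, dy = i - kk, j - ll                # offset from the center (lines 3-4)
    aa = math.atan2(dy, dx)                # angle (line 5)
    rr = math.hypot(dy, dx)                # radius (line 6)
    rs = rr * rr / math.hypot(kk, ll)      # scaled radius (line 7)
    return kk + rs * math.cos(aa), ll + rs * math.sin(aa)  # lines 8-9
```

For a 100x100 image, the center and the corners map to themselves, while a pixel 25 units right of center samples from only about 9 units out. Stretching that small central region over a larger area is what produces the bulge.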

I store this equation in a file, “effect.fx”. I downloaded a picture of Obama as obama.jpg:


Obama as we all see him

Next I applied convert using the fx operator:

  convert -fx @effect.fx obama.jpg fish.jpg

This takes a few minutes on my MacBook Pro. The result is a fisheye view of Obama! Fun hack, with apologies to Obama.

“It’s just coding”

This is a blog about building software. It’s what I do for a living, for a hobby. To me, great software is still an art form, finessed through solid engineering practices. Yet it seems to be lost in the corporate world of outsourcing. If I hear another executive say, “it’s just coding,” I’ll cringe.