I think we are ready to run a Ludum Dare now! (Oops!)
As you can imagine, Phil and I had an INSANE couple of days pounding the website into submission. Here’s a re-telling of what happened.
In the days leading up to the event, we got the first notice from our $10/mo shared webhost that traffic was getting intense. No problem: at their request, we installed caching. Overall we were running slower, but we made sure to point everyone to IRC and Twitter to get the theme.
The caching and redirection of traffic helped us get through the theme announcement (503 people on IRC), but come Saturday, our shared host informed us we were using way too much CPU (25% of an 8-to-16 core server… oops!) and shut the /compo/ site down.
We scrambled, panicked, and got the site up and running on a $60/mo VPS by working through the night. That still wasn’t cutting it, but it was 5 or 6 AM, and we both needed a recharge. We left the keys to the car with Seth, and crashed.
Some 3 hours of sleep later, Phil got up and switched us over to a $200/mo VPS. I was still asleep during this time, missed the memo, and had my own little site panic moment. But all was well, we just needed some tweaking.
Yet performance was still degrading, and fast! The submission system had nearly 100 submissions in it, so the site was certainly working well enough for people to send stuff in. Slowly.
Then the mad rush of the final 5 hours came and went… What was going on!? It just seemed unreal: how the heck were we bringing a $200/mo server to its knees already?!
Hey! We’re on a (semi) dedicated server now, with 90% dynamic, generated content… AXE THE CACHE. Duh! Regenerating cached pages EVERY TIME someone posts is INTENSE! We disabled all caching, and it was bliss: classic Ludum Dare site performance, like days of yore, as we browsed. This fairy tale ended quickly though, and I had to track down the right settings to get browsers to correctly cache images locally (by default, this isn’t enabled). After much tweaking, and a couple of crashes, we got it sorted.
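For the curious: getting browsers to cache images locally comes down to sending far-future expiry headers. On a typical Apache + WordPress setup this is usually done with mod_expires; here is a hypothetical .htaccess fragment to illustrate (the lifetimes and types are guesses for illustration, not our actual config):

```apacheconf
# Hypothetical .htaccess fragment, assuming Apache with mod_expires enabled.
# Tells browsers to keep images locally instead of re-requesting them
# on every page view.
<IfModule mod_expires.c>
  ExpiresActive On
  ExpiresByType image/png  "access plus 1 month"
  ExpiresByType image/jpeg "access plus 1 month"
  ExpiresByType image/gif  "access plus 1 month"
</IfModule>
```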
How else can we lighten the load? The sidebar! Good idea! Kill that! How’s that? Not enough! Okay, non-essential plugins off too!
But it just wasn’t enough. GRR!!
It took installing an SQL query profiler on WordPress to finally discover the killer issue: the submission system. Remember when I said we had nearly 100 entries already? Well, that number kept going up as time passed. I was recording some submission metrics for my own amusement when I noticed it. Some 300-ish entries, and WHY THE HECK ARE THERE 760 SQL QUERIES BEING EXECUTED FOR THE VIEW ENTRIES PAGE!?
That there, friends, was our zero hour site killer.
I’d like to think there was some crazy light show going on at Phil’s place as he coded as fast as he could, on 3 hours of sleep, to fix the submission site code from executing O(n) queries down to just O(1).
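The bug was the classic “N+1 queries” pattern: one query to list the entries, then one more query per entry. A minimal sketch of the before and after, in Python with sqlite3 for illustration (the real site is WordPress/PHP, and the table and column names here are made up):

```python
import sqlite3

# Toy schema standing in for the submission system's tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO authors VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO entries VALUES (1, 'Game A', 1), (2, 'Game B', 2), (3, 'Game C', 1);
""")

def list_entries_slow(conn):
    # O(n) queries: one for the list, then one more per entry.
    # With ~380 entries and a couple of lookups each, this is how you
    # end up with hundreds of queries per page view.
    rows = conn.execute("SELECT title, author_id FROM entries ORDER BY id").fetchall()
    result = []
    for title, author_id in rows:
        name = conn.execute(
            "SELECT name FROM authors WHERE id = ?", (author_id,)
        ).fetchone()[0]
        result.append((title, name))
    return result

def list_entries_fast(conn):
    # O(1) queries: a single JOIN fetches everything at once.
    return conn.execute("""
        SELECT e.title, a.name
        FROM entries e JOIN authors a ON a.id = e.author_id
        ORDER BY e.id
    """).fetchall()

# Same result, one query instead of n+1.
print(list_entries_fast(conn))
```

Both functions return the same rows; the only difference is how many round trips to the database they make, which is exactly what the profiler's query count exposed.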
And there you have it. We still needed a new server, but performance got worse and worse as more and more games got finished. Really, we couldn’t have solved this without you guys braving the super-slow website to get your entries in. Sometimes it takes 760 queries in your face to see these sorts of things.
Trial by fire. We got burned, but we hope the rest of you turned out alright.
Wow. Maybe I should be entering the LD website in the compo? Oh wait, it was a team effort, so that goes in the Jam.
Thanks everyone for your patience!
PS: Yes, the server costs have increased 20-fold overnight. We’ll be looking into how to deal with that, once we have more rest behind us.