Bad application variables and the site redesign launch
I spent most of Monday ripping my hair out trying to work out why the redesign of the Chicago Park District website was systematically crashing each server. The application is over 4 years old, and the latest round of changes was to reskin the pages and add new content to the home page. After 4 hours of troubleshootizing :-) and load testing the application in a pre-production environment, I found the issue.
Somehow, the application variables were being recreated with each request. In this case that included the rssHandler CFC responsible for pulling in the RSS feeds that drive parts of the page. What I spotted was that the cfapplication tag, typically set in application.cfm, was not set there but later, in the fbx_settings.cfm file, after some environment-specific code had been processed (am I in dev, stage, preprod or live?). As soon as I changed the code around, my load testing results pivoted sharply, from 20 concurrent users killing the server to maxing out our load testing setup.
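To make the failure mode concrete, here is a minimal, language-neutral sketch in Python (not CFML, and not the site's actual code). The `RssHandler` class and the `application_scope` dictionary are hypothetical stand-ins for the rssHandler CFC and the ColdFusion application scope. It only illustrates the difference between rebuilding a shared object on every request and building it once.

```python
class RssHandler:
    """Hypothetical stand-in for an expensive shared object such as the rssHandler CFC."""
    instances_created = 0

    def __init__(self):
        # In the real application this is where the costly setup would happen.
        RssHandler.instances_created += 1


# Hypothetical stand-in for the application scope shared by all requests.
application_scope = {}


def handle_request_rebuilding():
    """Broken behaviour: the shared handler is recreated on every request."""
    application_scope["rssHandler"] = RssHandler()
    return application_scope["rssHandler"]


def handle_request_init_once():
    """Intended behaviour: create the handler only if it does not exist yet."""
    if "rssHandler" not in application_scope:
        application_scope["rssHandler"] = RssHandler()
    return application_scope["rssHandler"]


if __name__ == "__main__":
    for _ in range(20):
        handle_request_rebuilding()
    print("rebuilding:", RssHandler.instances_created, "handlers created")

    RssHandler.instances_created = 0
    application_scope.clear()
    for _ in range(20):
        handle_request_init_once()
    print("init once:", RssHandler.instances_created, "handlers created")
```

The sketch ignores locking, which a real multi-threaded server would need around the first initialization.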
Some of the limitations you encounter when attempting your first load test:
- Client machine pipe size. Regardless of your load testing tool, you are limited by the bandwidth connecting your client to the internet, so you may struggle to simulate over 180 users remotely.
- Client IP stack and socket limitations. A standard Windows XP install typically only supports about 200 users before you see socket errors.
- Client software. Trial licenses of load testing software can limit the number of users you can simulate, so make sure you understand the limitations of the tool.
- Web server pipe size. Even if your client has a fat pipe to the internet, a web server on a 10 Mbps switch or a T1 will be restricted in the number of users it can support. A 10 Mbps link has to carry inbound traffic and outbound traffic, and if you have a database server on a separate machine (which you should), the requests to and from that server as well. My suggestion would be to use a good SNMP graphing tool like Cacti to monitor each port on the web server and your infrastructure so you can easily spot the point where the web server has maxed out the pipe.
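The pipe size limits above lend themselves to a back-of-envelope calculation. The sketch below estimates how many simulated users a link can carry before it saturates. The page weight, think time and usable fraction are assumptions picked for illustration, not measurements from the Park District site, so plug in your own numbers.

```python
def max_users_for_link(link_mbps, page_kb, think_time_s, usable_fraction=0.7):
    """Rough ceiling on concurrent users for a link of the given size.

    link_mbps        -- nominal link speed in megabits per second
    page_kb          -- assumed average page weight in kilobytes
    think_time_s     -- assumed seconds between page requests per user
    usable_fraction  -- assumed share of the link left for page traffic
                        (the rest goes to overhead, database traffic, etc.)
    """
    link_bytes_per_s = link_mbps * 1_000_000 / 8 * usable_fraction
    per_user_bytes_per_s = page_kb * 1024 / think_time_s
    return int(link_bytes_per_s // per_user_bytes_per_s)


if __name__ == "__main__":
    # Assumed workload: 100 KB pages, one request every 10 seconds per user.
    for name, mbps in [("10 Mbps switch", 10), ("T1 (1.544 Mbps)", 1.544)]:
        print(name, "->", max_users_for_link(mbps, page_kb=100, think_time_s=10), "users")
```

This is only a ceiling from bandwidth alone. The graphs from your monitoring tool will tell you where the link actually flattens out.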
The load testing tools need not cost a small fortune, and once you have a set of load scripts for a client, they help with regression testing to make sure your funky new piece of code on the home page hasn't degraded the performance of your application to a crawl.
Microsoft has a free tool called Web Application Stress Tool (WAST), which we have used on occasion to generate large loads. High-end load testing tools typically have a master machine and then clients installed on multiple machines to distribute the load over several IP addresses. For WAST, simply have some of your team members work at home or log on one evening for an hour, co-ordinate the tests over IM, and all push the big red button when the test leader says go. This means 8 people can generate a load of 200 x 8 = 1,600 concurrent users.
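The same arithmetic works in reverse when you need to know how many volunteers to round up. A minimal sketch, assuming every machine tops out at the same per-client limit (the 200-user figure is the one quoted above; the 3,000-user target is just an example input):

```python
import math


def total_simulated_users(per_client_caps):
    """Total concurrent users when every client runs at its own cap."""
    return sum(per_client_caps)


def clients_needed(target_users, per_client_cap):
    """Smallest number of client machines that reaches the target load."""
    return math.ceil(target_users / per_client_cap)


if __name__ == "__main__":
    print(total_simulated_users([200] * 8))   # 8 people at 200 users each
    print(clients_needed(3000, 200))          # machines needed for 3,000 users
```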
WAST doesn't do very well at ramping up the number of users so we have licenses for Paessler's Load Test tool at $249 per seat which allow us to simulate slightly more users and create a better ramp up profile to warm up the servers before nailing them. It also has a better interface and, if your infrastructure supports it, the potential to generate up to 10,000 concurrent users per machine.
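For anyone unsure what a ramp-up profile looks like, here is a small sketch of a linear ramp. It is not how Paessler's tool or WAST works internally. The function name and the step sizes are made up for illustration, and it simply prints the user count you would want active at each point in the warm-up.

```python
def ramp_schedule(target_users, ramp_seconds, step_seconds):
    """Return (elapsed_seconds, active_users) pairs for a linear ramp-up.

    The user count rises in equal steps and reaches target_users on the
    last step, so the servers warm up before taking the full load.
    """
    steps = max(1, ramp_seconds // step_seconds)
    schedule = []
    for i in range(1, steps + 1):
        users = round(target_users * i / steps)
        schedule.append((i * step_seconds, users))
    return schedule


if __name__ == "__main__":
    # Example: ramp to 200 users over one minute in 15-second steps.
    for elapsed, users in ramp_schedule(200, ramp_seconds=60, step_seconds=15):
        print(f"t={elapsed:>3}s  users={users}")
```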
The other benefit of this low-budget load testing approach is that your team members are accessing the site from geographically diverse locations, similar to the way your site will be accessed in the wild. This introduces elements like increased request latency, and it also means your load balancer (if you have one) is likely to distribute the load more evenly than it would if everything came from the same IP address or local network, depending on your configuration.
I hope this was useful since I don't see much on the CF lists in the way of practical load testing on a budget. If anyone would like to see more on this topic let me know since I have done quite a bit over the last year while working with the Chicago Park District site. Once per quarter we run tests to ensure that the registration application section can support 3000 registrants in under 3 minutes.
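A target like 3000 registrants in under 3 minutes is easier to test against once it is turned into a rate. The sketch below does that conversion. The number of page requests per registration is a hypothetical figure used only to show the second step, since the real count depends on the registration flow.

```python
def required_rate(registrants, window_seconds):
    """Completed registrations per second needed to hit the target."""
    return registrants / window_seconds


def required_requests_per_second(registrants, window_seconds, requests_per_registration):
    """Page requests per second, given an assumed number of requests per registration."""
    return required_rate(registrants, window_seconds) * requests_per_registration


if __name__ == "__main__":
    rate = required_rate(3000, 3 * 60)
    print(f"{rate:.1f} registrations per second")

    # Assumption for illustration: each registration takes 5 page requests.
    rps = required_requests_per_second(3000, 3 * 60, requests_per_registration=5)
    print(f"{rps:.1f} requests per second")
```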