I’ve wanted to create and test a large-scale application for a very long time but have never really had the chance until recently. The Vintage Experience project I did earlier this year finally gave me the opportunity. As one of many parts of the project, I was tasked to create a voting system that could handle 1 million votes via a web page in a 30 second time span. The final system was deployed successfully without any problems for Gala Artis 2013 (a French Canadian artist/TV awards show). The following are the results of my implementation and testing.
The main front-end was done via a static HTML page (smart-phone optimized) that was hosted by Amazon S3, where handling 33k requests/second is a drop in the bucket. All voting requests were done via AJAX from this web page to backend servers hosted by Amazon EC2.
The backend was programmed in GoLang as a simple web server, optimized for memory and speed, which spawned a new goroutine for each incoming request. The request returned a message to the user stating either the error message, or success if the vote was added to the database. Each server held a single persistent MySQL connection to an Amazon RDS “Large DB Instance” with the minimum IOPS (1000). Votes from a server were sent to the database in batches once a second, or earlier if 10,000 votes had been received.
The servers were Amazon “M1 Standard Extra Large” (m1.xlarge) instances running Linux, of which there were 6 total, handling vote requests delegated by a round-robin DNS on Amazon’s Route 53. During stress testing, each server was found to be able to handle 6800 requests/second, and load was staying under 3, so there were was probably another bottle neck. Running the same tests using php(sapi)+apache(fork), only 4500 requests/second could be executed, and there was a 16+ load.
On the servers, I found it necessary to set the following sysctl setting to increase performance “net.core.somaxconn=1024”. The following commands need to be run to execute this:
sysctl 'net.core.somaxconn=1024' #Store for the current computer session
echo 'net.core.somaxconn=1024' >> /etc/sysctl.conf #Set after a reboot
Stress test client instances were also run on Amazon as m1.xlarge instances, and were found to be able to push 5000-6000 requests/second. The GoLang test clients spawned 200 requests at a time and waited for them to finish before sending the next batch. The client system needed the following sysctl settings for optimal performance:
net.ipv4.tcp_tw_recycle=1
net.ipv4.tcp_tw_reuse=1
net.ipv4.tcp_fin_timeout=30
net.ipv4.ip_local_port_range="15000 65534"