Friday, October 19, 2007

Distributing Selenium Tests

One of the nice things about using Selenium is that it runs in the browser, which allows you to test multiple browsers using the same test suite. The problem with doing this is that it is time consuming for any test suite of significant size. The best way to solve this problem is to run the test suite on multiple browsers in parallel. This greatly cuts down on the amount of time that it takes to run the tests while still giving you the same coverage.

Generally speaking, there are two ways to parallelize the execution of Selenium test suites: one is to use the Selenium Remote Control, and the other is to use the resultsUrl post back. The Selenium Remote Control executes Selenium commands on a remote server one command at a time. This involves installing a small server-type Java program on each machine where you want to run Selenium commands remotely, then authoring a program (or using one of the pre-baked ones) that connects to one of the remote machines and executes a Selenium test one command at a time. The execution loop looks like this:
  1. Connect to remote server
  2. Load Selenium test file
  3. Pop command off of the top of the Selenium test file stack
  4. Send command to remote server
  5. Wait for response
  6. If fail, then fail
  7. else goto 3
The Selenium Remote Control server has nice hooks built into it that allow you to control almost any browser on most platforms, so it is straightforward to have one test run on multiple machines using multiple browsers at once.
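
The loop above can be sketched in a few lines of Python. This is only an illustration, not the Remote Control client itself: the `send` function is a hypothetical transport that delivers one command and returns the server's response text, and treating a response that starts with "OK" as success is an assumption of the sketch.

```python
from collections import deque
from typing import Callable, Iterable, Tuple

# (command, target, value), like one row of a Selenium test table
Command = Tuple[str, str, str]


def run_test(commands: Iterable[Command], send: Callable[[Command], str]) -> bool:
    """Send commands one at a time and stop at the first failure.

    `send` is a hypothetical transport supplied by the caller. A response
    starting with "OK" counts as success (an assumption of this sketch).
    """
    pending = deque(commands)            # the loaded test file
    while pending:
        command = pending.popleft()      # take the next command off the top
        response = send(command)         # send it and wait for the response
        if not response.startswith("OK"):
            return False                 # the first failure fails the test
    return True                          # every command succeeded


if __name__ == "__main__":
    # A fake transport standing in for a real remote server.
    def fake_send(command: Command) -> str:
        return "ERROR: element not found" if command[1] == "missing" else "OK"

    steps = [("open", "/", ""), ("click", "missing", ""), ("click", "submit", "")]
    print(run_test(steps, fake_send))
```

Every command is a full round trip between client and server, which is why a slow or stalled client affects the test itself.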

At CustomInk this was the first approach that was tried, executing one command at a time through custom Selenium test files. It is highly customizable because you are, more or less, developing everything from scratch. The Selenium Remote Control gives you a basic framework on which to develop a broader system but the actual execution and coordination of the tests is up to the client program.

After some experimentation, this approach was dropped at CustomInk for two main reasons. First, sending one command at a time through the Selenium Remote Control is fragile. If there are any timing problems on either the client or the server, or the client just gets bogged down for whatever reason, the tests will fail spectacularly. Second, developers needed to be able to run individual tests or sets of tests against the changes they were making, much like unit tests are used. At CustomInk, Selenium is used not only as part of QA but also as part of the development process to keep code stable and robust. Running tests locally and running them in the distributed environment ended up working differently, both because the test formats differ (one is HTML tables and the other is something else) and because they execute so differently, one locally in the browser and the other through the Remote Control. This turned out to be a problem: two file formats had to be maintained, and failures became hard to reproduce across the two environments, which made Selenium less useful as a debugging tool.

The other approach is to use the resultsUrl mechanism to build a distributed method for executing Selenium tests on multiple machines over multiple browsers. With resultsUrl, a whole test suite is run to completion on a remote server and the results of all of those tests are then posted back to a central location. This has two advantages. It is more stable, since entire suites run remotely rather than one command at a time, so far less communication has to be coordinated between client and server. And the same tests that developers run locally in their browsers are executed remotely in a browser. This leads to a single test format and makes problems much easier to reproduce.

This technique is still built on the Selenium Remote Control, so there is a server running remotely and a client program that sends Selenium commands to it. The difference is that the only command the client sends to each server is to open and execute a whole test suite, deployed on a staging server, in a specific browser, passing in a parameter called resultsUrl. An example of the URL that the Selenium Remote Control client would tell the server to open, in a specific browser, would be: http://staging/selenium/TestRunner.html?test=tests&auto=true&resultsUrl=http://seleniumhub?execution_number=53. The auto parameter tells Selenium Core to execute the test suite automatically, and the resultsUrl parameter tells Selenium Core to post the results of the test suite execution back to that URL. So the execution loop would look like this:
  1. Connect to remote server
  2. Send command to open the URL http://staging/selenium/TestRunner.html?test=tests&auto=true&resultsUrl=http://seleniumhub?execution_number=53 in IE6
  3. Client code finishes
  4. On the server, IE6 opens and runs the test suite to the end
  5. The results of the tests are posted back to the URL specified through resultsUrl
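
Because the client's job ends once the URL has been opened, starting the same suite on several machines and browsers at once is mostly a fan-out. A rough sketch, assuming a hypothetical `launch(host, browser, url)` helper that connects to the Remote Control server on `host` and tells it to open `url` in `browser` (the helper, the host names, and the function name `dispatch` are all made up for illustration):

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# (remote host, browser name), for example ("qa-box-1", "IE6")
Target = Tuple[str, str]


def dispatch(targets: List[Target], suite_url: str,
             launch: Callable[[str, str, str], None]) -> Dict[Target, str]:
    """Ask every target to open `suite_url`, all at the same time.

    `launch` is a hypothetical helper supplied by the caller. It only has
    to start the suite; the suite keeps running in the remote browser.
    """
    def start(target: Target) -> str:
        host, browser = target
        try:
            launch(host, browser, suite_url)
            return "started"
        except Exception as error:   # one unreachable box should not stop the rest
            return f"not started: {error}"

    if not targets:
        return {}
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        outcomes = list(pool.map(start, targets))
    return dict(zip(targets, outcomes))


if __name__ == "__main__":
    def fake_launch(host: str, browser: str, url: str) -> None:
        if host == "down-box":
            raise ConnectionError("connection refused")

    url = ("http://staging/selenium/TestRunner.html?test=tests&auto=true"
           "&resultsUrl=http://seleniumhub?execution_number=53")
    print(dispatch([("qa-box-1", "IE6"), ("down-box", "Firefox")], url, fake_launch))
```

The return value only says whether each suite was started. Whether it passed is learned later, from what gets posted back.
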
This approach involves much less communication between the client and server portions of the Selenium Remote Control, meaning that there are fewer timing and network issues, but it also requires that the tests be deployed on some centralized server and that some URL exists to collect and display the results of individual tests. Here is how all of the pieces work together:

Since all of the code that Selenium is testing, including the test suite itself, is hosted on the Staging Box, reproducing problems is straightforward. Anyone can point their browser at the Staging Box and run the exact same test that the distributed test suite just ran.
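
The URL that the distributed run opens can be assembled by a small helper. The sketch below uses the parameter names from this post (test, auto, resultsUrl, execution_number); the function name `suite_url` is made up for illustration. It also percent-encodes the nested results URL, which assumes the TestRunner decodes its query parameters in the usual way.

```python
from urllib.parse import urlencode


def suite_url(runner: str, suite: str, results_url: str, execution_number: int) -> str:
    """Build the TestRunner URL for one distributed run."""
    # The execution number rides along on the results URL so the collector
    # can group results from different browsers into one run.
    post_back = f"{results_url}?{urlencode({'execution_number': execution_number})}"
    # urlencode percent-encodes the nested URL, so its own "?" and "=" are
    # not mistaken for part of the TestRunner query string.
    query = urlencode({"test": suite, "auto": "true", "resultsUrl": post_back})
    return f"{runner}?{query}"


if __name__ == "__main__":
    print(suite_url("http://staging/selenium/TestRunner.html", "tests",
                    "http://seleniumhub", 53))
```

A developer reproducing a failure would open the same TestRunner page without the auto and resultsUrl parameters and run the suite by hand.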

The piece of code that resultsUrl posts back to can be very simple. Here at CustomInk we have a simple Rails project that just accepts results (along with the execution_number) and stores all the results with the same execution number to disk. It then allows people to go in and browse the results from the whole distributed test suite by execution number. CustomInk calls this the SeleniumHub. The downside is that this is code you have to develop yourself; the upside is that it isn't very complex code.
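
To give a feel for how small such a collector can be, here is a sketch in Python using only the standard library. It is not the SeleniumHub (which is a Rails project), and everything named here is an assumption for illustration: the storage directory, the file naming, the port, and the function names. It stores whatever form fields are posted without assuming what they are called.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path
from urllib.parse import parse_qs, urlsplit

RESULTS_DIR = Path("selenium_results")   # hypothetical storage location


def store_results(base_dir: Path, execution_number: str, form_body: str) -> Path:
    """Save one posted result set under its execution number."""
    # Keep every posted field as-is; no field names are assumed here.
    parsed = parse_qs(form_body, keep_blank_values=True)
    fields = {key: values[-1] for key, values in parsed.items()}
    run_dir = base_dir / execution_number
    run_dir.mkdir(parents=True, exist_ok=True)
    # One file per browser that reports back for this run.
    count = len(list(run_dir.glob("result_*.json")))
    path = run_dir / f"result_{count + 1}.json"
    path.write_text(json.dumps(fields, indent=2))
    return path


def load_results(base_dir: Path, execution_number: str) -> list:
    """Return every stored result set for one run, sorted by file name."""
    run_dir = base_dir / execution_number
    if not run_dir.is_dir():
        return []
    return [json.loads(p.read_text()) for p in sorted(run_dir.glob("result_*.json"))]


class ResultsHandler(BaseHTTPRequestHandler):
    """Accept POSTs such as /?execution_number=53 and store the form body."""

    def do_POST(self):
        query = parse_qs(urlsplit(self.path).query)
        execution_number = query.get("execution_number", [""])[0]
        if not execution_number.isdigit():   # also keeps odd values out of file paths
            self.send_error(400, "execution_number must be a number")
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length).decode("utf-8")
        store_results(RESULTS_DIR, execution_number, body)
        self.send_response(200)
        self.end_headers()


def serve(port: int = 8000) -> None:
    """Run the collector until interrupted."""
    HTTPServer(("", port), ResultsHandler).serve_forever()
```

Calling `serve()` starts the collector. A page for browsing results would be built on `load_results`.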

CustomInk also integrated the execution of these tests with Cruise Control RB. This is straightforward to do, but a script has to be written to kick off the tests and then return passed or failed based on the results that come back. The biggest problem with doing this is that a complete integration test suite built in Selenium normally takes too long to execute to be run continuously. To get around this problem, it is straightforward to create a "Smoke Test" test suite that runs a subset of the whole suite in a reasonable amount of time. Again, this is not difficult, and it allows our Continuous Integration Server to run not only our unit tests but also a Selenium smoke test suite.
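
The deciding part of such a script might look like the sketch below. It is an illustration under stated assumptions, not the script used at CustomInk: `fetch` is a hypothetical function that returns the result sets collected so far for one execution number, and each result set is assumed to carry a "result" field equal to "passed" on success.

```python
import time
from typing import Callable, Dict, List

ResultSet = Dict[str, str]


def wait_for_results(fetch: Callable[[], List[ResultSet]], expected: int,
                     timeout: float = 1800.0, interval: float = 10.0,
                     sleep: Callable[[float], None] = time.sleep) -> List[ResultSet]:
    """Poll until `expected` browsers have reported back or time runs out."""
    deadline = time.monotonic() + timeout
    results = fetch()
    while len(results) < expected and time.monotonic() < deadline:
        sleep(interval)
        results = fetch()
    return results


def build_status(results: List[ResultSet], expected: int) -> int:
    """Return a process exit code: 0 if the run passed, 1 otherwise."""
    # A browser that never posted back counts as a failure; otherwise a
    # crashed browser would look like a green build.
    if len(results) < expected:
        return 1
    # Assumption: "result" is "passed" when a suite succeeded.
    return 0 if all(r.get("result") == "passed" for r in results) else 1


if __name__ == "__main__":
    posted = [{"result": "passed"}, {"result": "failed"}]
    print(build_status(posted, expected=2))
```

A real script would pass the value from `build_status` to `sys.exit` so that the Continuous Integration Server sees a failed build.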

Overall, this approach has proven to be robust and repeatable. The largest problems occur when individual tests have timing issues built into them, especially if they are testing AJAX applications, and fail intermittently. That is not a problem with this approach, however, but rather a problem with your tests.
