Tuesday, June 23, 2015

A Stable and Maintainable Automation Framework (Using Protractor for a Non-AngularJS Web Application)



A good automation framework should be:

  • Easy to write
  • Easy to maintain
  • Fast
  • Stable



The current automation framework for the system I am working on fails all of those criteria. It is not easy to read or maintain: one UI element change requires a full-text search to fix all the impacted code. It is so slow that it is scheduled to run only once per day. And it is not stable: when it fails, the usual reaction is, “let us wait for another run.”



I set out to rewrite the automation framework using Protractor. Protractor was designed for testing AngularJS web applications, but my application was not written in AngularJS, so it took a lot of work to make the framework fast and stable.

The resulting architecture has three layers, from top to bottom: test cases, domain objects, and an element handler.

Test cases



The top layer is test cases. The focus of this layer is readability: a test case should be readable even to a non-technical person. The reason to stress readability is to make sure this framework can be leveraged to write feature acceptance tests, so that a non-technical person (a product manager or a user) can read a test case and verify that it complies with the requirements.

My system is a ticket handling system, and one test case reads like this:
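(A reconstructed sketch of such a test case; the ticket domain object and its method names are illustrative, not the actual code.)

// A hypothetical test case built from domain objects.
describe('Ticket handling', function () {

    it('creates and assigns a ticket', function () {
        ticket.create({title: 'Printer is broken', priority: 'High'});
        ticket.assign({assignee: 'alice'});
        ticket.verify({status: 'Assigned', assignee: 'alice'});
    });
});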




A non-technical person, even one who has no idea what the system under test does, can understand the test case.

Domain objects

The middle layer is domain objects. Domain objects are designed with troubleshooting in mind.

Every domain method represents one business action. Some business actions are repeatable; for example, one can save a ticket multiple times. Others are not repeatable; for example, after a ticket is assigned, it can’t be assigned to another person until it is first unassigned.

A test case can fail at any business action. When it fails at one action, you do not want to troubleshoot by starting all over again; you want to start from the failed business action. Because every business action is encapsulated in one method, you can comment out the previously successful methods and start from the failed one.

Every business action works on a business object. Every domain method has two parameters: the first is a map defining the properties of the business object, and the second is a map defining how to search for the business object.
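A minimal sketch of what such a method might look like (the Ticket domain object, its open helper, and the elementHandler calls are hypothetical names, not the actual code):

// props:  map of properties to set on the business object
// search: optional map describing how to find an existing business object
Ticket.prototype.assign = function (props, search) {
    if (search) {
        this.open(search);  // e.g. {ticketNum: '1234'}: locate the ticket first
    }
    elementHandler.setTextByName('assignee', props.assignee);
    elementHandler.clickByName('assignButton');
};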

Domain objects work on UI elements: they locate UI elements by ID, name, XPath, or text, and apply actions to them. The test case layer knows nothing about UI elements.



To start from the failed business action, comment out the previously successful methods, and for the failed method, add the second parameter to search for the business object:
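(Hypothetical code; the method names and the ticketNum search key are illustrative.)

// ticket.create({title: 'Printer is broken', priority: 'High'});  // already passed, commented out
ticket.assign({assignee: 'alice'}, {ticketNum: '1234'});            // rerun from the failed action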




With this architecture, the domain objects will be enriched over time to cover most functions, and writing a test case will become like assembling domain objects: easy and efficient.


Element Handler



The bottom layer is the messiest one: it handles UI elements. For AngularJS applications, Protractor has a mechanism to wait for the page to load and settle down (so I have heard; I’ve never tried Protractor on an AngularJS application), but my application was not written in AngularJS, so I had to work out my own way to tell whether a UI element is ready to be worked on.

This layer has many methods to make handling UI elements easy:


findByID
findByName
setTextByName
setTextByXPath
isElementByIdVisible
isElementByNameVisible
isElementByXPathVisible
isElementByXPathEnabled
     …


You might wonder why I have separate methods for different locators (by ID, by name, by XPath, etc.), and why I didn’t make the framework more tolerant by trying to locate an element with several locators in turn.

UI elements change frequently. In one release, a UI element’s name is “ticketNum”; in the next release, the name is gone and its ID becomes “ticketNum”. If the framework first tried to locate by name and then by ID, the code would not need to change.

It is a good idea, but it slows down automation. Take clicking a button as an example. To make a click stable, the framework goes through these steps (a sketch of the implementation follows the list):

  1. Locate the element by.name("") (or by another locator)
  2. Keep polling until the element is visible, or time out
  3. Keep polling until the element is enabled, or time out
  4. Click it
  5. Catch any error thrown from step 4 and repeat the click
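A sketch of what such a stable click might look like, assuming Protractor’s element and by APIs (the ElementHandler name and the TIMEOUT constant are illustrative assumptions):

ElementHandler.prototype.clickByName = function (name) {
    var el = element(by.name(name));           // step 1: locate by name
    browser.wait(function () {                 // step 2: poll until visible or timeout
        return el.isDisplayed();
    }, TIMEOUT);
    browser.wait(function () {                 // step 3: poll until enabled or timeout
        return el.isEnabled();
    }, TIMEOUT);
    return el.click().then(null, function () { // steps 4-5: click, retry once on error
        return el.click();
    });
};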



If the framework first tries by name and then by ID, the polling time in the second step is wasted on a locator that will never match. You might think such waste is negligible for the sake of making the framework more tolerant, but my application is really clunky: it can fail for all kinds of reasons, so I want it to fail fast.

The second reason is that all UI locating happens inside the domain objects; if a UI locator changes, usually only the methods inside one domain object need to change, so the maintenance cost is really low.

The third reason is that by keeping the automation fast, I can also use it as a stress test.

Test Case Gluer


Some test cases are interdependent: the business objects created by one test case are used by the following test cases. A Glue object is used to store such objects.
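A hypothetical example of how the glue might be used (the ticketNum key and the assumption that ticket.create returns the new ticket’s number are illustrative):

// First test case: stash the new ticket's number in the glue object.
glue.ticketNum = ticket.create({title: 'Printer is broken'});

// A later test case: find the same ticket through the glue.
ticket.assign({assignee: 'alice'}, {ticketNum: glue.ticketNum});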

The Glue object is also used to record test cases’ start times. Every case’s beforeEach method invokes util.initCase to record the start time:



Util.prototype.initCase = function () {
    // Name the entry after the running spec, so each case gets its own timestamp.
    var caseName = replaceSpaceWithDash(jasmine.getEnv().currentSpec.description);
    glue[caseName] = new moment();
    // The first case to run also records the overall start time.
    if (_.isUndefined(glue.startTime)) {
        glue.startTime = glue[caseName];
    }
};

And every test case’s afterEach method invokes util.finishCase to calculate and print each case’s run time as well as the total run time:


Util.prototype.finishCase = function () {
    var caseName = replaceSpaceWithDash(jasmine.getEnv().currentSpec.description);
    var endTime = new moment();
    console.log("################The run time for " + caseName + " is " + endTime.diff(glue[caseName], "minutes") + " minutes");
    console.log("################The total run time is " + endTime.diff(glue.startTime, "minutes") + " minutes");
};


Every test case takes this form:

describe('Automation', function () {

    beforeEach(function () {
        util.initCase();
    });

    afterEach(function () {
        util.finishCase();
    });

    // it(...) blocks go here
});



Error Handling

Because the test cases are interdependent, if one test case fails, the rest should not be started. There is an open Jasmine issue tracking this: https://github.com/jasmine/jasmine/issues/414.
Until it is fixed, I work around it in util.initCase:

Util.prototype.initCase = function () {
    // The application is not AngularJS, so do not wait for Angular to settle.
    browser.ignoreSynchronization = true;

    if (!glue.bailFastOptionSet) {
        // Once any spec fails, filter out all remaining specs.
        jasmine.Env.prototype.bailFast = function () {
            var env = this;
            env.afterEach(function () {
                if (!this.results().passed()) {
                    env.specFilter = function (spec) {
                        return false;
                    };
                }
            });
        };

        jasmine.getEnv().bailFast();

        glue.bailFastOptionSet = true;
    }
};


Upon failure, the framework takes a screenshot of the moment and stores it; this is done in util.finishCase:

Util.prototype.finishCase = function () {
    var caseName = replaceSpaceWithDash(jasmine.getEnv().currentSpec.description);
    var passed = jasmine.getEnv().currentSpec.results().passed();
    var screenshotsDir = path.resolve(reportDir + glue.startTime.format('MMDD_HHmm'));

    var self = this;
    if (!passed) {
        co(function *() {
            if (!fs.existsSync(screenshotsDir)) {
                fs.mkdirSync(screenshotsDir);
            }

            var file = path.resolve(screenshotsDir + '/' + caseName + '.png');
            browser.takeScreenshot().then(function (png) {
                console.log('Writing screenshot to file ' + file);
                fs.writeFileSync(file, png, {encoding: 'base64'});
            }, console.log);
        });
    }
};