Tuesday, January 28, 2014

Writing selenium tests with style!

I know it's been more than a while since I last blogged, but there's actually a good reason for it: for the better part of the last year I've been reviving a legacy application sentenced to death by decommission. With success, I might add :) Now everybody wants a piece of it - but that's not what I wanted to talk about...

During this exercise, besides the obvious technology updates like switching to Tomcat from a godforsaken version of some other application server and making the project actually compile in a continuous integration environment, yada yada yada, I've fiddled a little bit with Selenium tests to make them at least do a sanity check while I was making all those naughty changes. In doing so I've realized that Selenium can actually be a real beauty if used properly.

Without further ado let's jump into the meat!

The problem

Selenium tests are integration tests and they are hard to write and maintain

The solution

This is actually not so hard and if you structure your tests properly, apply some rules and enforce a couple of design decisions then writing those tests is actually extremely easy and fun!

Let's see how we can turn the beast into a beauty!

Originally the recipe that you get when you arrive at Selenium's getting started page presents you with the basics (and for a good reason): create the driver, navigate to a page, search for an element, interrogate the element for some information. This works really nicely if all you want to do is automate Google's search engine and nothing else.

If you want to build some structure around the pages you test, the Page Object design pattern comes to the rescue. Unfortunately the original documentation fails to mention the @FindBy annotation, which is crucial for making the code look good and perform as expected.

Let's take a look at the following: we have an application (a hello-world style one) that presents one heading on the page with the text "Hello, world!". To describe the page using the @FindBy annotation simply declare the field as a WebElement and annotate it with @FindBy(id = "greeting")

public class HomePage {
    @FindBy(id = "greeting")
    public WebElement greeting;
}

Now to initialize such a page object instantiate it as you'd normally do

HomePage page = new HomePage();

and later on use the PageFactory.initElements(driver, page) method to initialize a set of proxies to elements. The "proxies" part is crucial: the elements don't need to be on the page up front because each one is looked up only when you actually use it, and combined with an implicit wait (or Selenium's AjaxElementLocatorFactory) you get all the usual stuff like waiting for them to load practically for free! It's like magic - only better :)
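To make the "proxy" idea a bit more tangible, here's a minimal sketch in plain Java. It is not Selenium's implementation: Element, FakeBrowser and LazyElement are hypothetical stand-ins I made up for illustration. The only point is that the lookup happens when the element is used, not when the page object is created.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a page element; this is not Selenium's WebElement.
interface Element {
    String getText();
}

// Hypothetical stand-in for a browser: maps element ids to their text.
class FakeBrowser {
    final Map<String, String> dom = new HashMap<>();
    int lookups = 0;

    Element findById(String id) {
        lookups++;
        String text = dom.get(id);
        if (text == null) {
            throw new IllegalStateException("No element with id " + id);
        }
        return () -> text;
    }
}

// A hand-written proxy: it remembers how to find the element
// and performs the actual lookup on every use.
class LazyElement implements Element {
    private final FakeBrowser browser;
    private final String id;

    LazyElement(FakeBrowser browser, String id) {
        this.browser = browser;
        this.id = id;
    }

    @Override
    public String getText() {
        return browser.findById(id).getText();
    }
}
```

Creating a LazyElement for "greeting" before the element exists is fine; it only fails if you call getText() while the element is still missing.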

The Problem

Applications have structure and I need to repeat lots of elements

The Solution

Applications have structure. Their individual pages are not disconnected from each other, each presenting totally different content. Usually pages are contained in some form of layout that acts as a common experience for the user. Luckily for us we have decided to describe pages as classes and classes can do ... inheritance!

Let's say the HomePage belongs to some system whose layout has a logout button, so the users can take their leave if they get bored using it.

public abstract class ApplicationPage {
    @FindBy(id = "logout")
    public WebElement logout;
}

Next we declare our HomePage as before, only denoting the application structure using inheritance:

public class HomePage extends ApplicationPage {
    @FindBy(id = "greeting")
    public WebElement greeting;
}

And you instantiate it as usual:

HomePage page = new HomePage();
PageFactory.initElements(driver, page);

This way we're expressing an "is-a" relation between the layout page and the actual page: the home page is an application page. In Java 8 we'll be able to put some of the stuff directly into interfaces and have even more capability to construct pages from functional bits and pieces. Until then, for most cases this is more than enough.
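To give a rough idea of what composing pages from interfaces could look like with Java 8 default methods, here's a minimal sketch. It deliberately doesn't touch Selenium: ClickRecorder, PageFragment, HasLogout, HasSearchBox and ShopHomePage are all hypothetical names, and the "browser" merely records which element ids were clicked.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical browser stand-in that records clicks by element id.
class ClickRecorder {
    final List<String> clicked = new ArrayList<>();

    void click(String id) {
        clicked.add(id);
    }
}

// Every fragment only needs access to the browser.
interface PageFragment {
    ClickRecorder browser();
}

// A reusable bit of behavior: anything with a logout button.
interface HasLogout extends PageFragment {
    default void logout() {
        browser().click("logout");
    }
}

// Another reusable bit: anything with a search box.
interface HasSearchBox extends PageFragment {
    default void search() {
        browser().click("search-button");
    }
}

// The page is put together from the fragments it contains.
class ShopHomePage implements HasLogout, HasSearchBox {
    private final ClickRecorder browser;

    ShopHomePage(ClickRecorder browser) {
        this.browser = browser;
    }

    @Override
    public ClickRecorder browser() {
        return browser;
    }
}
```

Unlike single inheritance, a page can mix in as many of these fragments as it actually shows on screen.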

The Problem

My integration tests are unreadable because of all the fiddling with Selenium.

The Solution

This is actually a huge problem, and not only with Selenium but with any other integration or unit tests too. At least in unit tests we have the universal layout ("given/when/then", or if you're more of a Microsoft type of guy then it'd be "arrange/act/assert"). But what about integration tests? They are expected to have assertions in the middle, they are expected to manipulate our application in many ways and verify the state in the middle because that's what the user would do!

No worries! There's this perfect little design pattern called fluent interface that we can easily employ to nicely structure our tests and to have a cool, reusable place for everything! Let's start with the description of the test we're about to write:

- the user navigates to the login page
- verify that the login prompt actually says "Username" and "Password"
- upon entering the proper credentials and clicking "Login" the user lands on the home page
- verify that the login succeeded by checking the page's header or some other element

I suggest you take the description and follow it to the letter using a fluent interface, which might look something like this:

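public class LoginTest {
    public void will_login_properly() {
        new LoginPage(driver, "http://localhost/myapp") // driver: a WebDriver field set up elsewhere
            .assertHasProperUsernameLabel()
            .assertHasProperPasswordLabel()
            .enterCredentials("johndoe", "secret")
            .clickLogin()
            .assertPageTitleIs("Welcome to this cool application!");
    }
}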

As you can see I've deliberately shifted all the specifics to the page objects, including navigation which is especially neat if you're working in an IDE and if you already have a set of page objects to work with. That way the IDE will actually tell you what you can do after you have "executed" a particular action (clickLogin for example).

Constructing such page objects isn't anything extremely sophisticated. Let's take a look at the LoginPage:

public class LoginPage {
    private final WebDriver driver;

    @FindBy(id = "username")
    WebElement username;

    @FindBy(id = "username-label")
    WebElement usernameLabel;

    @FindBy(id = "password")
    WebElement password;

    @FindBy(id = "password-label")
    WebElement passwordLabel;

    @FindBy(id = "login-button")
    WebElement loginButton;

    public LoginPage(WebDriver driver, String url) {
        this.driver = driver;
        // Navigate only if a URL was given; otherwise assume we're already on the page
        if (url != null && url.length() > 0) {
            driver.get(url);
        }
        PageFactory.initElements(driver, this);
    }

    public LoginPage assertHasProperUsernameLabel() {
        Assert.assertEquals("Username:", usernameLabel.getText());
        return this;
    }

    public LoginPage assertHasProperPasswordLabel() { ... }

    public LoginPage enterCredentials(...) { ... }

    public HomePage clickLogin() {
        loginButton.click();
        return new HomePage(driver);
    }
}

As you can see the assertions are contextual and can be re-used at will in any test scenario. The same goes for any operation you'd normally run on pages, like filling in the login form and clicking on the login button: you can reuse them as many times as you want. If you combine that with inheritance you'll get an extremely powerful way to describe your application in an object-oriented way. With the fluent interface you'll get a chance to exercise that model in a way that will not make your eyes bleed when you get back to the code to fix the location of that one button that has moved, and all of a sudden all integration tests are failing.
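The rule behind the fluent style is simple: a method that stays on the page returns this, and a method that navigates returns the page object of the destination. Here's a minimal sketch of just that rule, runnable without a browser. FakeApp, FluentLoginPage and FluentHomePage are hypothetical stand-ins for a real application and real Selenium-backed page objects, and the hard-coded credentials are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory application standing in for a browser session.
class FakeApp {
    String currentPage = "login";
    final Map<String, String> fields = new HashMap<>();

    void submitLogin() {
        boolean ok = "johndoe".equals(fields.get("username"))
                && "secret".equals(fields.get("password"));
        if (ok) {
            currentPage = "home";
        }
    }
}

class FluentLoginPage {
    private final FakeApp app;

    FluentLoginPage(FakeApp app) {
        this.app = app;
    }

    // Stays on the same page, so it returns this to keep the chain going.
    FluentLoginPage enterCredentials(String user, String password) {
        app.fields.put("username", user);
        app.fields.put("password", password);
        return this;
    }

    // Navigates away, so it returns the page object of the destination.
    FluentHomePage clickLogin() {
        app.submitLogin();
        return new FluentHomePage(app);
    }
}

class FluentHomePage {
    private final FakeApp app;

    FluentHomePage(FakeApp app) {
        this.app = app;
    }

    // Contextual assertion that can be reused by every test landing here.
    FluentHomePage assertIsDisplayed() {
        if (!"home".equals(app.currentPage)) {
            throw new AssertionError("Expected home page but was " + app.currentPage);
        }
        return this;
    }
}
```

A test then reads as one chain: new FluentLoginPage(app).enterCredentials("johndoe", "secret").clickLogin().assertIsDisplayed().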

The Problem

My integration tests run every time I build the project using Maven and it takes too much time

The Solution

The solution lies in proper Maven project configuration. Maven has this nice idea of profiles. I'm using a profile called "integration-test" to actually run everything integration-test-related only if that profile is enabled.

It's quite a lot of XML, as you can probably imagine, so I've prepared an example project that demonstrates the configuration. The crucial part in all this is the configuration of the surefire, failsafe and tomcat7 plugins. You can grab the example here. If you do a mvn clean install then no integration tests will get executed, but with mvn clean install -Pintegration-test you'll see Tomcat starting for the duration of the tests and all the Selenium tests executed against it.


We've been using this way of writing integration tests with great success for more than a few months now. It works great if you have regular web applications (request/response), and adapting it to dynamic pages isn't all that difficult, mainly because the @FindBy annotation does such an amazing job of hiding the complexity of Selenium.

Happy integration testing!


Maciej Gawinecki said...

Why do you make assertions part of your PageObjects? I remember Martin Fowler discouraging that, because "their responsibility is to provide access to the state of the underlying page. It's up to test clients to carry out the assertion logic. [...] I think you can avoid duplication by providing assertion libraries for common assertions - which can also make it easier to provide good diagnostics.". For instance, you could have custom Hamcrest matchers.

Matthias Hryniszak said...

In the approach I took the page object is a domain object. In that sense it contains both the state and the operations one can perform on it - including asserting certain state.

However, since I am a pragmatist more than a purist, there is a much more prosaic explanation for me doing so. Let me explain it from the opposite direction. In unit tests what we'd like to assess is the result of one unit of work (sometimes a whole method, sometimes just part of it and sometimes the result of many methods) working as designed. In this situation the arrange/act/assert pattern works really nicely and it makes no sense whatsoever to put the actual assertion anywhere but inside the test itself. Integration tests on the other hand (and Selenium tests in particular) exercise whole pieces of functionality (hence the name: functional tests), and it is more than probable that they go through the same part of the system more than once, for example during data preparation or when 2 independent tests assess 2 variants of the same form. Keeping the assertions in the page object reduces code duplication and at the same time limits the actual interface of the page object: it exposes a domain method (we're in the domain of integration testing) rather than state accessors.

To summarize: I'd rather have methods that navigate to certain parts of the system and assess whether the expected navigation took place properly, and be able to easily keep the fluent flow, than forcibly move the assertion from the page object to the test just for the sake of purity.

However, nothing stands in the way of exposing an "assert" method that takes a closure or Hamcrest expression, so that the part that does the check is encoded inside the test. Even more, with the use of Groovy's closures the actual assertion body can have intimate access to private parts of the page if need be, although I'd keep that solution as a last resort.
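A minimal sketch of such an "assert" method, assuming Java 8 lambdas in place of Groovy closures or Hamcrest matchers. GreetingPage, greeting() and verify() are hypothetical names, and the page holds a plain string instead of a real WebElement so the example runs on its own.

```java
import java.util.function.Predicate;

// Hypothetical page object; the text would normally come from a WebElement.
class GreetingPage {
    private final String greetingText;

    GreetingPage(String greetingText) {
        this.greetingText = greetingText;
    }

    String greeting() {
        return greetingText;
    }

    // The check itself lives in the test; the page only applies it
    // and returns this so the fluent chain is not broken.
    GreetingPage verify(String description, Predicate<GreetingPage> check) {
        if (!check.test(this)) {
            throw new AssertionError("Expected: " + description);
        }
        return this;
    }
}
```

In a test this would read something like page.verify("greets the world", p -> p.greeting().startsWith("Hello")), with further calls chained after it.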

Maciej Gawinecki said...

Now I understand your point.

I agree that following rules blindly for the sake of purism doesn't make sense.

But there will definitely be real situations that may require refactoring Page Objects.

1) For instance, once the same PageObject is reused across different tests, the amount of code in the PageObject class may start to grow more and more. Different tests may verify different aspects of the same page element, or different aspects of the page.

2) Another situation is when we observe that similar conditions are verified across different PageObjects, so the code of the same assertions may be duplicated.

Both problems can be attacked by either creating a hierarchy of PageObjects with common assertions in a parent PageObject or by delegating assertions to external classes, e.g. custom parametrized matchers. The latter solution will additionally encourage other testers to reuse assertions/matchers in PageObjects we haven't envisioned in the hierarchy of PageObjects.

Matthias Hryniszak said...

I myself am more for creating a proper inheritance structure (if the application actually depicts such a structure) but backed up by reusable components (more about that to come). There are some good examples but the general idea is more or less like this: if you log in to some application it'll most likely have a set of common interface elements that you can easily put in the parent class. Every page that inherits from that base class depicts the "is-a" relation (for example a product page is a shop page). There are situations though where common interface elements (take the select html element for example) are reused all over the application. In that case you'll have a separate class for it that can live outside of the context of the page. Another such example (and a very profound one) is a jQuery UI-based dialog that's being created and shown on demand. In this case you'd want the root element of that dialog (the main div so to speak) to be the encapsulating component, but all the controls inside of it (take the "X" button in the top-right corner) are common and thus can be reused in all dialog implementations. If for example you present such a dialog upon clicking links in 2 different parts of the system then there's nothing standing in your way to reuse that part of your page fragments. As a bonus you can probably make some good use of your partials here and build your reusable components this way.

Matthias Hryniszak said...

I think the main reason I sometimes have a hard time explaining these bits to others is that I naturally think of user interface elements as components. That comes from my Delphi / .NET background, where you intuitively knew you had complex elements to work with that could do more than just present text. Granted, you could overuse that massively, but it gave me a fantastic introduction to the composition pattern, where a user interface otherwise hard to put together became easy and powerful. There are frameworks that try to build on that idea in the web world (e.g. GWT, ExtJS), some of them good, some of them evil, but the message they bring to the table is: stop thinking only about the content (phone, first name, state, whatever) and start implementing the structure as well (shop page, cart page, "are you sure you want to do this" dialog, ads sidebar and so on and so on). That way your code has a much firmer grip on reality, in a way that makes it natural to fix and/or enhance.