NativeDriver and iOS: First Impressions

Written by Dante on October 2nd, 2011

I’ve just taken on a new client who wants help writing automated tests for their iOS and Android applications. It’s still early days in the native app automation space, but there are already quite a few open source and free-to-use tools available. One of my responsibilities for this project will be to evaluate them and decide which to use. I’m planning to take a look at the following options:

Since 80% of my client’s current mobile app users are on iOS, that’s where we’re going to focus our efforts for the moment — and that’s what I’m going to focus on in this post.

Introducing NativeDriver

In recent weeks, I’ve started investigating NativeDriver. NativeDriver attempts to capitalize on the popularity of the Selenium 2.0 (née WebDriver) API, while also (and more importantly, to my mind) using the same architecture as RemoteWebDriver. NativeDriver is split into two parts: a “client” API (currently implemented in Java, although a separate Ruby client also exists), which issues HTTP requests, and a “server” embedded in the application you want to automate, which receives them.
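To make that client/server split a little more concrete, here is a rough, self-contained Java sketch of the same shape: a tiny HTTP server standing in for the one embedded in the app, and a client method standing in for the test-side API. It uses only the JDK's built-in `com.sun.net.httpserver` classes. The `/tap` path, the plain-text payloads, and every class and method name below are made up for illustration; this is not NativeDriver's actual protocol or API.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Illustration only: the "/tap" path and payload format are hypothetical,
// not NativeDriver's real wire protocol.
final class EmbeddedServerSketch {

    // Plays the role of the server embedded in the app under test.
    static HttpServer startFakeAppServer() throws IOException {
        // Port 0 asks the OS for any free port on the loopback interface.
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        server.createContext("/tap", exchange -> {
            String label;
            try (InputStream in = exchange.getRequestBody()) {
                label = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            }
            byte[] reply = ("tapped:" + label).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, reply.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(reply);
            }
        });
        server.start();
        return server;
    }

    // Plays the role of the client API that the test code calls.
    static String sendTap(int port, String label) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
                URI.create("http://127.0.0.1:" + port + "/tap").toURL().openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(label.getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    // Starts the fake server, sends one command, and shuts the server down.
    static String roundTrip(String label) {
        try {
            HttpServer server = startFakeAppServer();
            try {
                return sendTap(server.getAddress().getPort(), label);
            } finally {
                server.stop(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Cart"));
    }
}
```

The point of the sketch is only that the test code never touches the app's internals directly: everything goes over HTTP, which is what makes clients in other languages possible.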

I’ve spent a couple days playing with ND… here are my thoughts, so far:

It Works

I wanted to be able to write that “It Just Works”, but… I followed the instructions on the “Make Your iOS App Testable with NativeDriver” wiki page, and I still had to fiddle with the Xcode configuration as described here and here. (Yes, the post behind that second link is written in Japanese, but all you need to know is that your build settings should match the screenshots.) You’ll also have to make sure that your Xcode project includes the correct preprocessor macros, as detailed here. That said, given that I have essentially zero iOS development experience, it didn’t take long at all to get up and running.

Native Apps Are Not Web Browsers

To be honest, while I really love the WebDriver API, the idea of using it to drive a native mobile application strikes me as a bit odd. WebDriver doesn’t include concepts like device orientation or gestures, because… well, it’s designed to automate web browsers, not native mobile applications. A few days playing with ND has confirmed my suspicions: there are a number of areas where the API is a bit more confusing or awkward than I’d like. Given the youth of this project (and specifically the iOS support, which was released only 7 weeks ago), I’m not going to nitpick every little thing. However, I will say that I think the ND API would benefit if it were no longer coupled via inheritance to the WebDriver API. A NativeDriver client API “inspired by” WebDriver would still provide the benefits of familiarity with the more established tool while also providing a development experience that makes more sense for mobile apps. Looking at the example test provided with the ND source, it’s not hard to imagine a newbie getting confused by the fact that a native mobile application widget is referred to as a WebElement. Huh?

I think that copying the WebDriver architecture makes a lot of sense, though. That paves the way for client APIs in other languages, and maybe even a different client API that’s more specific to the domain of mobile apps.

It Has a Killer Feature

My client’s app is a hybrid: the UI is composed of both native widgets and HTML/JS contained in a UIWebView. This is where NativeDriver really shines: you can find and act on HTML elements the same as if they were native elements. This is because, in addition to lifting the API and architecture from WebDriver, NativeDriver also includes iWebDriver (also known as the server side of the IPhoneDriver). Because UIWebView is a close (but significantly slower) relative of Mobile Safari, iWebDriver makes it possible for NativeDriver to automate web views as well.

This is pretty fantastic for those of us who want to automate hybrid apps — we don’t have to reinvent the wheel (that is, the WebDriver) when the UI transitions to HTML. Very slick.

You Still Have to Design For Testability

No surprise here — as with web applications, you’ll still have to insert metadata into your app’s UI for the tests to be able to drive it in a robust way. The good news is that you’re probably doing this already — NativeDriver makes use of the accessibility labels inserted into your app by Interface Builder. So if you’ve got a button labeled “Cart”, chances are, you’ll be able to find it by calling:

driver.findElement(By.id("Cart"));

Things get a bit more complicated if your app has regions of unmodifiable, data-driven text. For example, in an e-commerce application, you could imagine a product detail page with a product name, description, etc. By default, the accessibility label will be the text itself, which is great if you’re using a screen reader to help you navigate the interface. But it’s not so useful if you know the meaning of the field (for example, “Product Name”), and want to use that to get its value (for example, “Huggies Size 5 30 count”). The fix is easy: you just have to add some explicit labels to these text fields. Simply select the widget in Xcode (which now incorporates Interface Builder directly) and enter the label in the appropriate field.
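Here is a small Java sketch of why the explicit label matters. It deliberately does not use NativeDriver: `Widget` and `textByLabel` are hypothetical stand-ins that model a screen as a list of widgets, each with an accessibility label and some displayed text. The lookup mimics finding an element by its label and then reading its text.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical model of a screen; none of these types come from NativeDriver.
final class LabelLookupSketch {

    static final class Widget {
        final String accessibilityLabel;
        final String text;

        Widget(String accessibilityLabel, String text) {
            this.accessibilityLabel = accessibilityLabel;
            this.text = text;
        }
    }

    // Returns the text of the first widget whose label matches, if any.
    static Optional<String> textByLabel(List<Widget> screen, String label) {
        return screen.stream()
                .filter(w -> w.accessibilityLabel.equals(label))
                .map(w -> w.text)
                .findFirst();
    }

    public static void main(String[] args) {
        String productName = "Huggies Size 5 30 count";

        // Default behaviour: the label is just a copy of the displayed text,
        // so a test that only knows the field's meaning cannot find it.
        List<Widget> defaults = List.of(new Widget(productName, productName));
        System.out.println(textByLabel(defaults, "Product Name").isPresent());

        // With an explicit label, the test can ask for "Product Name"
        // and read back whatever value happens to be displayed.
        List<Widget> labeled = List.of(new Widget("Product Name", productName));
        System.out.println(textByLabel(labeled, "Product Name").orElse("<not found>"));
    }
}
```

With the default labels, the test would need to know the product name in advance just to locate the field, which defeats the purpose of reading it.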

A New Tool Means a Young Community

The dev team have established separate mailing lists for the developer and user communities, and so far, traffic has been pretty light on both. Unfortunately, because very few people outside of the dev team have any significant understanding of how the tool works, there aren’t a lot of folks who can help with anything more complicated than “I can’t get the example test to run”. This will undoubtedly change with time (and, of course, more posts like the one you’re reading now). As of this writing, NativeDriver is a part-time project for the development team, and their participation on the lists and commits to the source repo have been infrequent since the release of iOS support last month. They’ve made it clear that this is not ideal, so here’s hoping that their employer will see fit to let them give this project the attention it deserves.

This is the Bleeding Edge

No surprises here — mobile applications themselves are at the forefront of technology, and the practice of using automated tools to verify them is even newer. As such, bugs in the tools are to be expected — I’ve found (and fixed) one or two myself. Don’t let this put you off — but make sure expectations are set accordingly.

All This and Android, Too

I haven’t looked into this yet, but NativeDriver also provides an implementation of the WebDriver interface that can drive an Android app. In Selenium-land, this architecture means that as long as your tests use that interface to drive the browser, you’re able to swap out the implementation at any time. Typically, you’ll set up your build infrastructure so that you can configure this at runtime, meaning that you can set up a matrix of builds that run your test suite on all of the browser/browser version/OS combinations you care to support.
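As a rough sketch of that "swap the implementation at runtime" idea, the following self-contained Java example picks a driver from a system property. Everything here is hypothetical: `AppDriver`, `IosDriver`, `AndroidDriver`, and the `test.platform` property name are placeholders of my own, standing in for the WebDriver interface and its real implementations.

```java
import java.util.Locale;

// Hypothetical types for illustration; these are not Selenium or NativeDriver classes.
final class DriverSelectionSketch {

    // Stand-in for the shared interface that the tests are written against.
    interface AppDriver {
        String describe();
    }

    static final class IosDriver implements AppDriver {
        @Override
        public String describe() {
            return "iOS app";
        }
    }

    static final class AndroidDriver implements AppDriver {
        @Override
        public String describe() {
            return "Android app";
        }
    }

    // Chooses an implementation by name; tests only ever see AppDriver.
    static AppDriver forPlatform(String platform) {
        switch (platform.toLowerCase(Locale.ROOT)) {
            case "ios":
                return new IosDriver();
            case "android":
                return new AndroidDriver();
            default:
                throw new IllegalArgumentException("Unknown platform: " + platform);
        }
    }

    public static void main(String[] args) {
        // The build passes -Dtest.platform=android (or ios); default to iOS.
        String platform = System.getProperty("test.platform", "ios");
        AppDriver driver = forPlatform(platform);
        System.out.println("Running suite against the " + driver.describe());
    }
}
```

A build matrix would then run the same suite once per value of the property, for example with `-Dtest.platform=android`.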

In theory, this means you could do the same thing with NativeDriver — however, my client has two completely separate codebases for their Android and iOS apps, and the user experiences are not guaranteed to be identical. Call me skeptical that we’ll be able to use the exact same tests to verify both apps. Even if that’s true, though, there’s still a huge benefit to not having to learn a different API (or even a different language) in order to write tests for both platforms.

More to Come

There are still a few questions I’m hoping to resolve in the coming weeks:

  • Will the accessibility labels I’m adding to make the app testable make it less usable? If so, can we do something clever to prevent that?
  • How can I make it trivially easy for the app development team to run the test suite on their local environment?
  • And, of course, I’ll be looking at Android support as well

To be continued…

 

5 Comments so far ↓

  1. Giri Senji says:

    Are you able to find a native element, for example a UIImageView or UISlider, using By.id("an id from the .xib file"), and then perform UITouch events like doubleTap, twoFingerTap, etc. on it?

  2. Dante says:

    I haven’t tried any advanced gestures yet. In theory, the gestures from Selenium’s Advanced User Interactions API should work as normal, so you should be able to do something like:

    new Actions(driver).doubleClick(theElement).build().perform();

    to get a double-click.

    Unfortunately, the AUI API doesn’t model things like two-finger tap at the moment. This seems like an area where NativeDriver will have to distinguish itself from Selenium to better model the domain of mobile apps.

  3. Dante says:

    Update: NativeDriver doesn’t (yet) support them, but Selenium now contains support for touch-specific gestures!

  4. Jake says:

    What about testing on the actual device as opposed to just the emulator? Is that supported?

  5. Dante says:

    This question came up on the nativedriver-users mailing list last month, but so far has gone unanswered.

    In theory, there’s no reason it shouldn’t work, but I’ll have to give it a try and report back.