Monday, October 17, 2011

Automate Business Logic First

At PNSQC 2011 last week, I met some very interesting and smart characters. One of them was Douglas Hoffman, current president of the Association for Software Testing.

One of Douglas’ observations is that API-level E2E testing is 10 times as fast as graphical user-interface (GUI) level testing. He knows this from a very impressive array of industry experience.

Alan Page writes: “For 95% of all software applications, automating the GUI is a waste of time.”

I agree. For a software product in production or a project under development, assuming that there is some degree of separation between the GUI and the business logic it depends on, it’s quicker and more effective to automate the business logic, not the GUI. Some would call this automating the application programming interface (API).

I currently have the privilege of working on a team that does this right: the GUI layer is thin in terms of logic. The business logic happens below the API.
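Here is a minimal sketch of what a check below the GUI can look like. Everything in it is hypothetical: the `Cart` class, its methods and the discount rule are invented for illustration, and it's in Python only for brevity. The check talks to the business logic directly, with no window, no display and no localized strings involved.

```python
from dataclasses import dataclass, field


@dataclass
class Cart:
    """Hypothetical business-logic class; a thin GUI would only call into it."""
    prices_in_cents: list[int] = field(default_factory=list)

    def add(self, price_in_cents: int) -> None:
        if price_in_cents < 0:
            raise ValueError("price must be non-negative")
        self.prices_in_cents.append(price_in_cents)

    def total(self) -> int:
        # Invented rule for illustration: 10% off orders of 100.00 or more.
        subtotal = sum(self.prices_in_cents)
        return subtotal * 9 // 10 if subtotal >= 10_000 else subtotal


def check_discount_applies_at_threshold() -> bool:
    """API-level check: exercises the pricing rule without any GUI."""
    cart = Cart()
    cart.add(6_000)
    cart.add(4_000)
    return cart.total() == 9_000
```

If the GUI is redesigned or localized, nothing in this check has to change, because nothing in it depends on the GUI.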

Here are some things that make automation that includes the GUI expensive:

·         GUI design can change often, because it’s what is displayed to end users. GUI is complex and laden with personal values. Whenever there’s a change in the GUI, any automation that depends on it must be fixed.

·         GUIs must be localized, and this usually means much more than just changing the displayed strings, introducing an additional level of instability to GUI automation.

·         GUI automation is rife with race conditions due to the nature of display.

·         Brian Marick: “Graphical user interfaces are notoriously unstable.”

·         Automating the GUI takes many more dependencies on the product than automating the API, because the GUI is much more complex than the API. A result of this is that the GUI automation is much less atomic than API automation (see an earlier post), and therefore riskier.

·         GUI automation requires an entire display mechanism, at least in software, even if the window is hidden. This involves significant overhead.

I’ve never seen stable GUI automation. What I’ve seen instead is that it’s expensive to keep going with GUI automation, if the team cares about keeping it running.

I’ve seen this many times: automation which includes the GUI fails, and it’s understood – usually correctly, but not always – that it’s just a GUI problem causing instability. This can have the effect of hiding deeper product problems, which can introduce a lot of risk to a project.

Here are some reasons to automate the business logic instead:

·         API automation is simpler and more transparent to write

·         API automation is much more stable

·         API automation runs 10 times as fast as GUI automation (from Douglas Hoffman again) (although, in my experience the difference is even larger)

·         There is no display overhead

·         With API automation, the compiler is your friend (assuming you’re using a strongly-typed language, which you should be; see this post)

·         Failures in API automation are much more likely to be actionable and important …

This last point is huge: If API automation fails for some reason other than dependency failures, timeouts or race conditions (e.g. as mentioned here) then it’s due to some change in the business layers of the product, and this instantly becomes a quality issue. It might be due to some intentional design change that the developers are making without first telling Test, but just as often it’s a significant quality issue that is worth filing as a bug – and in that case, the team just found a quality failure essentially instantly, so it can be fixed very quickly and downstream risk is minimized. If it’s your automation that found the bug, you’re a hero!
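One way to keep that distinction visible in the results is to label each failure by its cause. This is only a sketch under my own assumptions: the `classify` helper is hypothetical, and the two built-in exception types stand in for whatever "dependency failures, timeouts or race conditions" look like in a real suite.

```python
from typing import Callable

# Assumption: these built-in exception types represent environmental trouble.
# A real suite would choose its own list.
ENVIRONMENT_ERRORS = (TimeoutError, ConnectionError)


def classify(check: Callable[[], None]) -> str:
    """Run one API-level check and label the outcome."""
    try:
        check()
    except ENVIRONMENT_ERRORS:
        return "environment"  # fix the lab or rerun; says nothing about the product
    except AssertionError:
        return "product"      # business behavior changed: investigate right away
    return "pass"
```

Any other exception is deliberately left to propagate, so an unexpected crash is never quietly filed under either label.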

Here’s another reason to focus on the business logic:

I’ve heard it said that it’s better to automate the GUI, because then you’re automating the whole product. At a shallow level, this has the ring of truth, but consider this perspective instead: suppose your team focuses on automating business logic, and therefore gets much more quality information, quicker regression, better coverage etc. Then, a GUI bug is found, and this bug is found later in the software development life cycle (SDLC) than otherwise, but no worries: the risk of deferring a GUI bug fix is very low, because if the application is at all well-designed, none of the business logic depends on the GUI flow.

Manual test will never go away, and people are smart and able to spot all sorts of GUI issues that automation can’t spot without a huge investment in automated GUI checks. Therefore, the GUI bugs are likely to be found by manual testers anyway, and they’re still relatively low risk because the rest of the product doesn’t depend on the GUI.

This is why I’m happy to focus on business logic automation.

Wednesday, October 12, 2011

No, you can't outsource quality (detour from antipatterns topic)

Due to illness and travel and the desire to put more attention into this, I'm not ready to continue the series of posts on antipatterns at the moment.

Twitter (140 characters?) and my available hardware didn't allow posting at the time, plus I was paying attention and not multi-tasking so here's my discussion two days after the fact.

It was satisfying to skewer Julian Harty in the auditorium this morning, though, if a little bit scary (... do people really believe what he's talking about?).

Harty's theme was "The Death of Testing." To be fair, I think the title and theme may have been influenced by simple business considerations of the PNSQC conference at which this took place, and they're trying to attract people who do software quality professionally to the PNSQC conference by scaring them into fearing for their jobs. If so, it worked, and attendance was high.

I want to give due to Harty's presentation skills; he's very good at engaging the audience.

The main thesis of his talk seemed sincere. He was talking about Google practices, and honestly qualified his comments by pointing out that he left Google in June of last year. (hmm, wonder how that happened...)

The idea is that "testing" in the broad sense of measuring and monitoring the overall quality of the product can be outsourced for free. Google does this with the "Give us feedback" functionality on their sites. The idea is that each of the many, many end-users of Google's products has the opportunity to tell somebody on the appropriate internal team that there's some problem, and to communicate with some individual at Google about the process of fixing it.

This works rarely, but often enough given that there are so many Google users.

Harty's thesis: this is free for Google, the quality is better because there are more eyeballs, and Google appears to respect customers and strengthen loyalty. Google has successfully outsourced quality.

... Yeah?  Copious steaming bovine excrement.

If I find a good bug this way, and go through the Google-prescribed process of getting it fixed, I could receive a cash prize of a few grand (according to Harty).

Now, suppose this is a security flaw. (There will always, always be security flaws, known or unknown.) Suppose this involves personally identifiable information (PII) i.e. most of Google functionality. Suppose I'm the first to find and characterize it. Suppose it's exploitable, e.g. I can use it to see the PII of anybody I want. Suppose I'm not the most ethical person...

I have a choice: do I report it to Google as they would like me to do, and chance getting a few grand as a reward? Or, do I report it to blackhats, and try to get a few million dollars?

Of course I'd go to the blackhats! When I do this, all users of Google are exposed to the risk of identity theft. Identity theft is the worst thing that can happen to you on the internet.

Meanwhile, Google thinks that it has successfully outsourced product quality! Great deal, huh? The stockholders love it. Conference speakers talking about the latest trends LOVE it. But the end result is identity theft for large numbers of Google customers.

Outsourcing quality can't possibly work for a company in the long run.

Testing is not dead.

Monday, October 3, 2011

Patterns and antipatterns in automation (part 2 of 3-4 parts)

For a product with significant impact, a pattern is writing test automation in the same language as the product (for Java) or the same framework as the product (for .Net). An antipattern is scripting in some other, lighter-weight language.

I know many perceive that a scripting language, e.g. Python, Perl or JavaScript, is better suited to writing test automation because it may be quicker. But, with a strongly-typed compiled language, the compiler is your friend, finding all sorts of errors early. With a decent IDE, IntelliSense is your friend as well, hinting at what you can do and backing that up with the compiler.

If test automation is written in a different language than the product, then testers are distanced from the product code enough that they don’t have anything to say about it; it’s not their domain. But, product code is usually a good example to follow when writing good, robust, extensible test code. Today at work I entered two significant bugs in product code that I wouldn’t have been able to find if I weren’t continually keeping up with the languages used in the product (a grammar of XML, and C#).

Another reason for having the test code in the same language (or framework) as the product is that you know that no information is lost with remote calls or exceptions rolling up the stack: the stack is the same through product and test code, and the test code can handle product exceptions if that’s called for in the test.
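A small sketch of the information-loss point. It's in Python only for brevity, and every name in it is hypothetical: `InsufficientFunds` and `withdraw` stand in for product code, and `withdraw_via_script_bridge` stands in for any cross-language or remote layer that flattens errors to text.

```python
class InsufficientFunds(Exception):
    """Hypothetical product exception carrying structured detail."""

    def __init__(self, balance: int, requested: int) -> None:
        super().__init__(f"balance {balance} < requested {requested}")
        self.balance = balance
        self.requested = requested


def withdraw(balance: int, requested: int) -> int:
    """Hypothetical product code."""
    if requested > balance:
        raise InsufficientFunds(balance, requested)
    return balance - requested


def withdraw_via_script_bridge(balance: int, requested: int) -> str:
    """Stand-in for a bridge that can only hand back text to the caller."""
    try:
        return str(withdraw(balance, requested))
    except Exception as exc:
        return f"ERROR: {exc}"
```

Test code that calls `withdraw` in-process can catch `InsufficientFunds` and verify `balance` and `requested` as typed values. Test code on the far side of the bridge only gets a string to parse, and the exception type and the stack are gone.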

The barrier to entry with C# is actually quite low; a dev can help, and testers don’t need to get fancy with lambda expressions, delegates and partial class declarations. And if a tester is going to create robust, extensible, OO code, a powerful language is needed anyway.

I’ve seen many problems with interfacing a script language or GUI test framework with a compiled OO language: awkwardness of error handling or verification, dropped information, limited design choices…

What do you think? Do you have a favorite language or framework for test automation, that’s different than the product language?

Topic to be continued tomorrow …

Patterns and antipatterns in automation (part 1 of 2)

Patterns are common ways of solving frequent problems in software.

From the introduction to the famous “Gang of Four” patterns book “Design Patterns” (Gamma, Helm, Johnson, Vlissides): Design Patterns make it easier to reuse successful designs and architectures.

In Test, patterns are used to follow well-established best practices. Antipatterns are patterns of action “ … that may be commonly used but is ineffective and/or counterproductive in practice”

For example, one pattern of action is to automate as many test cases as quickly as possible because that’s the performance metric that management is using, and we all want to look productive for the business. When the test cases start failing, ignore them and continue automating test cases, because that’s the metric that management is using.

This is actually an antipattern because it’s counterproductive practice. The business value of automating the product (or parts of the product) is to exercise and regress behavior (make sure it’s still working per design) for parts of the product on a regular schedule. If that can’t be done (because the automation broke) the value of those cases goes away or even worse: the quality team proceeds as if quality was being measured for that part of the product, when it’s not because those relatively high-priority test cases are failing.

There’s a closely-associated antipattern which I’ve addressed in practice for years but for which I credit my friend Adam Yuret (his blog is here) for help in crystallizing: test cases are mostly written as if they were manual test cases, and they have value for running through the test case as a human tester and observing the results with human perception and intelligence. If this manual test case is automated, the verifications (aka “checks”) are typically many fewer than what a human might perceive, and they might even be zero, i.e. nothing is getting verified and the value of running that automated test is also zero.

Adam maintains that what was a (manual) test case is no longer a test case because the entire flow of the user experience (UX) is no longer being measured; there are zero or more checks that are being measured, but they must explicitly be written, coded, and tested by the person doing the test automation. By default, they’re not.
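To make the "zero checks" case concrete, here is a minimal sketch of a helper that records each explicit verification and refuses to report a pass when nothing was verified. The `Checks` class and its method names are hypothetical, invented for this illustration, and it's in Python only for brevity.

```python
class Checks:
    """Hypothetical helper that records each explicit verification."""

    def __init__(self) -> None:
        self.count = 0
        self.failures: list[str] = []

    def check(self, condition: bool, label: str) -> None:
        """Record one verification; remember its label if it failed."""
        self.count += 1
        if not condition:
            self.failures.append(label)

    def verdict(self) -> str:
        if self.count == 0:
            return "no checks"  # the flow ran, but nothing was verified
        return "fail" if self.failures else "pass"
```

An automated flow that clicks through every step without a single call to `check` comes back as "no checks" rather than "pass", which is a more honest report of what was measured.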

I disagree with the “test case is no longer a test case” judgment, but this does point out an important truth about quality practice: the value of manual testing never goes away (unless maybe if the product has no graphical user-interface aka GUI).

The antipattern here shows up in two common and related team management practices: the idea that automating a GUI test case means that a certain part of the product never has to be visited manually ever again, or that the practice of manual testing from the GUI (and, the manual testers) can simply be automated away.

I’ll continue this post tomorrow with more patterns vs. antipatterns…

Saturday, October 1, 2011

Duplication of information is anathema to quality

I drive an electric car: a Nissan Leaf. It's fun, reliable, zippy, quiet and very comfortable, there's no stink and I never buy gas. I have few complaints, but here's one:

When I'm driving, I often want to know the time. I can look above the steering wheel, to see a digital clock, or at the navigation system, to see another digital clock. Unfortunately, these times are almost never the same! They disagree by a minute or more. So, what time is it?

The software development life cycle (SDLC) gets very complicated and involves many engineer-hours of work, plus other investments. There's a lot of information involved.

Yesterday I attended an excellent seminar by Jim Benson (personal blog here) about Personal Kanban (Jim has a book of that title). It was very enjoyable, but I was bothered by the reliance on sticky notes for designing and communicating information. Those sticky notes show up across all sorts of agile methodologies. I can see the advantages: it's very social to be collaborative with your team in a room with the sticky notes, and the muscle memory and physicality of moving the stickies around helps communicate and absorb information.

But, I asked (OK, pestered, but Jim was a good sport about it) several questions along these lines: do we really need the sticky notes? There's cost and risk in relying on people to manually translate information from plans or some virtual document to the stickies on the board, then after the meeting with the stickies or at some other time (depending on how the team does it) this has to be carried over to the IT workspaces or documents. There are many potential points of failure, and many potential points of information duplication.

The problem with duplication of information is the same as with the two clocks in my car: they can easily get out of sync, and then which do you believe? Information can get lost too, if I reorganize the stickies thinking that someone has already done the maintenance step of writing the information to the appropriate document, where actually that hasn't happened for some reason.

I predict that in a few years, the stickies will be gone because there will be a sufficient hardware and software solution to solve all of the problems that the stickies do, without the cost and risk. Team communication around work items will be more robust and efficient. There won't be any stickies to flutter to the floor (as a few did, during Jim's talk).

Worse than the clocks in my car, and even worse than the stickies, are superfluous docs that get out of sync. By "superfluous" I mean people ignore them, so they get out of sync and/or out of date, so they can cause confusion; for example, a doc that lists test cases that also are in a database. The test cases are probably going to change, and there's a very good chance that the doc will get out of sync, and there's also a good chance that someone will rely on the doc when it represents incorrect information.
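If a duplicate like that has to exist for a while, the drift is at least cheap to detect. A minimal sketch, assuming the test-case IDs from both copies have already been read into sets; the `find_drift` function and the "TC-n" ID format are made up for illustration.

```python
def find_drift(doc_cases: set[str], db_cases: set[str]) -> dict[str, set[str]]:
    """Report test-case IDs that appear in only one of the two copies."""
    return {
        "only_in_doc": doc_cases - db_cases,  # stale entries someone may still trust
        "only_in_db": db_cases - doc_cases,   # cases the doc never mentions
    }


doc = {"TC-1", "TC-2", "TC-3"}
database = {"TC-2", "TC-3", "TC-4"}
drift = find_drift(doc, database)
```

Here `drift["only_in_doc"]` is `{"TC-1"}` and `drift["only_in_db"]` is `{"TC-4"}`. A check like this only reports the disagreement; it can't tell you which copy is right, which is exactly the problem with the two clocks.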

Better to limit a document to information that doesn't exist elsewhere, and link docs to other docs and resources (databases, source code repositories) so it's clear where to get and where to update information.

Even worse than all of the above: duplicated logic in code, or duplicated code. See posts here and here.

Use team best practices when writing, and don't be afraid to clean up unused code! Duplicated logic can haunt the team and the product quality.

There are times when information duplication is by design and there's a robust system to keep it so, e.g. databases that are de-normalized or replicated or diagrams to visually represent a coded system... OK, so long as the diagrams are frequently used and updated by stakeholders on the team.

Beware the hazard of things getting out of sync. If the clocks disagree, they both look wrong!