Thursday, January 22, 2015

Of Planets and Test Automation


This coming July (year 2015), the New Horizons spacecraft will fly by Pluto. Along with that fantastic human achievement, the controversy-that-refuses-to-die will be rejuvenated and kicked around ad nauseam: Is Pluto a planet or not?

The irony here is that since the objective is pure science, as long as communication is served, the labels that people use for the object don’t matter at all. It’s no more than semantics and politics. We can call Pluto a planet, a minor planet, a dwarf planet, a Kuiper Belt object, or a great big sleeping comet, but it makes no difference. Planet or Kuiper Belt object or whatever, Pluto is Pluto.

In the world of software quality, there’s a similar controversy around automated tests: aren’t they just the same thing as manual tests? If “manual test” and “automated test” are the same thing, wouldn’t it be more efficient and correct to just call them both “tests”?

No, automated tests and manual tests are not the same. The objective of a test is pure information, or more precisely, technical value in quality measurement. The distinction between “manual” and “automated” is real and worth preserving, because the two kinds of test deliver very different value to the software project.


Generally, manual tests have these values:

·         Whether scripted or exploratory, they benefit from human intelligence and powers of observation

·         People can be flexible when running them, to be more efficient

·         The test results benefit from tester smarts

And, these downsides:

·         They can get repetitive or boring

·         Humans make mistakes

·         Humans get tired, have to sleep, must do something else sometimes

·         Test repeatability isn’t always good, due to the above factors

Automated tests have these values:

·         Can be fast or extremely fast

·         If well-written, they’re very repeatable

·         If well-designed, they’re very scalable

·         Geography and time of day don’t matter anymore

·         Automatons don’t tire or get bored

·         Automatons are great for processing large amounts of information efficiently and accurately

And, these downsides:

·         Automatons are idiots, and depending on how the automation is written, they can make huge, immensely stupid mistakes repeatedly

·         An automated test will miss major issues that are both important and obvious to a tester

·         Human tester/programmers have to tell the automaton exactly what to do, but if given a chance, the automaton will mess up anyway

·         Poorly written automation can fail and drop important root cause information bits on the floor, which requires a tester/developer person to follow up and figure out what happened

To create a scripted test means writing out the steps. Almost all of them are run manually at some point, but note that by “manual” I mean that it’s a human who determines the initial result of the test as pass, blocked, or fail, with additional information about the blockage or failure. It’s a manual test even if the tester uses a web browser to call a REST API with XML over HTTP, because the tester completes the test with a result.

At some point, the steps of the script may be automated, so it becomes an automated test; it’s an automated process, not a tester, which determines the initial test result at that point.

Given the above differences between automation and manual tests, is the test the same now that it’s automated? Faster and cheaper, maybe, but the same in value? Not at all.

This is why I use the word “check” to describe the automated test: the difference is important, and not understanding the difference could introduce significant business risk. (More on the “risk” below.) Calling it a “check” rather than a “test” is much less of an invitation to the business owners to introduce poorly understood risk. A check is still a kind of test, though, with well-defined and well-understood verification(s). It’s just like driving around at night when the passenger says to the driver, “Hey, check that the lights are on.”

The risk of mistaking an automated procedure with defined steps for the same steps done by a manual tester is that the tester can see so much more. Testers are smart and observant; automatons are not. To automate the test, and then stop running it manually, is to make the team blind to things it used to be able to see.

For example, here’s a manual test for a credit union web site:

1.       Go to credit union site

2.       Log in

3.       Go to checking page

4.       Check balance

5.       Go to the “Account Transfer” page

6.       Select $100

7.       Select “From” as “Savings”

8.       Select “To” as “Checking”

9.       Click “Transfer Funds” button

10.   Confirm

11.   Go to checking page

12.   Verify that the transfer happened

13.   Verify correct balances reflected

14.   Logout

 

A tester running this test notices “hey, the nav bar looks out of whack…”, takes and annotates a screenshot, and enters a bug.


The bug is fixed and regressed by the tester with a note “looks good now.” The above test is also automated, but the navigation bar is not the top priority, and one can’t automate a check for “looks good,” so that part never gets automated.

Assuming that the automation is written and handled well, it regresses the functionality of the steps and runs the verifications often. But the presentation of the navigation bar isn’t verified any more. The business owners think that the web site and transfer functionality are regressed entirely, but that’s not the case; the automation runs, and the verifications in steps 12 and 13 happen, but whether the page is readable to a human is simply not tested when the automation runs.
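To make the gap concrete, here is a rough sketch of what the automated version might look like, assuming a Selenium WebDriver harness with NUnit; the URL, element ids, and credentials are hypothetical. Notice that only the verifications of steps 12 and 13 become coded assertions.

using System;
using NUnit.Framework;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

[TestFixture]
public class TransferFundsCheck
{
    [Test]
    public void Transfer100FromSavingsToChecking()
    {
        using IWebDriver driver = new ChromeDriver();
        driver.Navigate().GoToUrl("https://example-creditunion.test"); // hypothetical URL

        // Steps 1-4: log in and note the checking balance (element ids are made up)
        driver.FindElement(By.Id("username")).SendKeys("testuser");
        driver.FindElement(By.Id("password")).SendKeys("not-a-real-password");
        driver.FindElement(By.Id("loginButton")).Click();
        driver.FindElement(By.LinkText("Checking")).Click();
        decimal checkingBefore = decimal.Parse(
            driver.FindElement(By.Id("checkingBalance")).Text.TrimStart('$')); // formatting details elided

        // Steps 5-10: transfer $100 from Savings to Checking
        driver.FindElement(By.LinkText("Account Transfer")).Click();
        driver.FindElement(By.Id("amount")).SendKeys("100");
        // ... select "From" = Savings and "To" = Checking, click "Transfer Funds", confirm ...

        // Steps 11-13: the only verifications this check will ever make
        driver.FindElement(By.LinkText("Checking")).Click();
        decimal checkingAfter = decimal.Parse(
            driver.FindElement(By.Id("checkingBalance")).Text.TrimStart('$'));
        Assert.AreEqual(checkingBefore + 100m, checkingAfter, "Checking balance after transfer");
        // Nothing here looks at the navigation bar, the layout, or anything else a tester would notice.
    }
}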


That discrepancy creates the risk that comes with misunderstanding what automation does for you, which in turn stems from the common mistake of believing that automated tests and manual tests are the same thing.


Clarity is important here: automated tests and manual tests are so different, they need to be considered differently to avoid the business risk described above.


As for Pluto, whether it’s a planet or not isn’t important. But I sure am looking forward to what New Horizons can tell us about it.

 

 

Monday, January 5, 2015

Why a book?

Why is MetaAutomation a book? Usually, people blog their ideas, or travel and consult for hire.

Here's why it's a book, not a set of tweets or blog posts: the book format encourages people to invest some attention in understanding it and finding the value, because there is some emotional and intellectual investment involved. The potential value is huge, ground-breaking, and radically innovative, but it needs some work up front to understand it.

I published MetaAutomation on Amazon.

See my last post for a summary of the pattern language; other posts cover pieces of it that stand on their own as post-sized ideas.

For more information, search for MetaAutomation on Amazon, and use the "Look inside" feature. A lot of my book is viewable ...

Feedback to the author is invited, and is possible through the web site that supports the book's working sample implementation of the Atomic Check pattern.
 

Wednesday, December 31, 2014

Here is part of the back cover of my book. Click for a more detailed image.



Thursday, December 18, 2014

MetaAutomation is now in print!

The book is on Amazon, here:

http://www.amazon.com/MetaAutomation-Accelerating-Automation-Communication-Actionable/dp/0986270407/ref=sr_1_1

This book presents the pattern language MetaAutomation, an audaciously innovative framework of tools and perspectives to run automation faster and more effectively, and to greatly accelerate quality value around the team.

IMO this book will get the term "MetaAutomation" onto resumes in a few years, as the next step for software quality power and sophistication.
 

Tuesday, October 7, 2014

Moving forward with automation technology, using the word "Check"





Bach and Bolton (their blog post, “Testing and Checking Refined”) describe the idea that the word “check” is a useful label for some types of activities and procedures commonly called “test.” The issue is that when measuring product quality, it is important to differentiate between what test professionals typically do as they exercise the product, and the value of automated measurements of product behavior with Boolean pass/fail results.

In literature about testing, the word “check” often serves as a synonym for “assert” or “verify.” This book proposes a closely related use of the word, similar to what Bach and Bolton described, and an important one to the nature of the first pattern of MetaAutomation, Atomic Check.

 


Many practitioners believe that automation for software quality starts with manual test cases. Manual test cases are designed to be executed by a person, and often they are, with useful quality measurement results.

When the manual test case is automated, the conventional wisdom goes, the quality measurement value of that manual test is multiplied many times, because the test can now run faster, more reliably, and offline or after hours. In practice, “faster” is true most often, but not always, and “more reliably” is only sometimes true; but running the tests offline and repeatedly is in any case a very significant business value for catching regressions and managing product risk.

On execution by a person, the original manual test had some quality value to the product team. Is that value now covered by the automated version, so the manual test never needs to be run again? No, generally, this is not the case; people are smart and observant, and test professionals can note and characterize (write a bug for) quality issues that automation will not notice, especially for apps with a GUI or web sites. On execution by a test professional, and according to expectations of people performing that role, the original manual test measures much more than just the written test procedure and verifications. People notice stuff, especially good testers. Automation only notices what is required for the automation to run, plus explicitly coded verifications or assertions.

If the team automates a manual test, and thereafter no person ever runs the original test, along with the well-understood gains of automation comes a significant but poorly understood loss in team capability of measuring quality. The person who might have taken time to run the manual test now has more time to add value in other ways, but all of the human-observable aspects of product quality that are implied by or incidental to the manual test are now going unmeasured. The automation can add verifications, but speaking from extensive experience in automating for quality, coded verifications of product quality that do not break the automation are very limited in number relative to the range of issues a tester can notice.

Automating manual tests, combined with the above misunderstanding, creates significant quality risk.

An example


Here is a manual test for a hypothetical team working on a banking web site. Call these steps “transfer balances:”

1.       Browse to the bank site

2.       Login as test user

3.       Note checking balance

4.       Note savings balance

5.       Go to transfer page

6.       Transfer $5 from savings into checking

7.       Verify correct checking balance

8.       Verify correct savings balance

9.       Logout

When a person runs this test, in addition to the explicit steps and verifications, she might notice and bug a huge number of potential issues, for example:

·         The browser shows a security warning about the site certificate

·         The balance in savings has gone negative

·         The ad on the page messes up page rendering

·         An incorrect name is shown for the greeting

·         The protocol serving the page does not have SSL

·         There is a spurious alert message

·         …

Later, she automates this test, with the verifications as written. Management is happy and she is happy because, they think, she never has to run the manual test again.

It is true that the importance of running this manual test is reduced when it is automated, because what the team decided are the most important business issues are now verified automatically. However, the need for a team member to do these or closely related procedures manually does not actually go away.

The common misunderstanding is that after the manual test is automated, there is no need to run it manually or even test around that scenario. Call this case “A.” On the other hand, we have case “B,” the understanding that there must still be some manual testing around this, because the automation is very limited in ability to measure the details of a quality experience for the product.

The judgments represented by cases A and B cause different actions from the team around measuring quality, after the “transfer balances” steps from the example test have been automated. The actions and inactions in case A cause a gap in quality measurement, because the bulleted items may no longer be measured. The longer such issues stay broken, the more potential there is for downstream issues, and the difficulty and cost of finding and correcting the root cause of a quality regression increase rapidly with the time lapse between the issue entering the product code and its discovery by the team.

That all adds up to create the risk of case A, relative to case B. Case A can potentially result in such issues as:

·         Basic product issues discovered late in the product cycle

·         Issues shipped to customers, so the customers find serious issues before the team knows about them

From a project perspective, an important root cause of the management error of case A is that “transfer balances” starts as a manual test, and when it is automated, it becomes an “automated test.” It still looks like the same test to teams afflicted by the misunderstanding of case A, except that it now runs repeatedly and with lower personnel cost. The word “test” still applies, and that trips the team up.

 


Words are labels, and choice of labels is easily dismissed as “just semantics,” but words have connotations as well as denotations.

A manual test is a test. If that test is automated, the result is still a type of test, but to highlight the change that automation brings, this book uses and recommends the noun “check” for that purpose.

Checks do not have nearly the powers of observation that a person does. Any verification that a check does is either implicit, meaning it comes with the procedure code anyway, or explicit, meaning it requires an explicitly coded verification.
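A small C# illustration of that difference, assuming a Selenium WebDriver and NUnit harness; the element id and balance format are hypothetical:

using NUnit.Framework;
using OpenQA.Selenium;

public static class TransferVerifications
{
    // A sketch only; the element id and the balance format are hypothetical.
    public static void VerifyTransfer(IWebDriver driver, decimal expectedCheckingBalance)
    {
        // Implicit verification: if the balance element is missing, FindElement throws,
        // the check fails, and no assertion had to be written for that.
        IWebElement balanceLabel = driver.FindElement(By.Id("checkingBalance"));

        // Explicit verification: the balance is only measured because this assertion is coded.
        decimal actual = decimal.Parse(balanceLabel.Text.TrimStart('$'));
        Assert.AreEqual(expectedCheckingBalance, actual, "Checking balance after transfer");
    }
}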

Use the term “check” every time a measurement is made of the product where

1.       It is an end-to-end test

2.       The measurement procedure runs without human presence or intervention

3.       The measurement procedure completes with a pass/fail result

The word “check” works as a noun, for example: “A check is a better label to use than automated test.” It also works as a verb, for example: “Execute this set of automation to check that the end-to-end product behavior is still correct.”

This terminology clarifies what automated testing does, but even better, it avoids eclipsing manual testing around the functionality.

Promoting the term “check” instead of “automated test” emphasizes the limitation of automation, and makes it clear that, especially when working with a web site or other GUI, some manual testing still needs to be done. Checks make the manual testing easier and less tedious by removing the need to check the important business-logic behaviors of the product, but they do not remove the need to run the manual tests or to do exploratory testing around the tested feature.

“Check” is still a kind of test, but think of it like cheese that originated in the Brie region of France; one can call it “cheese” and be correct, but the much preferred and more efficient term is “Brie” and people sound more discerning, educated and domain-aware when they use the latter term.

Readers may be wondering at this point: How about unit tests? Should we call them “unit checks” now?

This book uses “unit test.” Unlike automated end-to-end tests, which were originally written as manual tests and then automated, unit tests are never manual in origin, so the risk described above does not apply.

This book also uses “check” for an API or service test, as long as all dependencies are in place. This is useful for techniques such as bottom-up testing, for which the Atomic Check pattern is especially powerful.

Another advantage of “check” is that it makes it easier to see that the best checks are designed and grouped differently than manual tests. There is much more on this point in the book “MetaAutomation,” in Chapter 3 on the pattern Atomic Check.

 

Monday, July 14, 2014

Time to retire the phrase "Automated Testing" and use "Checking" instead


UPDATE to this post, April 18th, 2015: for purposes of automated testing, I'd like to define "check" as a specialization or subclass of "test" where the verifications are limited to those specifically coded or otherwise determined in advance. The usefulness of "check" is limited to what people also call "automated test," and its business justification is that it avoids confusion and risk: it avoids the mistaken belief that automating a manual test involving a GUI or web page obviates the need for a human to run the manual test, and it avoids the quality and business risk that would result from losing the corresponding measurements of product quality.

***

It’s time to retire the phrase “automated testing.”

Given that software testing is about measuring, communicating and promoting quality, leadership often sees automation (that is, making a software product do things automatically) as a way of doing all of the above faster. Unfortunately, it does not work that way.

People are smart, but computers and computing power are not smart. People running user stories or test cases or doing exploratory testing are very good at finding large numbers of bugs, within the limits of attention, getting tired or bored, etc. People are great at spotting things that are not as they should be, e.g., a flicker in an icon over here or a misalignment of a table over there, or a problem of discoverability.

When automated product testing is done well, it has huge value: it is excellent at regressing product quality issues quickly, repeatedly, and tirelessly. Automation does not get tired or bored. Computers are very good at processing numbers and repeating procedures, and doing them fast and reliably, and at e.g. 3AM local time when your people are home sleeping.

However, automation is not good at finding product bugs or anomalous issues like a flickering icon. You need good human testers for that.

Instead of “automated testing,” it’s time to use a term proposed by James Bach and Michael Bolton (see their post http://www.satisfice.com/blog/archives/856 ) to define automation that drives tests: Checking. A single automated procedure that measures a defined aspect of quality for the SUT is a “check.” The term “check” applies where a professional in the space would more commonly say “automated test,” but since testing is an intelligent activity done by humans, the term “automated test” becomes an oxymoron; once a test is automated, it is no longer a test in the same sense. If done well, it is fast, reliable, tireless and highly repeatable, but the value is very different from the same procedure run by a testing professional.

A skilled and experienced tester running a manual test can discover, characterize and describe as a bug any of a broad range of issues. The range of potential issues found has few limits, and is driven by the intelligence, creativity and observational skill of the tester. However, to take that manual test, automate it, and then run it offline (without human observation or intervention) or in the lab, severely restricts the discoverable range of issues. Automated tests are capable of flagging issues that block the procedure of the test, and issues that are the topic of explicit verifications or metrics coded into the test or automated test harness, but they do not measure anything else about the product. The automated test will usually run faster than the manual test, and a well-written automated test will run more reliably and repetitively than the manual test, but it does not replace the manual test. If an automated test taken from a manual test runs and passes a thousand times, running the manual version of the test once could still find important issues.

The team therefore still needs the manual test, and in the context of measuring product quality, there is benefit to tracking the manual test and the manual test results separately from the automated test and the automation results. The manual and the automated version of the test both have their values for quality, and one does not replace the other.

It’s time for the industry to use “check” because this term emphasizes that automating a test is not the same as running the test faster and does not obviate the manual version of the test.  The need for manual testers will always be there for the team; however, a well-designed and frequently-run set of checks can make manual testing faster, more effective, and more fun because it makes for less manual repetition of measurements that are verified by automation and more exploration around the manual test.

In addition, the best-written manual tests are significantly different from ideal checks. Manual (or quasi-manual) tests tend to focus on scenarios or mini-scenarios, because that is the natural usage for end-users, and it gives testers the most opportunities to find issues and characterize them as bugs to be considered by the team. Checks focus on specific verifications, and ideally are as short as possible.

“Check” means that the verifications are strictly limited to what is specified in advance, whether by the coded-in verifications, by verifications for the test group (if that is implemented), or by the test harness as a whole. In the context of automated testing, this specification might also be written out in prose, but it is always expressed in the code. This works very well for an automated test, because it is important to be completely consistent over test runs with what is and is not verified about the SUT.

 

This post is based on an excerpt from Matt Griscom’s forthcoming book, MetaAutomation.

Thursday, May 8, 2014

Who Writes the Automation?


Consider two types of automation: unit tests, and end-to-end (E2E) tests.

Current practice is to have the developers themselves write unit tests, especially when doing test-driven development (TDD). Unit tests are fast and lightweight and generally built into the product build process, so failures either happen on the developer’s workstation before new code is shared with the team, or as part of an integration build or other build which happens as part of the dev team flow and prior even to deployment. This tight cycle reinforces the value of developers writing their own unit tests. The risk is when the unit test depends on the implementation of the unit; this might block the refactoring capability that unit tests should offer, but also calls into question the value of the unit test itself; it might be measuring something other than product value. A perspective from somebody other than the owner of the product unit is therefore helpful as a way of limiting risk.
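A small illustration of that risk (a sketch with hypothetical names, using NUnit and the Moq mocking library): the first test is coupled to how the unit does its work and will break on a harmless refactoring; the second verifies only the observable behavior.

using Moq;
using NUnit.Framework;

public interface IRoundingPolicy { decimal Round(decimal value); }

// Hypothetical unit under test
public class TransferCalculator
{
    private readonly IRoundingPolicy rounding;
    public TransferCalculator(IRoundingPolicy rounding) { this.rounding = rounding; }
    public decimal NewBalance(decimal balance, decimal amount) => rounding.Round(balance + amount);
}

[TestFixture]
public class TransferCalculatorTests
{
    // Coupled to the implementation: verifies *how* the unit gets its answer,
    // so a harmless refactoring (rounding in a different place) breaks the test.
    [Test]
    public void NewBalance_CallsRoundingExactlyOnce()
    {
        var rounding = new Mock<IRoundingPolicy>();
        rounding.Setup(r => r.Round(It.IsAny<decimal>())).Returns<decimal>(v => v);
        new TransferCalculator(rounding.Object).NewBalance(100m, 5m);
        rounding.Verify(r => r.Round(It.IsAny<decimal>()), Times.Once());
    }

    // Behavioral: verifies only the observable result, which is what product value depends on.
    [Test]
    public void NewBalance_AddsTheAmount()
    {
        var rounding = new Mock<IRoundingPolicy>();
        rounding.Setup(r => r.Round(It.IsAny<decimal>())).Returns<decimal>(v => v);
        decimal result = new TransferCalculator(rounding.Object).NewBalance(100m, 5m);
        Assert.AreEqual(105m, result);
    }
}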

A current trend is having the developers themselves write E2E test automation as well. The developers are more likely to have the software development chops needed, and their deep product knowledge might speed things up.

Testing and test automation done well is a more challenging and open-ended responsibility than pure software development, because there are more unknowns, abstractions, and dependencies to consider. The developer focuses on creating and shipping a good product, which is hugely difficult and often takes immense training as well; but measuring the quality of the product is a different task, and sometimes it even conflicts with the developers’ focus.

For example, if someone in a test role files a bug on performance of the product, the bug might prompt an action item for refactoring work and hence delay for the developer role in meeting goals. Such a bug is a good thing for product quality generally because consideration of the bug by the larger team is one of the steps towards shipping a good product as balanced with business needs. However, that step might never happen if the person who might initiate it is the same person who might be conflicted by it; it is simply human nature to overlook it in that case, even assuming the best of intentions.

If the test team is able to create and maintain the E2E automation and meet the other requirements as described by MetaAutomation, then better that they do it and give the developers more time and focus to create a great product.

Wednesday, March 19, 2014

MetaAutomation: The Abstract for PNSQC (October, 2014)


Regression testing automation provides an important measure of product quality and can keep the quality moving forward during the SDLC. Unfortunately, automation can take a long time to run, and automation failures generally must be debugged and triaged by the test automation team before any action item can be considered or communicated to the broader team. The resulting time lag and uncertainty greatly reduce the value of the automation, and add cost and risk.

MetaAutomation is a language of five patterns that gives guidance to new and existing automation efforts: to provide fast and reliable regression of correct business behavior for a software solution, and to speed quality communication around the team, while reducing latency and human cost.

The five patterns (Atomic Check, User Pool, Parallel Run, Smart Retry and Automated Triage) are in a sequence, representing an order in which the patterns can be applied, and also form a network of dependencies between the patterns.

For an existing automation project, the Atomic Check pattern can be applied in whole or in part to run the automation faster (e.g. with shorter and better-defined tests) and to create results that are more actionable (e.g. with asynchronous and/or inline test setup, hierarchical steps defined at runtime, explicit verifications, custom exceptions, etc.). If enough of Atomic Check is adhered to, the dependent patterns can then be applied to further speed, direct and enhance the value of communications resulting from the automation.

The patterns are language-independent. A platform-independent sample implementation of the Atomic Check pattern will be demonstrated in C#.
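For a sense of the flavor of "hierarchical steps defined at runtime" and "custom exceptions," here is a minimal C# sketch; the names are hypothetical, not the sample implementation from the book.

using System;

// A custom exception type that keeps the name of the failed step attached to the root cause.
public class CheckStepException : Exception
{
    public CheckStepException(string stepName, Exception inner)
        : base("Check failed at step: " + stepName, inner) { }
}

public static class Step
{
    // Steps can nest, so the recorded structure is hierarchical and is defined at runtime.
    public static void Run(string name, Action body)
    {
        Console.WriteLine("BEGIN " + name);
        try
        {
            body();
        }
        catch (Exception ex)
        {
            // Wrapping at every level records the full hierarchical path down to the root cause.
            throw new CheckStepException(name, ex);
        }
        Console.WriteLine("END   " + name);
    }
}

// Usage inside a check:
// Step.Run("Transfer $5 from savings into checking", () =>
// {
//     Step.Run("Open the transfer page", () => { /* ... */ });
//     Step.Run("Submit the transfer",    () => { /* ... */ });
// });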

Thursday, November 14, 2013

MetaAutomation: The Book

MetaAutomation is moving from a blog to a book.

I've been writing, pulling ideas together, and synthesizing, to create a package that represents a stronger value proposition to people concerned with automated software testing that includes any or all of these:
  • regression tests
  • functional tests
  • all tests that are, or are intended to become, fully repeatable
  • positive and negative tests
The package (and the book) is for tests that include functional dependencies (internal and external services etc.) and generally don't have fakes or shims. So, it does NOT address any of:

  • security tests
  • performance tests
  • stress tests
  • code metrics
  • model-based testing
  • fault injection
  • accessibility
  • discoverability
  • suitability or validation
The book doesn't address every topic of this blog, but it does create a valuable big-picture synthesis that isn't possible in the blog format.




 

Tuesday, September 24, 2013

Smart Retry Your Automated Tests for Quality Value


If you automate a graphical user interface (GUI) or a web browser, you’re very familiar with this problem: there are many sporadic, one-off failures in the tests. Race conditions that are tricky or impossible to synchronize, and failures from factors beyond your control or ownership, break your tests, and too often the solution is to run the test again and see if it passes the second time.

The result is dissonance and distraction for whoever’s running the tests: there’s another test failure. Does it matter? Do I just have to try it again? I’ll try it again, and hope the failure goes away.

Imagine transitioning your job from one where most issues that come to your attention are not actionable (e.g. “just ignore it, or try it again and hope the issue goes away”), to one where most issues that come to your attention are actionable. That sure would help your productivity, wouldn’t it?

I wrote about this topic here in some detail:


Now’s a good time for your organization to bring it up again. Smart Retry is an aspect of 2nd-order MetaAutomation:


Smart Retry is very valuable for your productivity and communication around the organization, but if you want to get there, you need two things, each of which has significant value in itself:


2.       Tests that fail fast with good reporting http://metaautomation.blogspot.com/2011/09/fail-soon-fail-fast.html

And:

3.       A process with some programmability to run your tests for you and make decisions based on the results

On item 3: If you are running your tests in parallel on different machines or virtual machines or in the cloud, you will have this already; if you don’t have it yet, you will, because the business value makes it inevitable.

For a distributed system, you will also need a non-trivial solution for this:

4.       A service that provides users for given roles from a user pool, for time-bound use with an automated test
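On item 4, a hypothetical shape for such a service in C# (a sketch only; the real contract depends on your system and roles):

using System;

// Leases a test user in a given role for a bounded time, so that concurrent checks don't collide.
public interface IUserPool
{
    UserLease Acquire(string role, TimeSpan leaseDuration);
    void Release(UserLease lease);
}

public sealed class UserLease
{
    public string UserName { get; set; }
    public string Password { get; set; }
    public string Role { get; set; }
    public DateTimeOffset ExpiresAt { get; set; }
}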

A Smart Retry system is an automated solution that substitutes for a big piece of human judgment: whether to just run the test again, or to take a significant action item on it. It adds a lot of business value in itself, and it also complements other systems that scale and strengthen the Quality story of your organization.
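To make that judgment concrete, here is a minimal C# sketch of the retry decision, assuming each check result already carries a failure category; how failures get categorized is project-specific, and is the hard part:

public enum FailureKind { None, ProductDefect, EnvironmentOrTiming }

public static class SmartRetry
{
    // Retry only transient environment/timing failures, and only a bounded number of times;
    // a product defect is actionable and must never be hidden behind a retry that happens to pass.
    public static bool ShouldRetry(FailureKind kind, int attemptsSoFar, int maxAttempts = 2)
    {
        if (kind == FailureKind.ProductDefect) return false;
        return kind == FailureKind.EnvironmentOrTiming && attemptsSoFar < maxAttempts;
    }
}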