Please look to The MetaAutomation Web Site for more current information and the new blog!
MetaAutomation
MetaAutomation starts with making automation failures actionable, maximizing the value of automation results, and continues by automating triage. MetaAutomation reduces the cost of fixing existing automation and ensures that automation helps your quality measurements and improvements, rather than hindering them.
Thursday, June 23, 2016
’Test Automation’ is a Historical Accident and an Oxymoron. Time to Move On to Something Better!
The phrase “Test Automation” is a historical accident, and
not a victimless one either.
“Automation” started around the 1950s with industrial automation: building stuff faster, better, and cheaper. It continues to grow today, with more industrial robots every year.
DevOps is growing too, with automation moving software through the development process to ship faster, better, and cheaper. DevOps is true to the meaning of “automation,” because the focus is on the end product: shipped software.
“Test” is the traditional word for measuring software quality and finding issues (bugs). When people started driving the system under test (SUT) automatically to help with this, “test automation” was an obvious way to describe it, but it was and is a poor fit: unlike industrial automation or DevOps, the end product here has zero value, because the product is an SUT (or, at least, it is being driven with fake test users and fake test data); nobody cares about that output. Instead of the end product of the SUT, people focused on the pass-or-fail result of running a bit of “test automation.”
“Test automation” encourages the perception that what people in the manual testing role do, which is work that everyone involved in software development does at least part of the time, can be automated away. Wrong! People are smart, observant, flexible, and perceptive. Automated measurements of software quality can be very powerful for the business (as I describe below), but they can’t do what people can do.
It got worse with another historical “oops”: in 1979, Glenford Myers wrote what turned out to be a highly influential book on software “test,” and in this book he insisted, repeatedly and emphatically, that the whole point of “test” is to find “bugs.” This reinforced the perception that for “test automation,” if it passed, nothing else matters; it didn’t find a bug, so we don’t care about any other details. Oops… although in 1979 this was a fairly good approximation, the conceptual mistake remained, and it has only grown more significant over the 37 years since.
In 2016, we have software doing critically important things, from online banking through web portals to self-driving cars and passenger airplanes. Software flaws could ruin a person financially, or kill her, and the magnitude of software’s impact on our lives grows every year.
We can no longer afford to be distracted by the misleading phrase “test automation” or by the idea that test is only about finding bugs. We must pay attention to how we drive the product and how it responds. We must know immediately, and in detail, about flaws, and even about when some part of the product cannot be measured due to some other failure. We can’t afford to wait for some person in quality assurance (QA) to debug through a problem, especially not at night or across a geographically dispersed team. We can’t afford to keep dropping important, actionable quality information, functional and performance alike, on the floor and hoping that, if it was important and actionable, somebody on the team might dig it out later.
For software that matters, we must record that information, store it, and act on it if appropriate, with automation and in real time. We must also make it relevant and queryable for anybody on the team who cares to look for information on functional and performance quality, including near-term events and long-term trends.
Replace “test automation” with quality automation. “Test automation” only really works for the QA team, and not very well at that. Quality automation avoids the misconceptions of “test automation” and, by focusing on what automation does and does well, delivers transparency and business value across the larger software team.
Log statements in the automation do add some value, but any structure in the actions or in the log statements is lost: the result is a list of loosely formatted statements that loses information, is not very queryable, and is not friendly to automated processes after the measurements are completed.
BDD and Gherkin offer a way of logging business-facing steps with check runs, but this requires a tool to be installed and distributed, and interpreted keywords to be implemented. Information on technology-facing steps is lost. Keyword implementations tend to drift when reuse is required (I know; been there and done that), and that information is lost too, buried in the relative obscurity of the keyword implementation source code and never exposed outside QA.
MetaAutomation shows a better way, starting with the Atomic
Check pattern: don’t interpret or implement keywords; instead, drive the
product with self-documenting steps. Put the steps in a natural hierarchy, like
top-down modelling in business process modelling, and have them document
themselves in this hierarchy.
Now, the business-facing steps and every technology-facing
step that drives the product is self-documenting, in compact and highly
queryable valid XML, in a hierarchy that reflects the process of driving the
SUT for every check. Every step records how many milliseconds it took to
complete that step, at every node in the hierarchy.
No interpreter needed! No keywords to implement, no Gherkin
language to adapt and learn! No 3rd-party tools to install, deploy,
or update!
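To make this concrete, here is a minimal sketch, in C#, of how hierarchical self-documenting steps might work. This is not code from the MetaAutomation samples; the names (SelfDocumentingStep, Step, Check) are illustrative only.

```csharp
using System;
using System.Diagnostics;
using System.Xml.Linq;

// Illustrative sketch only: a step wrapper that documents itself as XML,
// nesting child steps and recording elapsed milliseconds at every node.
public static class SelfDocumentingStep
{
    public static void Run(XElement parent, string name, Action<XElement> body)
    {
        var node = new XElement("Step", new XAttribute("name", name));
        parent.Add(node);
        var timer = Stopwatch.StartNew();
        try
        {
            body(node); // child steps append their own elements to 'node'
            node.SetAttributeValue("status", "pass");
        }
        catch (Exception ex)
        {
            node.SetAttributeValue("status", "fail");
            node.Add(new XElement("Failure", ex.Message));
            throw; // the check still fails; the artifact records where and why
        }
        finally
        {
            node.SetAttributeValue("milliseconds", timer.ElapsedMilliseconds);
        }
    }
}

public static class Example
{
    public static void Main()
    {
        // The check builds one compact, queryable XML artifact as it drives the SUT.
        var check = new XElement("Check", new XAttribute("name", "Sign in"));
        SelfDocumentingStep.Run(check, "Sign in", signIn =>
        {
            SelfDocumentingStep.Run(signIn, "Enter user name", _ => { /* drive the SUT */ });
            SelfDocumentingStep.Run(signIn, "Click Submit", _ => { /* drive the SUT */ });
        });
        Console.WriteLine(check); // business-facing and technology-facing steps, with timings
    }
}
```

In this sketch the business-facing step “Sign in” contains its technology-facing child steps, and every node carries a status and a duration, which is the shape of artifact the Atomic Check pattern calls for.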
This is just the start for quality automation, though: the detailed self-documentation of the checks supports the Smart Retry pattern. This is the answer to checks that can fail intermittently for a variety of reasons: now that a check result documents how the product was driven, in the context of driving the product (and, potentially, with additional instrumentation data placed in the structured data result of the check), the root cause is nicely recorded. An implementation of Smart Retry can now answer these questions, and take action, in real time:
· Was the failure due to an external dependency?
· Should I retry the check, or not?
· Did I just reproduce the failure?
All the data is recorded. Flaky tests are a thing of the
past; no need to mark tests as “untrustworthy,” no need to interrupt an
engineer’s workflow with an impromptu debug session.
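As an illustration only (the type and step names here are hypothetical, not from the MetaAutomation samples), a Smart Retry implementation might query the structured check artifact like this:

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical sketch: decide whether to retry a failed check by querying
// the self-documenting XML artifact the check produced.
public static class SmartRetry
{
    // Step names that indicate a failure in an external dependency rather than the SUT.
    private static readonly string[] ExternalSteps = { "Call payment gateway", "Fetch exchange rates" };

    public static bool ShouldRetry(XElement checkArtifact, XElement previousFailedStep)
    {
        var failedStep = checkArtifact
            .Descendants("Step")
            .FirstOrDefault(s => (string)s.Attribute("status") == "fail");

        if (failedStep == null)
            return false; // the check passed; nothing to retry

        // Was the failure due to an external dependency? Then a retry is cheap and informative.
        bool external = ExternalSteps.Contains((string)failedStep.Attribute("name"));

        // Did this run just reproduce an earlier failure? Then retrying adds nothing; report it.
        bool reproduced = previousFailedStep != null &&
            (string)previousFailedStep.Attribute("name") == (string)failedStep.Attribute("name");

        return external && !reproduced;
    }
}
```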
Atomic Check also ensures that the data is there so Automated Triage will work. Emails sent to long distribution lists to beg a group of busy engineers “will somebody please follow up on this? There might be a problem here” are a thing of the past. Email folders filled with what is commonly viewed as annoying spam are no more.
http://MetaAutomation.net has more information and two working
samples to illustrate:
The first is easy to set up, run, modify, and reuse, and
will run across processes on one machine. For example, one check can run across
any number of processes.
The second is more work to set up, but will run checks
across any number of machines or VMs. Running a single check across multiple
machines is how one does end-to-end checks for the Internet of Things.
Both samples use a free version of Visual Studio 2015, and
come with instructions.
Recognize the “oops!!” of “test automation” and of “test is just about finding bugs.” Grieve for a moment (we’re all entitled to that!) and then move on to something much better.
Quality automation is inevitable. MetaAutomation describes a
language-independent and platform-independent way to get there. Why a pattern
language of six patterns? Because that’s by far the best way I can describe it.
Vastly greater productivity, transparency, and team
happiness await.
Monday, May 23, 2016
Post #6 of 6: For Effective Quality Automation, Know the Limits
One of the very basic values of MetaAutomation is to know what the manual testing role is good at, and where it is, in fact, indispensable.
By “manual testing role” I mean any person on the team who ever does anything with the software system under test (SUT), is in a position to notice something awry, i.e., some issue where the software behavior (or even a non-functional quality attribute) does not meet the requirements or somebody’s expectations, and can characterize and record the issue (i.e., “bug” it) for the team so the issue can be considered for a potential action item to fix it. So, this does not necessarily require a person who is committed full-time to a test role or a manual test role.
People are smart. People are clever, innovative, flexible, and observant. People notice stuff and can communicate it to other people (or record it for their own use). Quality automation, the automation that makes and communicates quality measurements, is very good at measuring and reporting on performance, on the steps of driving the product, and on regressing functional requirements, but there is a lot it can’t see at all, because many things about the SUT are difficult, risky, or impossible to measure.
For example, a web page layout: is the page attractive,
readable, and usable? Quality automation won’t tell you. One needs the manual
test role to follow up. Fortunately, you’re doing this anyway, even if nobody on
the team thinks of him or herself as a “tester,” assuming somebody on the team
checks over the product before it goes live.
Poor understanding of the boundary between what manual test
is better suited for, and what quality automation should be written to do, is
expensive in terms of product cost and risk.
For example, I’ve seen too much put inappropriately on the manual testing role. On a credit union web app, giving manual testers ownership of verifying correct bank balances is very expensive and risky: that aspect of the product is very high priority, but for any number of reasons it might not get verified reliably by manual testers, and in any case it is slow to verify manually. The result is extra cost and risk.
I’ve also seen too much put on quality automation: trying to verify many low-priority and relatively superficial aspects with quality automation can be tricky to write and maintain, and flaky to run. For example, is a control on the screen the correct color? Unless there’s a high-priority functional requirement there, it’s better to skip that in quality automation. Checking such properties can make checks too complicated, slow, or flaky, or it can mean that too many checks run just to verify low-priority aspects of product quality.
Quality automation and manual testing have (or should have) a relationship: quality automation checks these things, and manual testing verifies the rest, notices odd stuff, and does exploratory testing. The relationship depends on knowing where the team has decided the boundaries and limits are; if the manual testing role doesn’t know what has already been checked for a given product version and build, there will be missed coverage and duplicated work.
Keep checks simple and well-documented so what is verified
is clearly understood. The Atomic Check pattern of MetaAutomation describes how
to make checks “atomic,” indivisible really, so that they can’t be simplified
by breaking up the check into smaller checks.
Documentation is good, but it can be expensive and risks falling out of date with minor changes. Atomic Check describes the ideal solution here: self-documentation of the checks! Even better, naturally hierarchical check steps enable self-documentation of the business-facing logic of the check, easily displayed, and of the atomic technology-facing check steps at the same time.
How does that work?
Download and run one or both samples on http://MetaAutomation.net to see this in
action. Step through the code, make changes, and even implement your own atomic
checks with hierarchical self-documenting steps.
If that is too much change for your team right now, here’s a
take-away you can use immediately: knowing the boundary between quality
automation and manual test reduces risk and effort, and ensures that no aspect
of product quality is unintentionally missed.
This page is #6 in a series of six, with the original post
here.
Friday, May 20, 2016
Post #5 of 6: Good Code Practices for Quality Automation
If MetaAutomation is too much
change for right now…
Start with making your code
as good as it can be.
I quote Elisabeth Hendrickson in the book on MetaAutomation:
It is tempting for organizations to treat infrastructure
code as somehow inferior to production code. 'It's just scripts,' they say. 'We
don't need to put seasoned engineers on it.' However what I've seen is that the
infrastructure code -- that includes build and CI scripts as well as tests and
test frameworks -- is the foundation on which the rest of the code is built. If
the infrastructure code is not treated with the same care as production code,
then everything is built on a shaky foundation.
Teams should follow the team’s adopted practices for product
code where it makes sense, but also consider making some changes for quality automation
code:
· Some types of security and performance
considerations for product code might be unnecessary in quality automation
code, since the latter will never ship outside the team’s control.
· Consider using some structures to help with quality
automation-specific code, e.g., to make the actions of driving the product
self-documenting in detail. The sample projects on http://MetaAutomation.net show
many details on this.
Here too, there are more details in the book on MetaAutomation.
If you have a choice, use a compiled language. The idea that an interpreted language like Python makes QA engineers more productive is an illusion, because not having a compiler on your side can result in runtime failures that a compiler would have caught, at significant cost in time spent following up. An interpreted language lets one write code faster, but that’s not the whole value story. A compiled language like C# will tell you about all sorts of problems immediately, and about many others almost immediately and certainly before code is committed to a repository. This saves a lot of time.
As a counter-example, I’ve also used Ruby for quality automation, and among the many other shortcomings of that language, it effectively hid problems from me until long after the code was written, causing significant cost and frustration.
Consider using the same language as the product code, so that less information gets lost at the boundary between the product and the quality automation code driving it, the dev role can contribute to quality automation as needed, and the QA role can know more of the product and even contribute to it as needed.
Personal anecdote: I’ve added XSL code to a product, mainly
written in Java, to make an important product web site vastly more testable
using both Java and XSL in the quality automation code. This kind of thing is
much easier if the languages are common between quality automation code and
product code.
Be careful to avoid copy-and-paste code; it causes maintenance cost, because the same code might have to be updated in multiple places, and if the coder misses one or more of the copies, more wasted time and cost result.
For checks that use a GUI or a web browser, always synchronize to objects or events if possible, rather than sleeping. This will help the performance and reliability of the checks.
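For example, with Selenium WebDriver in C# (a sketch only; the element id “submit” and the timeout are made up for illustration), an explicit wait on the condition you need replaces a fixed sleep:

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

public static class SynchronizationExample
{
    public static void ClickSubmitWhenReady(IWebDriver driver)
    {
        // Fragile and slow: a fixed sleep either wastes time or is too short.
        // System.Threading.Thread.Sleep(5000);

        // Better: synchronize to the condition the check actually needs, with a bounded timeout.
        var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(10));
        var submit = wait.Until(d =>
        {
            var candidates = d.FindElements(By.Id("submit")); // hypothetical element id
            var element = candidates.Count > 0 ? candidates[0] : null;
            return (element != null && element.Displayed && element.Enabled) ? element : null;
        });
        submit.Click();
    }
}
```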
A few more minor points:
· Maximize code reusability and reuse.
· Make symbol names descriptive, so code becomes self-documenting.
· Comment code but only if needed, and always at one level of abstraction above the code itself.
· Always do code reviews! It’s an opportunity to learn from each other and raise code quality and uniformity across the team.
This page
is #5 in a series of six, with the original post here.
Thursday, May 19, 2016
Post #4 of 6: Make Your Checks Simple
If
MetaAutomation is too much change for right now…
Make your checks simple, to improve the scalability of your check runs with more resources and to improve the business value of your quality data. (OK, that’s “improve the quality of your quality data,” and yes, I know that might sound silly.)
There is a pattern called “Chained Tests” that occurs in the wild and is described here: http://xunitpatterns.com/Chained%20Tests.html, but it’s actually a very bad idea; I’d call it an antipattern. The linked page goes into the negatives a bit, but I will add a more modern reason to NEVER chain your tests: they won’t scale. Chained tests must be run in a sequence, so it doesn’t matter how much computing resource you allocate, the check (aka “test”) run won’t go any faster.
As an extension of the “scalability” reasoning, make your checks simple. Each check should have no more than one target verification or verification cluster, and NOT have superfluous verifications that can slow the check down and/or make it less robust and more fragile than it would otherwise be.
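As a minimal sketch of what “atomic” looks like in practice (the BankApi stand-in below is hypothetical; a real check would drive the product itself), a check carries exactly one target verification and nothing superfluous:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Hypothetical in-memory stand-in for driving the SUT, so this sketch is self-contained.
public static class BankApi
{
    private static readonly Dictionary<string, decimal> Balances = new Dictionary<string, decimal>();

    public static string CreateTestAccount(decimal initialBalance)
    {
        string id = Guid.NewGuid().ToString();
        Balances[id] = initialBalance;
        return id;
    }

    public static void Transfer(string from, string to, decimal amount)
    {
        Balances[from] -= amount;
        Balances[to] += amount;
    }

    public static decimal GetBalance(string account) => Balances[account];
}

[TestClass]
public class TransferChecks
{
    // One atomic check: a single target verification (the destination balance), with no
    // superfluous asserts on colors or layout that would slow the check or make it fragile.
    [TestMethod]
    public void TransferBetweenAccounts_UpdatesDestinationBalance()
    {
        string source = BankApi.CreateTestAccount(100m);
        string destination = BankApi.CreateTestAccount(0m);

        BankApi.Transfer(source, destination, 40m);

        Assert.AreEqual(40m, BankApi.GetBalance(destination),
            "destination balance after a 40.00 transfer");
    }
}
```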
Here is a color version of one of the diagrams illustrating the Atomic Check pattern in the book on MetaAutomation:
Another
way to simplify the checks, and make them faster, is to use the Precondition
Pool pattern of MetaAutomation.
This
page is #4 in a series of six, with the original post here.
Wednesday, May 18, 2016
Post #3 of 6: Make your check fails self-documenting
If MetaAutomation is too much change for right now…
Verify that you’re doing all you can to minimize debugging
costs, in case of check failure.
1. Add preliminary verifications wherever null-reference exceptions could occur

A null-reference exception (or null-pointer exception for a native language) loses information: which symbol was null? There is often more than one possibility on a given line number. If there’s an ambiguity, it might require a debug session to find out, and in some cases the line number might have changed between the version of the code you can access and the one that was running and threw the exception.
Labeled preliminary verifications, e.g., asserting non-null with the message “<some API> returned null,” will find the condition sooner and show it with much better specificity than if there were no such verification. With complex code, this can make the difference between a clearly understood root cause of check failure and a very confusing one.
If there isn’t a preliminary verification where the null could potentially happen, and the check failure can’t be reproduced easily, a debug session can be either very expensive or even useless, meaning that the information of the failure is lost forever.
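A minimal C# sketch of a labeled preliminary verification (the helper name Verify.NotNull and the SUT calls in the usage comment are hypothetical):

```csharp
using System;

public static class Verify
{
    // A labeled preliminary verification: fail immediately with a message that names
    // exactly which call returned null, instead of letting a later null-reference
    // exception hide the root cause.
    public static T NotNull<T>(T value, string label) where T : class
    {
        if (value == null)
            throw new InvalidOperationException(label + " returned null");
        return value;
    }
}

// Usage in a check (GetAccountPage and FindBalanceLabel are hypothetical SUT calls):
//   var page = Verify.NotNull(client.GetAccountPage(accountId), "GetAccountPage");
//   var balanceLabel = Verify.NotNull(page.FindBalanceLabel(), "FindBalanceLabel");
```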
2. Verify that asserts or verifications are self-documenting

Labels for thrown exceptions add value in two ways: first, they describe what happened, to make resolution of a check failure easier, and second, they ensure that there is no ambiguity about where or under what circumstances the failure occurred. Using the built-in exceptions can make sense for product code, where the standard for performance is high, but for quality automation code, use an assert or a custom exception to make the failure explicitly self-describing.
3. Add more log statements around points of failure
Consider the difference in cost between adding a few more
log statements around parts of checks that are likely to fail, and reproducing
and debugging through the failure, maybe more than once, especially if the
failure is hard to reproduce.
Adding log statements is much cheaper and lower-risk.
4. Make all checks fail on an exception, even negative cases

For negative checks, where the product is expected to reject the operation, enclose the expected failure in code that verifies the expected failure condition, and then throw another exception if the expected failure does not occur, to fail the check.

Exceptions are intended to be used in exceptional circumstances. Overuse of exceptions, or too much code that is expected to be bypassed in case of an exception, reduces reusable code in the quality automation infrastructure (because a new pattern would be required for coding the negative check method signatures) and makes any cleanup more complex.
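A minimal sketch of that idea in C# (the helper, the exception type chosen as the “expected rejection,” and the SUT call in the usage comment are all hypothetical):

```csharp
using System;

public static class NegativeChecks
{
    // Run an action that the product is expected to reject, and fail the check
    // (by throwing) if the expected rejection does not happen.
    public static void ExpectRejection(Action driveProduct, string description)
    {
        try
        {
            driveProduct();
        }
        catch (InvalidOperationException)
        {
            return; // the expected failure condition occurred; the negative check passes
        }
        throw new Exception("Expected rejection did not occur: " + description);
    }
}

// Usage (Withdraw is a hypothetical SUT driver call):
//   NegativeChecks.ExpectRejection(
//       () => bankApi.Withdraw(account, 1_000_000m),
//       "withdrawal exceeding the account balance");
```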
The Atomic Check pattern of MetaAutomation shows how to do all
this with much more structure, transparency, and detailed hierarchical
self-documenting check code.
This page is #3 in a series of six, with the original post
here.
Tuesday, May 17, 2016
Post #2 of 6: Basic Check Design
If MetaAutomation is too much change for right now…
Here are some easy steps to take:
1. Make your checks independent of each other.
This is how checks can be fast and scalable: they must be independent of each other. Even the simplicity of the checks depends on their being independent of each other. Remember that, in case of a failure, the value of what a check can find depends on simplicity: with long or complex checks, things get muddy, and any business value can be clouded and even lost, because it becomes too much work to debug through the check to get the actionable information out, surface it, and describe it for the benefit of the larger team.
This is a core principle of the Atomic Check pattern of MetaAutomation: all checks are independent of each other. This is how checks can scale across resources. If the “Chained Tests” antipattern is used instead, checks can’t scale at all, because they still have to happen in a time sequence, and any failure might depend on an operation earlier in the sequence, so the value of any issue found depends first on a lot of work to reproduce that sequence of operations.
2. Prioritize your checks based on business requirements (or even functional requirements)
It seems obvious, doesn’t it? Prioritize… work on the more
important parts of your product first.
If your team doesn’t have business requirements, you probably
don’t know what you’re building, and the risk is that you’re building the wrong
thing for your customers. See the excellent book by Robin Goldsmith on this
topic, for example.
At least, however you’re defining your requirements (e.g.,
with user stories), prioritize them and prioritize your checks based on
requirements.
This is part of the Atomic Check pattern as well.
3. Ensure that your automated checks are completely repeatable. If needed, record the parameters used so you can repeat them as needed.
Whether or not you’ve ever had a failure in any one check, you need to be able to repeat the check exactly. Otherwise, you can’t be sure that quality is always getting better, because the aspect of quality you measured before isn’t necessarily the aspect you’re measuring now, and maybe you just got lucky today.
In case of failure, you need a record so you can reproduce the failure or, if you can’t reproduce it, go looking for what the real problem is, e.g., a race condition somewhere.
This is part of the Atomic Check pattern as well.
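A minimal sketch of recording the parameters a check used (the names and XML shape here are illustrative, not prescribed by MetaAutomation), so the exact run can be repeated later:

```csharp
using System.Xml.Linq;

public static class CheckParameters
{
    // Record every parameter the check used, including any randomly chosen values,
    // in the check artifact, so exactly the same check can be repeated later.
    public static XElement Record(string checkName, params (string Name, object Value)[] parameters)
    {
        var recorded = new XElement("Parameters", new XAttribute("check", checkName));
        foreach (var (name, value) in parameters)
            recorded.Add(new XElement("Parameter",
                new XAttribute("name", name),
                new XAttribute("value", value)));
        return recorded;
    }
}

// Usage:
//   var recorded = CheckParameters.Record("TransferBetweenAccounts",
//       ("sourceAccount", "test-7431"), ("amount", 40.00m), ("randomSeed", 91247));
```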
4. Move setup and teardown operations out-of-line.
Any check might have preliminary setup steps and/or
tear-down steps that can be moved out-of-line. The question is this: can any
setup or teardown be done asynchronously and in some other process, in some
other memory space?
For example, allocating and initializing an environment in which to run a check: that should be done out-of-line. Allocating, initializing, or re-initializing a user identity: that should be done out-of-line. Any other state for the check, e.g. some file image or system state, that can be established out-of-line and out-of-process should be handled that way, because it means that your checks can run faster and more scalably.
The Precondition Pool pattern of MetaAutomation addresses
this.
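A minimal sketch of the Precondition Pool idea (the TestUserPool type and the setup delegate are hypothetical names, not from the MetaAutomation samples): expensive preconditions such as test user identities are satisfied out-of-line and ahead of time, so a check just takes one from the pool:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Illustrative sketch: satisfy expensive preconditions (here, test user identities)
// asynchronously and out of the check's critical path.
public sealed class TestUserPool
{
    private readonly BlockingCollection<string> pool = new BlockingCollection<string>();

    public TestUserPool(Func<string> createUser, int size)
    {
        // Fill the pool in the background, out-of-line from any check.
        for (int i = 0; i < size; i++)
            Task.Run(() => pool.Add(createUser()));
    }

    // A check calls this to get a ready-to-use test user without paying setup cost in-line.
    public string Take() => pool.Take();
}

// Usage (CreateUserViaProductApi is a hypothetical, slow setup operation):
//   var users = new TestUserPool(CreateUserViaProductApi, size: 20);
//   ...
//   string user = users.Take(); // fast: the expensive setup already happened
```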
This page is #2 in a series of six, with the original post
here.
Monday, May 16, 2016
Post #1 of 6: If MetaAutomation is too much change for right now…
Most of the patterns and values of MetaAutomation depend on
the pattern Atomic Check, and Atomic Check might be a significant change from
the way you do quality automation right now.
So, if you’re not ready to take on the big change right now,
what can you use of MetaAutomation right now to improve your quality
automation?
Following is a series of five blog entries to address some
low-cost things you can do to improve the value of your quality automation.
Tuesday, April 19, 2016
The New Pattern for Quality Automation
There’s a new pattern in town. Actually, it’s a pattern language
called MetaAutomation, composed of six patterns, every one based on existing patterns
of problem solving in information technology (IT) but with the addition of World
Wide Web Consortium (W3C) XML technologies to make them much more powerful in
combination.
The existing pattern “Test Automation Framework” (TAF) (Meszaros,
“xUnit Test Patterns,” 2007, p. 298) describes existing practices for driving
and measuring a software product under development. This works, and it delivers
value of course, but it’s limited. With TAF, for example:
· The customer of the information is people on the QA team. Somebody on the QA team must manually interpret and present the information to the wider software team.
· The difference between what a human tester and an “automated test” can do is not addressed, causing business risk and opportunity cost.
· The issue of blocked product quality measurements due to failures and flaky checks is not addressed.
· The issue of prioritization of measurements is barely addressed.
· The goal of actionable check failures is not addressed.
MetaAutomation solves all of these problems, and more:
· MetaAutomation brings a much higher level of visibility and respect to the QA role
· MetaAutomation brings higher visibility to the developer role
· MetaAutomation breaks down silo walls with speed and transparency
· MetaAutomation shows how a single check can drive and measure an Internet of Things (IoT) product on multiple tiers
The costs of MetaAutomation? First, take on some paradigm
shifts. These are enumerated on http://MetaAutomation.net.
Second, the quality automation code needs as much care and
detail as product code. The team needs at least one person with software
development skills to be a part of the QA team or role. There are working
solutions available for free on http://MetaAutomation.net.
The diagram below shows how the six patterns of
MetaAutomation compose to form the pattern language, and how they fit in the
context of the business space. The TAF pattern addresses the QA role, but
MetaAutomation delivers value all across the team, even up to the executive
suite for Sarbanes-Oxley (SOX) compliance.
Thursday, February 18, 2016
DevOps needs MetaAutomation for Quality
DevOps emphasizes collaboration and communication between
all members of a software team, including leads, designers, devs, QA and test
people. (See the Wikipedia article on DevOps.)
It needs speed and reliability.
A weak quality practice won’t work – regressing and testing
can take huge amounts of time, and measuring software quality is more difficult
if it’s hampered by outmoded ideas, e.g. “The point of Test is to find bugs,” or
“Test automation does what manual testers do, but faster and more reliably … we
hope.”
MetaAutomation shows the way to the powerful quality
automation that DevOps needs:
1. The team knows exactly what is measured by the automated checks, and what is not
2. With self-documenting hierarchical check steps that report “pass,” “fail,” or “blocked” for each step, the correctness of product behavior is clear from the business-facing view and the granular technology-facing view
3. Since the artifacts (results) of the check runs are pure data, presentation is completely flexible, and analysis is robust
4. The manual testing role never has to repeat tedious steps, so they can do what they’re good at, which is more fun and satisfying anyway
5. The QA team delivers quality measurements with speed, accuracy and precision, earning the respect and esteem of their fellow team members
6. Checks can be run across tiers and devices, with unified artifact results, to make quality measurement simpler, more robust, and less risky
7. Check runs scale with resources, so results are available almost arbitrarily fast, giving transparency and trust to the rest of the team
Check out http://MetaAutomation.net
for more information, and some complete open-source sample implementations with
reusable components.