Wednesday, August 5, 2015

Stronger Quality with MetaAutomation: Part 2 of 3, Handling a Check Failure


What happens on your team when a check (what some call “automated test”) fails?

If you follow the common practice of automating manual test cases, the authors of the automation are expected to follow up and diagnose the problem. Usually the failure is in the automation itself, and everybody on the team knows that, so the follow-up generally doesn’t get much priority or respect; even when it does, resolving the failure is very time-consuming and labor-intensive.

Alternatively, the author of the automation watches the automation proceed to see if something goes wrong. That shortens the communication chain, but it’s very time-consuming, expensive, and doesn’t scale at all.

Regression tests or checks that are effective at managing quality risk must be capable of quickly sending action items outside the test/QA team. False positives, i.e., messages on quality issues that ultimately turn out not to concern the product at all, are wasteful and corrode trust in the test/QA team. Therefore, quality communications must be quick and trustworthy for test/QA to be effective.

On check failure, people can look at flight-recorder logs that lead up to a point of failure, but logs tend to be uneven in quality, verbose, and not workable for automated parsing. A person has to study them for them to have any value, so the onus is on test/QA again to follow up. Bottom-up testing, or testing at the service or API layer, helps, but the problem of uneven log quality remains. Mixing presentation with the data, e.g., English grammar or HTML, bloats the logs.

Imagine, instead, an artifact of pure structured data, dense and succinct, whether the check passes or not. Steps are self-documenting in a hierarchy that reflects the code, whether they pass, fail, or are blocked by an earlier failure.

MetaAutomation puts all of this information in efficient, pure data with a schema, even if the check needs to be run across multiple machines or application layers.
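As a rough illustration of what such an artifact could look like, here is a minimal C# sketch using LINQ to XML. The element and attribute names are hypothetical, invented for this sketch rather than taken from a published MetaAutomation schema:

```csharp
using System;
using System.Xml.Linq;

// Minimal sketch: a check run recorded as pure, hierarchical, schema-friendly data.
// The element and attribute names are illustrative only, not a published schema.
public static class CheckArtifactSketch
{
    public static void Main()
    {
        var artifact =
            new XElement("CheckRun",
                new XAttribute("name", "PlaceOrder_MinimalItem"),
                new XAttribute("result", "Fail"),
                new XElement("Step",
                    new XAttribute("name", "CreateSession"),
                    new XAttribute("status", "Pass")),
                new XElement("Step",
                    new XAttribute("name", "AddItemToCart"),
                    new XAttribute("status", "Fail"),
                    new XElement("Failure",
                        new XElement("ExceptionType", "System.TimeoutException"),
                        new XElement("Message", "Cart service did not respond"))),
                // A step downstream of the failure is recorded as blocked, not silently skipped.
                new XElement("Step",
                    new XAttribute("name", "VerifyOrderTotal"),
                    new XAttribute("status", "Blocked")));

        // Pure data, with no presentation mixed in; XSL or another presentation layer can be applied later.
        Console.WriteLine(artifact);
    }
}
```

Because every step is a node with a status, the same artifact shape serves a passing run and a failing run equally well.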

A failed check can be retried immediately, and on failure, the second result compared in detail to the first. Transient failures are avoided, and persistent failures are reproduced. Automated analysis can determine whether the failure is internal or external to the project, and even find a responsible developer in the product or test role as needed.

If so configured, a product dev would receive an email if a) the exact failure was reproduced, and b) the check step, stack trace, and any other data added by check code indicate ownership.
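As a sketch of the comparison step only, reusing the hypothetical artifact shape from above (this is illustrative, not the MetaAutomation reference implementation):

```csharp
using System.Linq;
using System.Xml.Linq;

// Minimal sketch: decide whether a retry reproduced the original failure by comparing
// the two pure-data artifacts. Element and attribute names are hypothetical.
public static class SmartRetrySketch
{
    // True if both runs failed on the same step with the same exception type.
    public static bool IsReproduced(XElement firstRun, XElement retryRun)
    {
        var first = FirstFailedStep(firstRun);
        var retry = FirstFailedStep(retryRun);
        if (first == null || retry == null)
            return false;   // e.g., the retry passed, so the original failure was transient

        return (string)first.Attribute("name") == (string)retry.Attribute("name")
            && (string)first.Element("Failure")?.Element("ExceptionType")
               == (string)retry.Element("Failure")?.Element("ExceptionType");
    }

    private static XElement FirstFailedStep(XElement run) =>
        run.Elements("Step").FirstOrDefault(s => (string)s.Attribute("status") == "Fail");
}
```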

Atomic Check shows how to run end-to-end regression tests so fast and reliably that they can be run as check-in gates in large numbers. Check failures are so detailed that the probability that a problem needs to be reproduced is small.

With MetaAutomation, communications outside test/QA are both quick and trustworthy. See parts 1 and 3 of this series for more information.

Tuesday, August 4, 2015

Stronger Quality with MetaAutomation: Part 1 of 3, Fast Quality


Manual testing and programmatically-driven quality measurements are both very important for software quality; however, they are each good at very different things.

Please notice that I’m avoiding the word “automation” here; there’s good reason for that. I write more on this topic here: http://metaautomation.blogspot.com/2015/07/the-word-automation-has-led-us-astray.html.

Freedom from the word “automation,” and liberation from the assumption that programmatically driven quality measurements should be like manual testing in any way, have similar benefits: they open up new frontiers in productivity for managing quality and risk for your software project.

Imagine focusing on prioritized business requirements, at the software layer closest (if at all possible) to where those business items are implemented. Writing just one check – that is, a programmed verification – per business requirement makes for simpler, faster checks.

This is one of the reasons I created MetaAutomation: to highlight more effective techniques to programmatically measure and report on the quality of the SUT.

The least-dependent pattern of the MetaAutomation pattern language is Atomic Check. This pattern shows specific techniques to create a focused verification, or verification cluster, on an item of business logic. There are no verifications other than the target verification, i.e., the target of the check, and whatever check steps are needed to get to the target.

Simple, focused checks run faster and contain fewer points of failure. Atomic Check also describes how to make the check independent of all other checks, so that running your set of checks will scale across resources and your selected set of checks can be as fast as you need them to be.
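A minimal sketch of what an atomic check could look like in C#; the OrderService API and its members are hypothetical stand-ins for a system under test, not actual MetaAutomation sample code:

```csharp
using System;

// Minimal sketch of an atomic check: one prioritized business requirement, one target
// verification, and only the steps needed to reach it. OrderService and its members are
// hypothetical stand-ins for the SUT's API.
public static class AtomicCheckSketch
{
    public static void PlacingAnOrder_ReturnsAConfirmationNumber()
    {
        // Steps needed to reach the target verification; no extra verifications along the way.
        var service = new OrderService();
        var cart = service.CreateCart();
        cart.Add(itemId: 1234, quantity: 1);

        // The single target verification, tied directly to the business requirement.
        var confirmation = service.PlaceOrder(cart);
        if (string.IsNullOrEmpty(confirmation.Number))
            throw new Exception("Business requirement not met: no confirmation number returned.");
    }
}
```

Because the check shares no state with any other check, any number of such checks can run at the same time.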

Atomic Check also creates an artifact of the check run in strongly-typed and pure data, e.g., a grammar of XML. This has many benefits: it enables another very useful pattern, Smart Retry, which I’ll write more about in part 2, and it addresses visibility, quality analysis, and SOX compliance, which I’ll discuss in part 3.
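For instance, with XML plus an XSD grammar, the artifact can be machine-validated before any analysis. A minimal sketch, assuming hypothetical file names CheckRun.xsd and artifact.xml:

```csharp
using System;
using System.Xml.Linq;
using System.Xml.Schema;

// Minimal sketch: because the artifact is pure data with a grammar, it can be validated
// against a schema before any downstream analysis. "CheckRun.xsd" and "artifact.xml"
// are hypothetical file names, not part of a published MetaAutomation distribution.
public static class ArtifactValidationSketch
{
    public static void Main()
    {
        var schemas = new XmlSchemaSet();
        schemas.Add(targetNamespace: null, schemaUri: "CheckRun.xsd");

        var artifact = XDocument.Load("artifact.xml");
        artifact.Validate(schemas, (sender, e) => Console.WriteLine(e.Message));

        Console.WriteLine("Validation complete.");
    }
}
```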

Tuesday, July 14, 2015

Automation Gets No Respect


The world of software automation for quality is riddled with failures. When the people creating the automation of the software under test (SUT) fail to create reliably-running tests, or it becomes clear that this effort takes more time than first estimated, management and the rest of the team lose confidence. Driving the SUT through scenarios is too often seen as a risky, low-value afterthought. After all, the developers can test the product themselves and learn the things they expected the test team to tell them anyway, but which the test team, for some reason, can’t reliably deliver.

In any case, the conventional approach to software automation for quality creates a losing situation for the people doing the work.

If they are told that the highest-value automation is end-to-end automation of the product, including the web page or GUI, they are likely doomed to write a system that creates many false positives – i.e., test failures that have nothing to do with product quality – which in turn create more work for them, because they must follow up with a debug session just to discover whether there is an actionable piece of information for the rest of the team.

The broader team pays little attention to the results from the checks because they know

1.       False positives are common, and if there really was a product bug, the authors of the check would discover that in a debug session and tell them.

2.       The checks don’t measure what they’re designed to measure, because they can’t possibly match the perception and smarts of a human testing the SUT directly.

With the correct focus on verifying and regressing the business requirements of the SUT, rather than on automating the SUT to make it do stuff, the false-positive problem and the what-is-the-check-verifying problem go away. I created MetaAutomation to describe how to take the optimal approach to solving these problems and creating many other benefits along the way:

·         The focus is on prioritized business requirements, not manual tests or scenarios

·         Checks run faster and scale better with resources

·         Check steps are detailed and self-documenting, with pass, fail or blocked status recorded in the results

·         Check artifacts are pure data, to enable robust analysis on results and across check runs

·         The quality measurement results are transparent and available for sophisticated queries and flexible presentations across the whole team

·         Performance data is recorded in line with the check step results

With MetaAutomation, the test and QA role can produce the speedy, comprehensive, detailed, and transparent quality information to ensure that functional quality always gets better.

If you want to give the test/QA role the importance and respect it deserves, try MetaAutomation!

Monday, July 13, 2015

The Word “Automation” Has Led Us Astray

If you’ve written automation for software quality, there’s a good chance you did it wrong. Don’t feel bad; we’ve all been doing it wrong. It’s not our fault.
We were led astray by the word “automation.”
Automation is about automatically driving human-accessible tools to accomplish specific tasks.
When people started programmatically driving their software system under test (SUT) for quality purposes, overloading the word “automation” seemed like the best way to describe the new practice, but the word actually applies poorly to software quality. “Automating” your SUT means making it do stuff, and that’s usually how management measures output. Quality verifications are added as an afterthought, and since they have little to do with the “automation” objective, they tend to be poorly planned and documented.
The word “automation” for software quality distracts people from the business value of the activity: measuring quality.
The misunderstanding that automation for software quality is just doing what humans do (i.e. manual testing), but doing it faster and more often, causes business risk:
·         Unless you’re very clear and specific on what is being measured, the quality measure is incomplete and manual testers must verify it anyway.
·         Manual tests are designed by people to be run by people. They do not make ideal automated measurements because they tend to be long, complicated, and burdened with too many or too few verifications.
Automated verifications of services and APIs tend to be more effective, but by that definition this isn’t “automation” either.
At the core of the paradigm shift is an important dichotomy:
People are very good at
·         Finding bugs
·         Working around issues
·         Perceiving and judging quality
But, they’re poor at
·         Quickly and reliably repeating steps many times
·         Making accurate measurements
·         Keeping track of details
Computers driving the SUT are very good at
·         Quickly and reliably repeating steps many times
·         Keeping track of details
·         Making accurate measurements
But, computers are poor at
·         Perceiving and judging quality
·         Working around issues
·         Finding bugs
Conventional automation for software quality misses these distinctions, and therefore comes with significant opportunity costs.
To create software that matters and be effective and efficient at measuring quality, your team must move away from conventional misguided “automation” and towards a more value-oriented paradigm. To describe this value, I created MetaAutomation.
MetaAutomation is a pattern language of 5 patterns. It’s a guide to measuring and regressing software quality quickly, reliably, and scalably, with a focus on business requirements for the SUT. The “Meta” addresses the big-picture reasons for the effort, the nature of what automated actions can do, and the post-measurement action items and efficient, robust communication where “automation” by itself fails.
MetaAutomation shows how to maximize quality measurement, knowledge, communication and productivity. Imagine e.g. self-documenting hierarchical steps automatically tracked and marked with “Pass,” “Fail,” or “Blocked.” Imagine a robust solution to the flaky-test problem!
For trustworthy, prioritized, fast, scalable, clear and around-the-team presentable quality results, try MetaAutomation!
The other posts on this blog clarify the MetaAutomation pattern language, but unfortunately the topic is also much too big for a blog post, so see also the definitive book on the topic here: http://www.amazon.com/MetaAutomation-Accelerating-Automation-Communication-Actionable/dp/0986270407/ref=sr_1_1
 

Tuesday, May 12, 2015

Innovative MetaAutomation


Is MetaAutomation actually innovative?

From the Wikipedia entry on “Innovation:”

Innovation is a new idea, more effective device or process. Innovation can be viewed as the application of better solutions that meet new requirements, inarticulated needs, or existing market needs. This is accomplished through more effective products, processes, services, technologies, or ideas that are readily available to markets, governments and society. The term innovation can be defined as something original and more effective and, as a consequence, new, that "breaks into" the market or society.

By this definition, yes, MetaAutomation is very innovative. Here are some ways that MetaAutomation speaks to the Wikipedia entry:

New Idea


What happens after or as a result of an automation run is now explicitly important, with details on why this is true and what can be achieved with strong, focused data on the automation run. It goes far beyond what can be achieved with conventional automation or even automation that leaves no more than a flight-recorder log and a Boolean result as artifacts.

You could even call that the “Meta” of MetaAutomation.

MetaAutomation includes several paradigm shifts:

1.       Manual testing and automated testing bring different value to the team, and if done effectively, they will excel in different ways. If automation is focused on what it does well, it has quality value far beyond what the team gets by automating manual test cases.

2.       For automation artifacts, a strongly-typed, focused, pure data solution is much better than a flight-recorder log stream: it is more compact, carries richer information, lets presentation be applied later so it stays flexible (this is especially easy with XML and XSL, and optionally XSD as well, all of them W3C standards), and is much more powerful in downstream solutions, e.g., analysis, flexible reporting, and automated communications.

Ever tried to automate parsing of streamed flight-recorder logs, where the format of the log entries includes English grammar and there are few format rules? I have. It’s nearly impossible to do this reliably.
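By contrast, queries over pure-data artifacts are trivial. A minimal sketch, assuming a folder of artifact files in which each check step is an element with name and status attributes (all names hypothetical):

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Minimal sketch: with pure-data artifacts, a question like "which steps fail most often
// across all recent runs?" is a simple query. File and element names are hypothetical.
public static class ArtifactQuerySketch
{
    public static void Main()
    {
        var runs = Directory.GetFiles("artifacts", "*.xml")
                            .Select(XElement.Load);

        var failuresByStep =
            runs.SelectMany(run => run.Descendants("Step"))
                .Where(step => (string)step.Attribute("status") == "Fail")
                .GroupBy(step => (string)step.Attribute("name"))
                .OrderByDescending(group => group.Count());

        foreach (var group in failuresByStep)
            Console.WriteLine($"{group.Key}: {group.Count()} failures");
    }
}
```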

Even given established practices, current investments, and the dissonance of introducing ideas that are very new and different, MetaAutomation (or, something a lot like it) is inevitable for software that matters to people.

More Effective Process


MetaAutomation includes the process or processes by which quality information is communicated and even made actionable and interactive around the team. This really only works when automation is reliable, fast and scalable, clear and actionable.

Without MetaAutomation, there tends to be failed automation that is ignored because it is assumed, usually correctly, that the automation failure is not about a product quality issue. If the automation has recently failed, resolution to an action item often must wait until one of the automation authors has time to reproduce the problem and step through to debug the issue, which again usually has nothing to do with actual product quality.

Inarticulated Needs


To my knowledge, the need to have automation be reliable, actionable, self-documenting and interactive throughout the team has not ever been articulated as part of one concept (although, the reliable part has been attempted before at least, and the interactive part exists, depending on team process). Quality information that covers all business-logic behaviors and persists data for efficient and robust analysis over any time span represented by the data? That might exist for specific proprietary applications such as aviation or military, but it’s not publicly available, and in any case, I suspect that MetaAutomation does it better.

My last blog post addressed the connection between MetaAutomation and Sarbanes-Oxley (SOX) compliance, and in turn, company valuation (for a publicly-traded software company). This is a real value, but AFAIK it has not been articulated before.

Readily Available


MetaAutomation is a pattern language, a series of five design patterns with a defined dependency. It’s language-independent, although the freely-available sample implementation is written in C#, as are the library and sample that are nearly complete. The upcoming library puts performance information into the XML elements that describe the steps of the check, and is distributed, so it can run a single check against multiple (virtual) machines.

The book defines the patterns with UML diagrams, clear reasoning, references etc. and describes the sample code.

Original


The closest analog I can find to MetaAutomation is the book “xUnit Test Patterns” by Meszaros. This book is useful, and I reference it from the MetaAutomation book, but there are major differences as well:

xUnit Test Patterns is a survey of existing practices. It does a nice job of describing the patterns and in some cases connecting them together, but the patterns are not original with Meszaros. It’s a large set of patterns, most of which I would not recommend following, for specific engineering and quality reasons I won’t go into here.

Some of the patterns of MetaAutomation (e.g. Precondition Pool, Parallel Run) are related to existing patterns and practices in software engineering. Smart Retry is related to a Retry pattern in test automation practice, but MetaAutomation is able to do it much more efficiently because it depends on Atomic Check, the least dependent and possibly most original pattern of MetaAutomation.

The pattern language is very original, and the pattern-language form is necessary to show the dependencies between the patterns. With Atomic Check, you can achieve a strong return in the value of your quality measurements, but also, if you choose, you and your team can go on to implement the other patterns of MetaAutomation.

Breaks


MetaAutomation is neither incremental nor additive. It’s not a tool you can buy.

I’m not saying that existing automation practices don’t add value; in most cases, they do.

I am saying that the value of MetaAutomation can only be achieved by approaching automation differently from conventional practices. The adoption bar is high because MetaAutomation might require you to start over with automation for software quality.

Innovation


For me, MetaAutomation has been a labor of love. I love software that works, and I love addressing big-picture issues with intensity, tenacity and intellectual courage. It hasn’t been easy, and sometimes my personality pushes people away, but I’m proud to think differently.

 

Monday, May 11, 2015

MetaAutomation and Sarbanes-Oxley (SOX)


SOX is about accuracy and transparency in accounting and protection for investors in publicly-traded companies. The standards of SOX were enacted as US federal law in 2002.

Remember the collapse of Enron? SOX prevents that kind of thing.

It was such a great idea that it was subsequently imitated in many other countries around the world.

SOX Title III is on corporate responsibility, including the accuracy and validity of corporate financial reports. Section 302 (in Title III) mandates a set of “internal controls” which in turn have requirements for timeliness, accuracy and completeness of internal communications at a company about assets and operations.

SOX Title IV is on financial disclosures, and requires internal controls assuring accuracy and completeness. Section 404 focuses on risk assessment and disclosure of the effectiveness of a company’s internal controls.

At a software company, or a company that creates software as part of the business, these controls are part of the company’s information technology controls or IT controls.

MetaAutomation creates very strong stories for risk management through:

1.       Complete, detailed and accurate assessments of software product quality, focused on business requirements of the system

2.       Actionable quality events around regressions, found and delivered fast enough to prevent or quickly fix failures found by automated testing

3.       A very detailed, searchable and presentable record of software quality that uniformly spans time and all the business behaviors of the product that are accessible to automated testing

The “…timeliness, accuracy and completeness of internal communications…” on quality issues of software development assets is assured with MetaAutomation, to a greater degree than possible with any kind of automation that only creates English-grammar flight-recorder logs. For developing software, on the quality side, Section 302 is covered!

For “…risk assessment and disclosure…” same thing. Visibility and interactivity with the quality data is very high. Section 404 is covered, too!

MetaAutomation reduces the cost of SOX compliance while improving corporate governance. Research has shown that this has a significant positive effect on company valuation (see This paper, and if you don't have access, it's easy and free to sign up.)

Quote from the paper: “The overall regression results are consistent with the view that SOX has a favourable long-term favourable impact.”

The adoption costs of MetaAutomation are not trivial, but improved company valuation is potentially quite significant.

Saturday, April 18, 2015

The valid concerns of Dorothy Graham, and how MetaAutomation solves them


This is a brief post to describe how MetaAutomation solves existing and well-known problems in automation for software quality. Below are two examples.

There’s an interview of Dorothy Graham published April 10th:


In the interview, Graham talks about her extensive experience with testing in general and automation in particular.

Example 1:

The interviewer (Josiah Renaudin) asks her about the differences between manual and automated testing, and Graham tells a story about a hypothetical test failure:

“Oh! Something failed. Test thirty seven failed. Okay, what was test thirty seven doing? You have no idea what's going on and you have to build that context before you can start looking at it and seeing whether it actually is a bug or isn't…. This is something which a lot of people don't take into account when they're thinking about automation, is that failure analysis can take a lot longer with automated testing than with manual testing.”

This is a very important problem, and MetaAutomation has a very complete solution to it. If you follow the Atomic Check pattern, you have atomic checks that are self-documenting with compact, structured artifacts. So the number “thirty seven” doesn’t matter anymore, because the checks are atomic, and the questions “what was it doing? what was the error?” are answered in great detail in the compact, presentation-free artifact of the check run.

Example 2:

Graham says:

"You don't want to have automated tests, you want to have automated testing, and the more things around the execution of tests that you can automate, the better."

MetaAutomation is a language of five patterns, which serve as a guide to the tremendous business value that can be achieved by implementing these patterns and accentuating the value of your Atomic Check implementation. In effect, this is exactly what Graham is talking about: automating not just the checks themselves, but analysis, communications, and other action items around the automation you’re doing for software quality.

 

 

Monday, March 23, 2015

Quick summary of MetaAutomation, the Origin and the Importance


This post is a brief explanation of the origin and importance of MetaAutomation.

Please see earlier posts for a dependency diagram of the patterns of MetaAutomation.

MetaAutomation began as an effort to stop the wasteful and expensive practice, common to conventional automation for quality, of dropping important and actionable bits on the floor. The waste, delay, and impact on team trust and communication happen every time an automated test fails and the root cause is not immediately clear.

There are three common but unproductive practices to fix at the same time:

1.       The exclusive search for bugs, at the expense of the larger quality picture, provides simple measures of productivity but neglects the end-user experience and the larger picture of quality measurement.

2.       The common practice of creating automation from manual test cases tends to cause slow, brittle automation and a failure to prioritize quality measurements by business behavior.

3.       The relevance and/or meaning of running conventional automation must be manually interpreted and communicated to the larger team, whether it passes or fails. If this communication does not happen effectively, the larger team loses trust in what the automation is doing for product quality.

MetaAutomation addresses bottom-up and end-to-end automation to measure product quality. It is a pattern language of five patterns that describe language-independent and platform-independent approaches to fix these problems and create fast, scalable, focused verifications of prioritized business behaviors, without the randomization of false negatives or false positives, with compact and robust artifacts for analysis, and actionable results directed to the people who need to know. It is a pattern language, rather than a simple set of patterns, because the five patterns have a dependency order that suggests in turn an order of implementation.

The first and least-dependent pattern of MetaAutomation is called Atomic Check. The name “check” is important because the verifications are explicitly planned and limited, rather than being open-ended as with a manual test. “Atomic” indicates that the procedure is as small and simple as possible, while still verifying the targeted business behavior and being independent of all other atomic checks.

The next pattern, Precondition Pool, manages preconditions, states, or other entities that are needed for fast and scalable check runs. This depends on another requirement of Atomic Check, that is, that the obsolete setup and teardown patterns in automated tests are eliminated. The check itself contains preliminary steps if they are needed in line with the body of the check, but if they can be managed outside of the check timing and process, they happen as part of the Precondition Pool implementation. For embedded systems, a common task for Precondition Pool is to maintain users and user settings. Doing this as part of a process that is independent of the check run allows the checks to run significantly faster.
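A minimal sketch of the idea, with test users as the pooled precondition; the TestUser type is a hypothetical stand-in for whatever state a given project needs:

```csharp
using System.Collections.Concurrent;

// Minimal sketch of a Precondition Pool: preconditions (here, test users) are created and
// maintained outside the timed body of the checks, so a check just takes one and starts.
// TestUser is a hypothetical stand-in for project-specific state.
public class TestUser
{
    public string Name { get; set; }
}

public class PreconditionPool
{
    private readonly BlockingCollection<TestUser> _pool = new BlockingCollection<TestUser>();

    // Called by a background process that keeps the pool topped up, independent of check runs.
    public void Add(TestUser user) => _pool.Add(user);

    // Called from a check: returns immediately when the pool has a ready user, blocks otherwise.
    public TestUser Take() => _pool.Take();
}
```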

The Parallel Run pattern depends on the checks being small, so they can be run quickly, and independent of all other such checks. Parallel Run addresses the means to scale the checks across any number of independently-running environments, so that they can run simultaneously and scale with resources.
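A minimal sketch of the scaling idea; in a real implementation each check would be dispatched to its own environment, but it is the independence of the checks that makes the fan-out safe:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Minimal sketch of Parallel Run: because atomic checks are independent of one another,
// the whole set can be started concurrently, and total run time approaches the time of
// the slowest single check (given enough environments or agents).
public static class ParallelRunSketch
{
    public static Task RunAll(IEnumerable<Func<Task>> independentChecks)
    {
        // In practice each check would run on its own (virtual) machine or agent.
        var runs = independentChecks.Select(check => Task.Run(check)).ToList();
        return Task.WhenAll(runs);
    }
}
```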

Smart Retry addresses the many one-off failures that occur with automated tests, especially with external resources or race conditions external to the system under test (SUT). The precise and detailed artifacts from a run of an atomic check allow Smart Retry to decide, based on configuration of the automation system for the SUT, whether a check is automatically re-tried, and on completion, whether a given check result has been duplicated. This approach eliminates check failures that are not actionable, i.e. randomization of team members’ time, and improves the quality of, and trust in, the data that automation is creating to define the quality of the SUT.

Automated Triage depends on the precise and detailed data that the Atomic Check pattern requires of the checks, and on a system that includes data on the scope of the SUT and team members’ areas of responsibility. This pattern represents an important part of the “meta” of MetaAutomation: communications with action items from running automation on the SUT can be automatically directed, in effect, themselves automated. This increases the precision and detail of communications around the team on current quality issues, while eliminating some of the manual work that would otherwise be required.
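A minimal sketch of the routing decision; the step names, owner map, addresses, and artifact attribute names are all hypothetical:

```csharp
using System.Collections.Generic;
using System.Xml.Linq;

// Minimal sketch of Automated Triage: combine the failing step recorded in a check artifact
// with a team-maintained map of areas to owners, and direct the result accordingly.
// Step names, addresses, and artifact attribute names are all hypothetical.
public static class AutomatedTriageSketch
{
    private static readonly Dictionary<string, string> OwnerByStep =
        new Dictionary<string, string>
        {
            { "AddItemToCart", "cart-team@example.com" },
            { "PlaceOrder",    "orders-team@example.com" }
        };

    public static string FindOwner(XElement failedStep)
    {
        var stepName = (string)failedStep.Attribute("name");
        return stepName != null && OwnerByStep.TryGetValue(stepName, out var owner)
            ? owner
            : "qa-triage@example.com";   // fall back to the test/QA role when ownership is unclear
    }
}
```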

Some aspects of what Automated Triage can do for the team are unnecessary, however, if the new approach to quality regression that MetaAutomation offers enables the automated tests to run so much faster that they can gate check-ins to the team code repository. Gated check-ins prevent regressions in quality from impacting the larger team.

When the techniques of MetaAutomation are applied to mainline code that is visible to the whole team, the system automatically characterizes and reproduces quality regressions and therefore greatly accelerates the fixes to these issues, which again minimizes the impact of breakages on the larger team.

Common failure patterns of automation efforts can now be avoided. Automation of bottom-up and end-to-end tests can now be part of the success story of ensuring that the quality of the SUT is always getting better; what works correctly for the SUT is well-understood and well-communicated. Finding bugs is still important, but now bugs take their appropriate place in the overall quality assessment of the SUT.

Performance data is now neatly persisted as well; if recorded, it is now stored as part of the compact, strongly-typed and structured artifacts, as part of running the usual automation. No special testing is needed, and more data on performance is available for analysis and over a longer part of the development cycle for the SUT.
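A minimal sketch of recording timing alongside each step’s result, using hypothetical element and attribute names; failure handling is omitted for brevity:

```csharp
using System;
using System.Diagnostics;
using System.Xml.Linq;

// Minimal sketch: time each check step and persist the measurement on the same element
// that records the step's status, so performance data rides along in the usual artifact.
// Element and attribute names are hypothetical; a step that throws would be recorded as
// a failure in a fuller implementation.
public static class TimedStepSketch
{
    public static XElement RunStep(string name, Action stepBody)
    {
        var timer = Stopwatch.StartNew();
        stepBody();
        timer.Stop();

        return new XElement("Step",
            new XAttribute("name", name),
            new XAttribute("status", "Pass"),
            new XAttribute("elapsedMs", timer.ElapsedMilliseconds));
    }
}
```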

Manual testing is accelerated, made more productive and less boring, because manual testers now have access to detailed data on verifications for core business behaviors, meaning that the manual testers never have to go through steps to verify these details and are instead free to do more exploratory testing and find more bugs.

The power, simplicity and flexibility of MetaAutomation make it into an inevitable part of modern quality measurement for impactful products. For more information, the book on MetaAutomation is available on Amazon.

 

Monday, February 23, 2015

Don't Drive Around in a Horseless Carriage, and Don't Automate Manual Tests!


One of the memes doing the rounds these days is that automated testing isn’t different from manual testing, and in fact, you should simplify by thinking of these two modes of quality measurement in the same way.

That way of thinking is very self-limiting and therefore expensive. I will illustrate this with an analogy.

Back in the 1890’s, this was a stylish way to get around:

[Photo: a woman driving a horse-drawn buggy]
Then, this happened:
 

And, people could ride this:
[Photo: an early gasoline-powered car]
 

Notice some things about this gasoline-powered car:

The rear wheels are bigger than the front.

One steers the thing with a tiller.

It looks a lot like the horse-drawn buggy that woman is driving, doesn’t it?

But, when people say you should think of automated tests just like manual tests, and even say that the word “automation” should be removed as a meaningless identifier, they’re well-intentioned, but the real consequence of what they’re saying is that they want you to get around town in something like this motorcar from a century ago.

Meanwhile, what you really want to be driving is something like this Tesla Model S:

[Photo: a Tesla Model S]

 

In the last century, people have figured out that with modern infrastructure in modern society, it doesn’t make sense to drive around in something that looks like a horse belongs in front of it.

I submit that it’s time to do test automation that’s optimized for automated processes, and not approach it as if it were the same as manual testing.

The Atomic Check pattern, the least dependent pattern in the MetaAutomation pattern language, does exactly this. To learn more, check out other entries in this blog or see my book on MetaAutomation on Amazon.

 

Monday, February 9, 2015

Automation is key, but don't automate manual tests!


A common pattern in software quality these days is to consider automation as if it were the equivalent of manual testing. With this view, the question of “what to automate” is easy; the team proceeds to automate the existing test cases. These were written as manual test cases, and probably run occasionally too, or maybe just as part of developing the test cases. It’s thought that the manual test cases no longer have to be a burden on the manual testers, because once they are automated, the quality measurement that is the subject of those test cases is taken care of. But is it safe to assume this?

 

An important difference is that testers notice stuff; they’re smart and observant. Automated tests, on the other hand, only “notice” what is explicitly coded into them, apart from the application crashing or some confusing implicit verification, e.g., a null reference exception.

 

Manual tests of a GUI or a web page can be automated, but the value of running such a test as automation is very different from running it manually. The automated test might be faster, but it misses all sorts of details that would be very obvious to a human tester, and it is prone to a brittleness that the manual test does not suffer. An experienced team would run the “automated” manual test periodically anyway, to confirm all those qualities that the automation probably doesn’t cover (or, if it did cover them, would make it too brittle to be useful).

 

It doesn’t make sense to automate manual tests. But, quick regression is important to manage the risk around product churn. So, what to automate?

 

A simple but deterministic approach is to verify elements of business behavior, and to verify the intermediate steps only as needed. This way, the automation is faster and more robust, and it is clear about what it verifies and what it does not.

 

This is the first of several important requirements of the Atomic Check pattern, the least dependent pattern of MetaAutomation. There are other wonderful properties of Atomic Check, but this one is summarized in an activity diagram on page 37 of the book on MetaAutomation (and shown below).