MetaAutomation: 2012

Tuesday, November 6, 2012

MetaAutomation Grows Up: New, Refined Definition

I'm preparing for my talk tonight at SeaSPIN meeting in Bothell, WA, with details on the site here:

This is the first time I've presented the material in this medium (a one-hour talk) and in preparing, I have a new and more refined definition:

Metaautomation is a meme of well-known practices and interrelated software technologies, used to communicate, plan, and execute on scaling automation up in strength and effectiveness and integrating software quality more effectively into the SDLC.

First-order metaautomation describes technologies applied at automation runtime.

Second-order metaautomation includes techniques applied to artifacts of one or more automated tests at some time after a test run is complete.

If you're in the neighborhood, be sure to vote, then come on by!

Wednesday, October 3, 2012

Metaautomation and the Death of Test, part 2: the Quality Profession

Part 1 of this post is here http://metaautomation.blogspot.com/2012/10/metaautomation-and-death-of-test-part-1.html

One of the reason you want to keep testers around is that their motivations are very different than the devs. Devs want to check in working code, to earn their magnetic ball toys and the esteem of their peers. Testers want to write highly actionable automation – the topic of this blog – and measure quality so the team can ship the code, but especially to find good bugs, to earn their wooden puzzle toys and the esteem of their peers.

Here’s my post on the Metaautomation meme http://metaautomation.blogspot.com/2012/08/the-metaautomation-meme.html for describing how automation can provide the best value for the SDLC (software development lifecycle).

Automation is just part of the quality picture, but an important one. Many years ago, all STE’s at Microsoft were encouraged to become SDETs – i.e. to learn how to automate the product - because Microsoft recognized the importance of having quickly and accurately repeatable regression of product behavior.

Now, if automating the product – make it do stuff repeatedly! – is all there is, then it’s reasonable to suppose that devs can take a little time out of their normal duties to automate the product. But of course, that takes time – sometimes a surprising amount of time – and they have to maintain the automation as well, or just ignore it when it starts failing, which makes it worse than useless.

The idea that all you have to do it simple automation, with no care towards making failures actionable, is myopic IMO (although, attractive perhaps to the business analyst). I address this in more detail here http://metaautomation.blogspot.com/2011/09/intro-to-metaautomation.html.

This post addresses the importance of testing to the SDLC the SDLC http://metaautomation.blogspot.com/2012/01/how-to-ship-software-faster.html.

This post is about managing for strong, actionable automation and looking forward to second-order metaautomation http://metaautomation.blogspot.com/2012/08/managing-metaautomation.html.

More on second-order metaautomation http://metaautomation.blogspot.com/2011/09/metaautomation-have-we-seen-this.html.

Not all of these techniques are completely new. Some are practiced in corners of Microsoft, and (I’m told) Amazon. The metaautomation meme just makes it easier to describe and communicate how far the team can (and in some cases, should) go to make the quality process more powerful.

Metaautomation is the part of the test profession that is expressed in automation.

Are there other labels that people use to describe what I call first- and second-order metaautomation? Please comment below and I will respond.

Metaautomation and the Death of Test, part 1: No Actually, you need Test

There’s a meme going around, mostly out of Google it seems, that “Test is Dead.”

The prototypical example of this is gmail. The special SDLC (software development lifecycle) of gmail, for purposes of this meme, goes like this: devs write some feature. Devs do unit testing and some E2E testing, check in, and move on. The code is deployed to live on some (not all) web servers on the farm. End-users notice issues, end users have nothing else to do so they use the “Report a bug” control on the page to send a bug back to Google, Google receives the bug and the report is well-written with sufficient detail, but not too much, so the bug can be prioritized and potentially fixed. Tada! Testing has been outsourced to customers.

… except that the conditions that must be true for such a scenario to work tightly limit the applicability of this technique. See for example this link, which discusses the security implications of this approach: http://metaautomation.blogspot.com/2011/10/no-you-cant-outsource-quality-detour.html. The end-users must know exactly what to expect from whatever product it is, and they’re not going to read a manual or requirements spec, so the functionality must be a reworking of some well-known idea, say, an in-browser email client or an online game of checkers. No automation is available, so regressions might go undetected for a while and be more expensive to fix than otherwise, and fixing the regression might even break the feature set with code changes that causes the regression in the first place. Clearly, this technique is much too risky for important or mission-critical data e.g. financial or medical data.

But, there’s one idea here that does work and is worth elaborating: devs are accountable to do some degree of E2E testing.

Why is E2E testing important for devs? Can’t they just toss code over the wall, after unit tests pass, and let testers deal with it? After all, that’s their job… but testers have better things to do, which is the topic of part 2 http://metaautomation.blogspot.com/2012/10/metaautomation-and-death-of-test-part-2.html

Imagine if dev implements a feature, sees that unit tests pass, thinks “good enough” and checks it in. Assume that E2E tests are not written for this feature, because hey, it’s a brand-new feature. Build of the product in the repository succeeds. (Assume the team is not doing branch development.) Whoever tests that build first finds some issues, and writes bugs, and puts them on the developer’s plate. The dev eventually sees the bugs, theatrically slaps his/her own forehead, repros the bug and with minimal research, fixes it. If the bug isn’t tended to for a week, this is even more expensive because the code associated with the bug might not be so familiar to the dev, so it would require more research to fix the issue.

It would be MUCH better if the dev tested out the feature first with some E2E scenarios, before the checkin, or have the tester take the changeset (using Visual Studio’s TFS, this is a “shelf set”) and do some testing of the changes, to find the issues before checkin. Why better? Because a) the fix will be quicker to do b) no need to enter bugs for the record, and c) nobody need be hindered by the broken functionality of the issues, because they’re never checked in. Oh, and d) triage doesn’t have to look at the bugs, because there aren’t any reported bugs.

Another useful way to address this is to check in tests for the feature at the same time that the feature is checked in, which means that whoever wrote the E2E tests (probably a tester) combines that changeset with the product feature change. This can save a lot of churn, and the symmetry of checkin in the combined feature and quality tests looks simple and robust. The problem is if the feature automation is not ready when the feature is, and checkin of the feature would be held back. That might slow the dev down and for a complex product, there are likely other stakeholders (dev, test, and PMs) waiting on the changes, so the cost of waiting must be compared to the value of doing a unified dev + test checkin.

Therefore, the dev should be expected by team convention to do some amount of E2E testing of a new feature. How much?

For simplicity of argument, assume for the moment that nobody from Test is involved before checkin.

Too little testing on the dev’s part, and the dev won’t find his/her obvious, blocking bugs. (“Blocking” means that functionality is broken and breaks a scenario or two around the new feature, so some testing and other use of the product is blocked.) Too much, and the feature is delayed, along with other features and other work that depends on the feature.

I asked this question – how much testing by the devs? – of James Whittaker, when he gave a talk last month at SASQAG in Bellevue, Washington.

(Inline references: James’ blog is here http://blogs.msdn.com/b/jw_on_tech/. SASQAG is here http://www.sasqag.org/. )

James’ answer was that it depends on the dev’s reputation for quality. Fair enough, but I’d prefer to start out with formal, uniform expectations and relax them for individuals as they earn the team’s trust:

First, have the test team define repeatable E2E test cases for the new feature being implemented. These test cases are to be used through the SDLC and beyond, so might as well write them earlier in the cycle than they normally are. Give the test cases sufficient detail that anybody who knows the product can run them, and precise enough that distinct bugs are always correlated with different test cases.

Then, have the devs execute the test cases when they think the feature is ready. If the feature is non-GUI (e.g. an SDK API) then maybe the E2E test can be implemented easily too, and the test run that way, before checkin and then afterwards for regression. If it’s a GUI feature e.g. in a web browser, probably the feature can’t be automated before implementation is complete.

I recommend a minimum of two happy-path test cases, one edge case if applicable, and one negative case. It’s expected at project outset that a) a tester writes the test cases before the feature is implemented b) the dev (and maybe the tester too) runs the test cases before checkin.

This will save the whole team a lot of time, but especially the testers… for the good of the product, they should be extremely busy anyway, which is the topic of part 2 of this post. http://metaautomation.blogspot.com/2012/10/metaautomation-and-death-of-test-part-2.html

Thursday, August 30, 2012

The MetaAutomation Meme

The word “Meme” was coined by British evolutionary biologist Richard Dawkins to describe the spread of ideas and cultural phenomena, including cultural patterns and technologies.

Metaautomation describes a set of techniques and technologies that enable a view of software quality that is both deeper and broader than is possible with traditional software automation alone, and given sufficient investment, this can be taken further to do smart automated test retries and even automated triage and pattern detection that wouldn’t be possible with traditional techniques.

For the more advanced metaautomation concepts, the investment and risk are greater, and the potential reward in terms of team productivity are much greater. So, I’m dividing the meme into two parts:

· First-order metaautomation: making test failures actionable, and minimizing the chances that a debugging session is necessary to find out what happened

· Second-order metaautomation: creating some degree of automated triage, automated failure resolution, and automated smart test retry

Metaautomation is an innovation analogous to the simple structural arch: before arches, the span and strength of bridges was limited by tensile strength (resistance to bending) of the spanning material. A famous example of this technology is North Bridge in Concord, Massachusetts.

http://en.wikipedia.org/wiki/Old_North_Bridge

But with arches, the span and strength is limited by the compressive strength of the material used. This works well with another common building material – stone - so the technology allows much more impressive and long-lasting results, for example, the Alcantara Bridge in Spain.

http://en.wikipedia.org/wiki/Alc%C3%A1ntara_Bridge

The techniques of metaautomation did not originate with me, but in defining the term and establishing a meme for the pattern, I hope to make such techniques more understandable and easy to communicate, easier to cost and express benefits for the software process, and therefore more common.

The first order of metaautomation will become very commonly used as the value is more widely understood. The second order of metaautomation is good for large, distributed and long-lived projects, or where data has high impact e.g. health care or aviation systems.

Wednesday, August 29, 2012

Managing MetaAutomation

“If you can’t measure it, you can’t manage it.”

This quote has been attributed to Peter Drucker, Andy Grove, Robert Kaplan, and who knows who else. Oh, and me. I said it, so put me down on the list too.

The common measurement of automation is the number of test cases automated. Since what management measures is what management gets, one result of this practice can be an antipattern:

http://metaautomation.blogspot.com/2011/10/patterns-and-antipatterns-in-automation.html

a product scenario is exercised, probably to completion, but confidence about that completion can be elusive, and in case of any kind of failure, a very significant investment is required of the test developers to follow up and resolve the failure to an action item – which can cause team members to procrastinate on resolving the failure because that’s not what’s being measured, and the behaviors addressed by the failing automated tests get ignored for a time, which in turn causes project risk because the product quality measurement provided by test automation is disrupted.

How does one encourage the correct behaviors to get robust automation with strong, scalable value towards measuring and regressing product quality – and positively measure the team members’ behaviors, too? I’m talking about metaautomation, of course, and how to encourage progress towards metaautomation in output from the team. Here are some thoughts on useful performance metrics towards that end.

Some goals for your team:

· advance the effectiveness of test automation to achieve quick and effective regression detection

· achieve quicker and more accurate triage to keep needless work off people’s plates

· reduce wasted time for everybody on poorly-defined failures

(that is first order metaautomation, the topic of a future post)

… and beyond that, where a deeper investment in quality is warranted, look forward to

· smart automated test retry

· some degree of automated triage

(this is the second order of metaautomation, to be covered in more detailed also in a future post)

I think improving team spirit and cohesion, and improve technical learning in your individual contributors, can be achieved at the same time. In order to get there, measurement of performance in these areas must be combined with other management metrics used for assessing individual performance.

Metaautomation-friendly practices accelerate the test automation rate during the automation project as classes, coding patterns and other structures are put into place. For example: Given two projects, one doing simple minimal automation (call it project A) and the other doing metaautomation to the first order (project B), project A will start out faster but will suffer over time from failed tests that are either neglected, causing blind spots in software quality, or failed tests that take significant investment to get them working again. Project B will eventually overtake project A in rate of successfully running automation, and probably eventually in raw numbers of tests automated. In project B, the quality value of running tests is much greater because the test failures won’t be perceived by the team as time-sucking noise. I covered this topic pretty well in previous posts. All team members need to understand this foundational concept.

So, how do we make metaautomation qualities (in performance of test team members) measurable at test automation time?

First, you can bring the team up to agreed-on code standards. Most projects have preexisting code, so defining the implementing the standards is probably going to be an iterative process.

This can also be a team-strengthening collaborative process. For a large project, have everybody read existing code standards (if they exist) and propose additions or changes - offline to save time. Minimally, everyone will learn the code standards, but much better, they have some ownership in improving the standards, through an email thread or wiki. This shouldn’t take a lot of time, and is a great opportunity for team members to learn team practices and show their ability to contribute to the team while learning how to write more effective, readable, maintainable, metaautomation-friendly code themselves. In Test, this allows them to feel more ownership than testers normally have AND emphasizes team contribution and learning.

Peer code reviews are an even better opportunity for team members to communicate, learn from and influence each other with respect to these coding practices and standards. Just as it’s important for testers to learn the whole project, they benefit from learning the whole team as well, and I advocate that everybody get chances to review others’ work as an optional or required reviewer. This is another opportunity to bring out team players, bring the team together, and give introverts opportunities to reach out with two-way communication and learning. Testers should be encouraged to push for testability in the product code, and qualities of metaautomation – per the earlier team agreement – in test code. Suggestions must be followed up on, not necessarily in the code itself, but it’s important for everybody on the team to recognize that they are all learning and teaching at the same time. No cowboy code allowed!

For example: in the case of discussing a topic for which developer Foo is much more knowledgeable than developer Bar, developer Foo is expected to provide some educational assist to Bar, e.g. a link and some context. Foo and Bar will both benefit from a respectful transfer of information: Foo from the greater understanding that comes through the teaching process (however minimal), Bar form the learning, and both of them from team cohesion.

See what testers can come up with for techniques to improve visibility into the root cause of any one failure – i.e. if a test fails due to some specific failure, is it easy to find root cause of the failure by inspecting output – the artifacts of the failed test case run?

Encouraging everybody to communicate with each other in terms of the code will accelerate learning and improvement all around, and if done right, will improve team cohesion as well. It will also bring out the value of the individual contributors as team players, and since team members will all figure out that this is one thing that management is noticing, they’ll do their best to help each other out and not default to isolation.

I think this is a great opportunity for positive reinforcement from the test lead or manager; not singling out an individual for praise, which can have negative effects on morale, but rather noting and raising the visibility of ways in which the team can achieve things through teamwork, that none of the individuals on the team could achieve. Positive reinforcement is appropriate here because the encouraged behaviors are associated with learning, collaboration, and innovation.

Here are summary steps to strengthen your team using principles of metaautomation:

1. Establish that the pro-metaautomation behaviors described here are expected

2. Encourage and give positive reinforcement at a team level

3. Make measurements of contributions and integrate these measurements with other metrics and expectations used in evaluating performance

Using these as a guide, you can make metaautomation manageable, and lead your team to new strengths in promoting a quality software product.

Thursday, July 19, 2012

The Legacy of Stephen R. Covey: testing with Character, not Personality

Stephen R. Covey, author of the hugely successful book "The 7 Habits of Highly Effective People," died July 16th 2012.

I picked up his book because I realized that I need to nurture my own leadership skills in order to promote product quality at the next level. Working as an individual contributor doesn't cut it anymore; I need to influence the big picture, not patch up automation projects that are already established according to the patterns of software quality as it's generally practiced today.

Serendipity! There it is, in the first full chapter of his book. Covey pithily describes his journey of discovery through American self-help literature as a tension between two approaches toward being personally effective: The "Character Ethic" and the "Personality Ethic."

The Character Ethic is deep and foundational, but not always obvious on the surface behaviors of a person. The Personality Ethic displays on the surface, but doesn’t necessarily have any depth.

Covey dismisses the Personality Ethic approach to increasing personal effectiveness as beneficial in some settings, but weak in the long term and perhaps even damaging, whereas the Character Ethic flows from a person's basic values and practices.

He describes his own paradigm shift, triggered by a parenting challenge with one of his sons. Covey used the personality ethic as taught in his day to guide his parenting style, but he transitioned to the character ethic (as described by Benjamin Franklin) with good effect. The Character Ethic requires taking responsibility, and Covey shows that his paradigm shift is complete when he takes responsibility for his actions re. his son.

I've seen this happen at many software teams: people do software quality by emphasizing a basic automation of the product. The result is N test cases automated, and M of them pass. What happens with the N-M test cases automated that do not pass is not considered important, except that the team recognizes the need to get these test cases running again at some point. The highest priority - I've seen this priority be placed even higher than fixing "broken" test cases - is automating more test cases. The count of test cases automated is a superficial measure of success and productivity in measuring the quality of the product, but it's what they are measured on. Management needs a metric, and this is the best or most usable one they know of.

Covey's shift from personality ethic to character ethic is a pretty good analogy to the shift away from the automate-as-many-test-cases-as-you-can focus, towards the deep automation that really tests the product, and attempts to maximize the actionability of test failures. (The first approach might use a minimal positive-result verification like this: the automated test didn't throw an exception, therefore it's good.)

Focusing on automated test count– that is, the traditional way of automating the product - has value in the near term because it really does make the product do stuff and you can find a lot of product bugs through the process of creating automation. But, in the longer term with this approach, the inevitable failures of these tests are costly to follow up on and often the team gets an inaccurate picture of how completely the quality of the product has been measured because the automated test might hide failures or even be testing the wrong thing. I've even seen a very expensive 3^rd-party test harness report success on a test, when on detailed follow-up, I found that the test didn't do anything meaningful.

Test automation with attention to metaautomation takes some up-front test harness design and more careful test automation with attention to patterns of failure reporting, but in the longer run a better and more complete measurement of product quality happens; test failures due to product issues are more actionable and are likely to be followed up on quickly. Trust in test automation code is higher, which enables the team to be more productive. Most importantly: failures due to regressions in the product are fixed more quickly, which keeps quality moving forward and reduces risk associated with product changes.

A test lead might ask these questions of someone automating a test: In how many different ways might this automated test fail, and what happens when it does fail?

Covey's character ethic, applied to software quality, makes for stronger quality and stronger product character.

Here is a related post on the need for test code quality: http://metaautomation.blogspot.com/2011/09/if-product-quality-is-important-test.html

Wednesday, February 1, 2012

The Risk of Agile, and How to Mitigate

Agile software development process is all the rage these days, and for good reason: by managing the complexity inherent in software projects, minimizing WIP (work in progress), and a bunch of other great values outlined here

http://www.agilemanifesto.org

I’d like to address the principles individually in a following post, but for now, I’ve experienced some real risks that follow directly from the agile process as commonly conceived and realized. I will address some of those risks here.

I’ve seen this in multiple workplaces: the project has been divided into close-knit teams, and the teams work in sprints of e.g. two weeks. The sprint goals and costs are agreed on at the beginning of each sprint. Each team member takes on the sprint goals as they have bandwidth, and at the end of the sprint, the team does some simple demo to show the team’s work.

The general problem is, although these teams are ultimately highly interdependent - they all fit together to make the product! – they each end up developing local solutions to common problems.

Class types, XSD schemas, logging solutions, configuration files – either these are ultimately shared or they should be. If the former, software development risk is pushed back to the end of the cycle when everybody has to figure out how to integrate and work together. If the latter, there’s duplicated work with resulting cost and risk. (Such risk could show up late in the cycle: “I thought you added the streaming log option!” “I did, but over here. You guys developed your own logger, so you’re on your own.”)

The solution: Share resources early. Sorry agilites, this requires a bit of planning. Decide early on what the shared resources are, and be aware of when the need might arise for a resource that might be shared. Put shared resources in a common location…

I’ve seen cases where this is not done, and it’s very expensive. Agile dislikes planning, but without a plan, for a project that involves engineers in the double-digits, people eventually don’t know what’s going on they must guess, and often guess incorrectly. Confused and frustrated engineers can create an amusing setting (for those with a sense of humor about it) but it doesn’t ship quality software on schedule.

And remember: Duplication on information or resources is anathema to quality!

http://metaautomation.blogspot.com/2011/10/duplication-of-information-is-anathema.html

Tuesday, January 31, 2012

How to Ship Software Faster

Software is risky because it’s so difficult. For those outside the business, it’s hard to comprehend how difficult it is – computers are our information slaves, right? Tell them what to do, and they do it. Yes, and although the tools are getting better with time, computers are profoundly stupid. They will very rapidly, decisively, and often irreversibly do the dumb thing if you don’t very carefully tell them not to, and this in turn takes immense attention to detail.

It takes human intelligence to make computers solve a business problem, and humans make mistakes: they overlook stuff, make typos, make assumptions that aren’t always correct, and run into trouble communicating to others.

If a “bug” is an actionable item about the quality of the software, then people who tell computers what to do create bugs constantly, mostly un-characterized and un-measured bugs that will likely haunt the software project at some point in the future.

Teams need testers on the product early on: people who measure the quality of the product in a repeatable way and can report on issues found, and people who focus on the quality of the product rather than on getting the product to do stuff.

Software development is so difficult, it’s become a highly specialized and focused skill. Developers are focused on getting the product to work and meeting deadlines, because they have to be. Asking devs to focus on quality is like asking a race car driver to drive without a riding mechanic in the 1911 Gran Prix auto race. (Keeping a race car of that era running well took a lot of attention to the fuel mix etc.)

If you’re writing a phone game or a perpetual “beta” app, you can do like Google and outsource quality to the end-users, but if you’re doing something other than selling advertising, that won’t work. See this post: http://metaautomation.blogspot.com/2011/10/no-you-cant-outsource-quality-detour.html.

The quality of what a company ships is very important to a company’s reputation and goodwill. Just guessing from the typical software product lifetime, this is good for 5-10 years, meaning that quality issues have very significant impact on a business.

The solution is simple. To manage risk and protect company value, it’s a good investment to have software testers on board. Manual testers are important but not sufficient by themselves due to time and human error; a project needs regular, frequent, and highly repeatable measures of quality with highly actionable artifacts… you guessed it, once again I’m linking back to here to accentuate the increased value that non-GUI automation can bring to a company. http://metaautomation.blogspot.com/2011/10/automate-business-logic-first.html.

If the goal is to ship quality software on-time, a good software quality person helps by providing measures of quality and risk for components of the product and the product as a whole, minimizing schedule risk by finding quality issues early or heading them off with good bugs, and creating better data on the work backlog e.g. necessary refactorings and bug fixes. Smart testers can find business-logic issues, characterize them, and even suggest fixes. This saves the devs time and help them stay focused on the larger picture of implementing the product, which in turn helps ship on time.

Software quality has tremendous value to an organization. Companies can’t afford to short themselves here.