Friday, May 3, 2013

The Software Quality Process, part 3 of 3: Quality Characterization, Bugs, and When to Ship


The software business requires shipping at some point, and your team and business probably started out with a ship date or target for the product.

Here’s where quality considerations become really important: you do NOT want your customers to discover serious issues with your software before you do. If they do, it could be very damaging to your business. Depending on your priorities, you might consider delaying ship for adequate testing and QA... of course, if you read and can follow my previous posts, you won’t need to delay :)

All software ships with unfixed bugs. (Show me software that has no bugs, and I’ll show you software that hasn’t been tested enough.) You can’t be 100% certain that end-users (or blackhats) won’t find serious issues with your software that would cause you regret, but you can do some things to minimize your risk:

Ensure that you hire good people for Test early in the product cycle, and give them the opportunity to do their best work.

Have Test present at requirement and design meetings from product start. Their role is to make sure that the product is testable, and to minimize the many risks of building and shipping a product.

Make sure that the Test Plan is addressed completely, and updated as needed along the way.

When development is complete, all significant bugs have been addressed, and you’re approaching ship time, take a week or three to exercise the product thoroughly. Make sure that all product issues that might need fixing, or that impact product quality from any perspective, are addressed with bugs, and that the bugs get triaged. Probably, most or all bugs will be deferred to a patch or service pack, but the important thing is that you have confidence that there aren’t serious issues that might impact customers but that are unknown to the team. Go through the test plan and make sure that all areas of product quality have been measured, as completely as you can in a timeframe that’s reasonable for your business.

… if after that, there are no bad surprises, it’s ship time!

Links to previous installments of this short series:
http://metaautomation.blogspot.com/2013/04/the-software-quality-process-part-1-of.html 
http://metaautomation.blogspot.com/2013/05/the-software-quality-process-part-2-of.html

 

Wednesday, May 1, 2013

The Software Quality Process, part 2 of 3: Triaging bugs, and bug flow


Think of bugs as atoms of actionable communication that flow around the group with their messages about product quality. They speak only of the product, the customer, engineering and design details.

Triage is about prioritizing the bugs and assigning them to people as needed to ensure that the bugs keep flowing. Triage can be done on all bugs that haven’t been triaged yet, or all bugs assigned to certain people, or all new and active bugs that aren’t assigned to anybody.

A leader in the group, or a triage meeting with enough knowledge of the product, the engineering issues around it, end-users, any intermediate customers, and other context, edits the severity (how important is the issue?) and priority (which issues should be addressed first?) as needed. Bugs get assigned to developers for fixing, to program managers for design considerations, or back to testers for more information; they can also be resolved as “won’t fix” or postponed as needed.

Test and dev need to work more closely when Test finds blocking bugs that prevent major scenarios from working, or regression bugs that block scenarios that were working before.

Test is always responsible for removing bugs from the flow by verifying and closing them. In some groups, testers are the only ones that can create bugs, but it’s generally OK for anybody to create bugs as long as they’re good bugs.

See part 1, below, for what makes a good bug.

Tuesday, April 30, 2013

The Software Quality Process, part 1 of 3: Creating docs and bugs


This is post 1 of a series on software QA and Test, seen from the process perspective. Links to parts 2 and 3 will be added here as I finish writing and posting those parts. I use the term “Test” with a capital T to mean the test and QA org, the person or people responsible for measuring and communicating quality.

Software is about telling computers what to do with information. The scope of these posts is about the pure information part of that, so I’m skipping over hardware-related issues, but methods described here could be applied to hardware + software systems as well e.g. the mobile-device business.

Early in the software development life cycle (SDLC) Test needs to be involved, to listen and learn, but also to influence. Important questions for Test to address include: is the product testable? Can the team arrive at a good-enough measure of quality soon enough to ship? Where are the biggest risks? Where would the customer pain points likely be, and can these be mitigated with design decisions that are made early in the SDLC?

One product of these meetings is the Test Plan. The Test Plan needs to include either a link or links to a readable, maintainable, high-level view of the product (probably graphical, for efficiency), or a high-level product description itself – but not both! The goal here is to have enough information that a new team member can quickly figure out what’s going on without being a burden, and something that people can quickly refer to, while minimizing duplication of information, being as agile as makes sense for the product space, and not spending too much time documenting or making pretty diagrams.

The Test Plan should continue with a strategy for characterizing the quality of the product with sufficient confidence that it becomes OK to ship. There’s much more to this than just finding bugs: the Test Plan must address ALL aspects of the product that can affect quality in any way, including security issues, integration, installation, deployment, scale, stress, usability, discoverability, and so on. More on this “characterizing quality” theme in part 3 of this series.

The Test Plan should contain scenarios to describe the product from the perspective of the end-user, or the client business, or services on which it depends, etc. It could contain test cases, too, but a more agile approach is to have the test cases exist as self-documenting entities in the source code. Modern build systems can generate simple docs from the code that are accessible to decision makers on the team even if they don’t have (or choose to build for themselves) access to the actual Test source code.
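As a sketch of what self-documenting test cases might look like (the test names, docstrings, and the doc generator here are all hypothetical, not from any particular build system): each test carries its own scenario description, and a simple generator publishes a summary readable by decision makers without access to the Test source.

```python
import inspect

# Hypothetical test cases: each test documents itself in its docstring,
# so a doc generator can publish a summary without exposing the source.
def test_login_happy_path():
    """Scenario: end-user signs in with valid credentials and lands on the home page."""
    assert True  # placeholder for the real test steps

def test_login_bad_password():
    """Scenario: end-user signs in with a wrong password and sees a clear error."""
    assert True  # placeholder for the real test steps

def generate_docs(module_globals):
    """Collect the docstring of every test_* function into a readable report."""
    lines = []
    for name, obj in sorted(module_globals.items()):
        if name.startswith("test_") and inspect.isfunction(obj):
            lines.append(f"{name}: {inspect.getdoc(obj)}")
    return "\n".join(lines)

print(generate_docs(globals()))
```

A real build system would do something equivalent over the compiled test assembly or source tree; the point is that the docs are generated, never hand-maintained.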

The Test Plan is generally the first important product from Test for communicating around the software team what Test is up to, and provides a framework for the characterization of the product quality I’ll address in part 3. The rest of this post is about bugs…

Bugs are created as necessary to communicate quality issues around the team. Here are some qualities of good bugs:

·         The title is succinct, descriptive, and includes keywords (to be searchable)
·         The bug is atomic, so it addresses just one fix in one code change or change set
·         The bug is clear and has enough detail for the intended audiences: primarily program managers and developers, but also other people in Test and executives
·         The bug has screenshots if that helps at all with making the bug understandable, e.g.
o   A screenshot of a GUI as seen by the product end-user
o   A screenshot of a section of an XML document, formatted and colorized per the IDE used by the developer team, if the bug relates to that XML
o   A screenshot of some specific code as seen in the IDE
·         Links to related or dependent bugs, or bugs on which this bug depends
·         Links to test cases and/or test case failures, if those are maintained in the same database
·         Links to documents

That could potentially add up to a lot of work from Test, and very big and detailed bugs. Watch out for too much detail, though; one of the risks of creating these bugs is that a bug could be too specific, when the problem it reports is part of a larger problem. Addressing the bug as a sub-problem while the bigger issue goes unrecognized risks losing track of the issue, which could create rework and/or product quality risk.

Bugs are work items that have state, e.g.

·         New
·         Active
·         In progress
·         Fixed
·         Postponed
·         Duplicate
·         Won’t fix
·         Closed

Bugs “bounce” around the group to be handled by team members asynchronously, i.e. when the time is most efficient for them. They also create a searchable record of product quality issues and of which areas of the product Test has been working on, and they inform the decision of when to ship the product.
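The lifecycle above can be modeled as a small state machine. A minimal sketch (the allowed transitions here are one plausible convention, not a prescription for any particular bug-tracking tool):

```python
from enum import Enum

class BugState(Enum):
    NEW = "New"
    ACTIVE = "Active"
    IN_PROGRESS = "In progress"
    FIXED = "Fixed"
    POSTPONED = "Postponed"
    DUPLICATE = "Duplicate"
    WONT_FIX = "Won't fix"
    CLOSED = "Closed"

# One plausible set of allowed transitions; real teams tune this to taste.
ALLOWED = {
    BugState.NEW: {BugState.ACTIVE, BugState.DUPLICATE, BugState.WONT_FIX},
    BugState.ACTIVE: {BugState.IN_PROGRESS, BugState.POSTPONED, BugState.WONT_FIX},
    BugState.IN_PROGRESS: {BugState.FIXED, BugState.POSTPONED},
    BugState.FIXED: {BugState.CLOSED, BugState.ACTIVE},  # Test verifies, or reactivates
    BugState.POSTPONED: {BugState.ACTIVE},
    BugState.DUPLICATE: {BugState.CLOSED},
    BugState.WONT_FIX: {BugState.CLOSED},
    BugState.CLOSED: set(),  # closed bugs leave the flow
}

def can_move(frm: BugState, to: BugState) -> bool:
    """True if the convention above permits moving a bug from frm to to."""
    return to in ALLOWED[frm]
```

Note that only Test closes bugs in this sketch: Fixed goes to Closed (or back to Active) only after verification, matching the flow described in part 2.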

The product will ship with unfixed bugs! (If not, it hasn’t been tested, and it probably shouldn’t ship at all.) This will be addressed in the following posts.

There are two more posts to this series:

The Software Quality Process, part 2 of 3: Triaging bugs, and bug flow
http://metaautomation.blogspot.com/2013/05/the-software-quality-process-part-2-of.html
The Software Quality Process, part 3 of 3: Quality Characterization, Bugs, and When to Ship
 http://metaautomation.blogspot.com/2013/05/the-software-quality-process-part-3-of.html
 

Wednesday, April 10, 2013

For Your Quality Customers, Add Value with Every Change


Who are your customers?

Of course, you’re developing software for the end user. The other members of your team are the first customers though.

I’ve written about many techniques of advanced software quality that can reduce risk and strengthen your quality story. This is how the Test team can best add value and make the Devs more productive, meaning that you can ship faster and with lower risk!

To inspire trust and reduce risk to the team, every change set that goes into the product must add quality value, that is, information about the quality of the product.

The problem is, very few software projects are starting anew. Most have some quality infrastructure, maybe some copy/pasted scripts, maybe a set of test cases that are run manually. The team members are customers of this existing infrastructure.

So, existing quality assets such as these must be maintained or replaced. At every change to documents or code, value is added, never taken away.

This is important for the same reason that failures in test code, or failures that are perceived as being due to the test code, must be fixed ASAP: the quality knowledge of the product must always advance and improve. If it does not advance, due to dropped coverage from “old” test infrastructure or tests that fail so often they’re perceived as not worth fixing, then parts of the product are not tested anymore, and knowledge and stability of the product are lost. This kind of project rot must be avoided.

Every change and every addition to product quality infrastructure, no matter how sophisticated, agile, actionable, self-reporting etc. must add to existing knowledge of the product.
This makes a strong and productive team: mutual respect and attention to keeping the quality moving forward.

Tuesday, April 9, 2013

An Organization and Structure for Data-Driven Testing


This post follows up on the one from yesterday:


So, data-driven testing is the way to go for a huge return on finding and regressing product issues and measuring the whole quality picture. How to start?

I like XML files to do this. Here are some reasons:

1.       Your favorite text editor will work for reviewing, editing, and extending your test set.
2.       If given values for a test are optional and you provide defaults as needed, the XML can be “sparse” and even easier to read and edit. The data that drives the test also expresses the focus and reason for the test, in the data itself!
3.       You can be as constrained or as loose with the data schema (the layout of the XML data) as you want.
4.       Extending your data engine can be as simple as allowing and parsing different values. For example, for testing with objects that include pointers or references, you can put “null” as a value in your XML and have your engine parse and use that for the test, in the context as defined in the XML.
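A minimal sketch of such an engine (the schema, attribute names, and defaults are invented for illustration): it fills in a default for a missing attribute, and maps the literal string “null” to a real null value for the code under test.

```python
import xml.etree.ElementTree as ET

# Invented, illustrative schema: each <case> element drives one test.
TEST_DATA = """
<tests>
  <case name="default-timeout" input="hello"/>
  <case name="explicit-timeout" input="hi" timeout="5"/>
  <case name="null-input" input="null" timeout="1"/>
</tests>
"""

def load_cases(xml_text, default_timeout=30):
    """Yield one dict of test parameters per <case>, filling sparse XML with defaults."""
    for case in ET.fromstring(xml_text).iter("case"):
        raw_input = case.get("input")
        yield {
            "name": case.get("name"),
            # the literal string "null" means: pass None to the code under test
            "input": None if raw_input == "null" else raw_input,
            # missing timeout attribute falls back to the engine default
            "timeout": int(case.get("timeout", default_timeout)),
        }

cases = list(load_cases(TEST_DATA))
```

The first case is “sparse” (no timeout attribute), so the reason for the test, the default-timeout behavior, is visible in the data itself.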

There are many engines that help with data-driven tests, or with some time and skill, you can write your own.

To make the tests more readable and extensible, use different XML files to drive different kinds of tests – e.g. positive vs. negative tests, scenario A vs. scenario B, vs. scenario C. With appropriate labels, comments, error messages and bug numbers inline with the data for the individual test, all your tests can be self-documenting and even self-reporting, freeing you from maintaining documents with details about the tests and removing that source of errors and potential conflicts.

A relational database is a more powerful way of handling large amounts of structured data. This would be a better choice for example if you were doing fuzz testing by generating large numbers of tests, according to your randomization scheme, and then saving to and executing from a SQL database. Even with fuzz testing, it’s very important that tests be as repeatable as possible!
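One common way to keep generated fuzz tests repeatable (a general technique, not specific to any database or fuzzing framework) is to derive every generated case from a recorded seed, so the same seed always reproduces the same inputs:

```python
import random
import string

def generate_fuzz_cases(seed: int, count: int, max_len: int = 20):
    """Generate `count` pseudo-random string inputs, fully determined by `seed`.

    Store the seed alongside the results (e.g. in a database row), and any
    failure can be reproduced exactly by rerunning with the same seed.
    """
    rng = random.Random(seed)  # private RNG: independent of global random state
    cases = []
    for _ in range(count):
        length = rng.randint(0, max_len)
        cases.append("".join(rng.choice(string.printable) for _ in range(length)))
    return cases

run_a = generate_fuzz_cases(seed=12345, count=10)
run_b = generate_fuzz_cases(seed=12345, count=10)
```

Two runs with the same seed produce byte-identical test cases, which is what makes a fuzz failure regressable rather than a one-off mystery.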

 

Monday, April 8, 2013

The Power of Data-Driven Testing


This post assumes a focus on integration and end-to-end testing of the less-dependent parts of a product, where the greatest quality risks are found: in the business logic, data or cloud layers. See this post for a discussion of why this is most effective for a product that has important information: http://metaautomation.blogspot.com/2011/10/automate-business-logic-first.html

Automated testing usually involves some inline code in a class method. A common pattern is to copy and paste code, or create test libs with some shared operations and call the libs from the test method. The tests correspond to the methods 1:1, so 50 automated tests look like 50 methods on a class with minor hard-coded variations between repeated patterns in code.

For repeated patterns like this, there’s a much better way: data-driven testing.

Data-driven tests use a data source to drive the tests. Within the limits of a pattern of testing as defined by the capabilities of the system reading the data to drive the test, each set of data for the pattern drives an individual test. The set of data for each test could be a row in a relational database table or view, or an XML element of a certain type in an XML document or file.
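To make the pattern concrete, here is a minimal sketch (the `clamp` function under test and the rows are invented for illustration): one test pattern, driven by a table of data rows, instead of fifty near-identical methods.

```python
# Hypothetical function under test
def clamp(value, low, high):
    """Clamp value into the inclusive range [low, high]."""
    return max(low, min(high, value))

# Each row drives one test: the inputs plus the expected result.
ROWS = [
    {"value": 5,  "low": 0, "high": 10, "expected": 5},   # in range
    {"value": -3, "low": 0, "high": 10, "expected": 0},   # below the boundary
    {"value": 42, "low": 0, "high": 10, "expected": 10},  # above the boundary
    {"value": 0,  "low": 0, "high": 10, "expected": 0},   # on the boundary
]

def run_table(rows):
    """Run every row through the one test pattern; return any failures."""
    failures = []
    for row in rows:
        actual = clamp(row["value"], row["low"], row["high"])
        if actual != row["expected"]:
            failures.append((row, actual))
    return failures
```

Adding a test is adding a row, not writing a method, which is exactly the agility and readability argued for below.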

Why is this better?

For one, agility. The test set can be modified to fit product changes with changes in the test-driving data, at very low risk. It can also be extended as far as you want, within limits described by how the data is read.

Helping the agility comes readability, meaning that it’s easy for anyone to see what is tested and what is not for a given test set. It’s easy to verify that the equivalence classes you want covered are represented for a given set, or the pairwise sets are there, boundaries are checked with positive and negative tests, etc. for a given test set.

To help readability, you can put readable terms into your test-driving data. Containers can have “null” or an integer count or something else. Enumerated types can be a label used in the type, say “Green,” “Red” or “Blue”, or the integer -1 or 4 for negative limit tests.

Best of all, failure of a specific test can be tracked with a bug number or a note, for example, “Fernando is following up on whether this behavior is by-design” or “Bug 12345” or a direct link to the bug as viewed in a browser. When a test with a failure note like this fails, the test artifacts will include a note, bug number, link or other vector that can significantly speed triage and resolution.
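A sketch of how a per-case note can surface in the failure artifact (the cases, the `note` field, and “Bug 12345” are placeholders): when a case fails, its note rides along in the report, ready to speed triage.

```python
def run_cases(cases, fn):
    """Run each case through fn; on failure, include the case's note in the report."""
    report = []
    for case in cases:
        actual = fn(case["input"])
        if actual != case["expected"]:
            note = case.get("note", "no note on file")
            report.append(f"FAIL {case['input']!r}: got {actual!r}, "
                          f"expected {case['expected']!r} ({note})")
    return report

# Hypothetical test data; "Bug 12345" is a placeholder bug number.
cases = [
    {"input": "abc", "expected": 3},
    {"input": "", "expected": 1, "note": "Bug 12345"},  # known failure, tracked
]
report = run_cases(cases, len)
```

Whoever triages the failure sees the bug number immediately, with no debugging session needed to discover that the failure is already known.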

The next post


Has some notes on organization, structure and design for data-driven tests.

Tuesday, November 6, 2012

MetaAutomation Grows Up: New, Refined Definition

I'm preparing for my talk tonight at SeaSPIN meeting in Bothell, WA, with details on the site here:
 
 
This is the first time I've presented the material in this medium (a one-hour talk) and in preparing, I have a new and more refined definition:
 
Metaautomation is a meme of well-known practices and interrelated software technologies, used to communicate, plan, and execute on scaling automation up in strength and effectiveness, and on integrating software quality more effectively into the SDLC.
First-order metaautomation describes technologies applied at automation runtime.
Second-order metaautomation includes techniques applied to artifacts of one or more automated tests at some time after a test run is complete.

If you're in the neighborhood, be sure to vote, then come on by!
 

Wednesday, October 3, 2012

Metaautomation and the Death of Test, part 2: the Quality Profession



One of the reasons you want to keep testers around is that their motivations are very different from those of the devs. Devs want to check in working code, to earn their magnetic ball toys and the esteem of their peers. Testers want to write highly actionable automation – the topic of this blog – and measure quality so the team can ship the code, but especially to find good bugs, to earn their wooden puzzle toys and the esteem of their peers.

Here’s my post on the Metaautomation meme http://metaautomation.blogspot.com/2012/08/the-metaautomation-meme.html for describing how automation can provide the best value for the SDLC (software development lifecycle).

Automation is just part of the quality picture, but an important one. Many years ago, all STEs at Microsoft were encouraged to become SDETs – i.e. to learn how to automate the product – because Microsoft recognized the importance of quickly and accurately repeatable regression testing of product behavior.

Now, if automating the product – make it do stuff repeatedly! – is all there is, then it’s reasonable to suppose that devs can take a little time out of their normal duties to automate the product. But of course, that takes time – sometimes a surprising amount of time – and they have to maintain the automation as well, or just ignore it when it starts failing, which makes it worse than useless.

The idea that all you have to do is simple automation, with no care towards making failures actionable, is myopic IMO (although, perhaps, attractive to the business analyst). I address this in more detail here http://metaautomation.blogspot.com/2011/09/intro-to-metaautomation.html.

This post addresses the importance of testing to the SDLC http://metaautomation.blogspot.com/2012/01/how-to-ship-software-faster.html.

This post is about managing for strong, actionable automation and looking forward to second-order metaautomation http://metaautomation.blogspot.com/2012/08/managing-metaautomation.html.


Not all of these techniques are completely new. Some are practiced in corners of Microsoft, and (I’m told) Amazon. The metaautomation meme just makes it easier to describe and communicate how far the team can (and in some cases, should) go to make the quality process more powerful.

Metaautomation is the part of the test profession that is expressed in automation.

 

Are there other labels that people use to describe what I call first- and second-order metaautomation? Please comment below and I will respond.

Metaautomation and the Death of Test, part 1: No Actually, you need Test


There’s a meme going around, mostly out of Google it seems, that “Test is Dead.”

The prototypical example of this is gmail. The special SDLC (software development lifecycle) of gmail, for purposes of this meme, goes like this: devs write some feature. Devs do unit testing and some E2E testing, check in, and move on. The code is deployed to live on some (not all) web servers on the farm. End-users notice issues; having nothing better to do, they use the “Report a bug” control on the page to send a bug back to Google. The report Google receives is well-written, with sufficient detail but not too much, so the bug can be prioritized and potentially fixed. Tada! Testing has been outsourced to customers.

… except that the conditions that must be true for such a scenario to work tightly limit the applicability of this technique. See for example this link, which discusses the security implications of this approach: http://metaautomation.blogspot.com/2011/10/no-you-cant-outsource-quality-detour.html. The end-users must know exactly what to expect from whatever product it is, and they’re not going to read a manual or requirements spec, so the functionality must be a reworking of some well-known idea, say, an in-browser email client or an online game of checkers. No automation is available, so regressions might go undetected for a while and be more expensive to fix than otherwise, and fixing a regression might even break the feature whose code changes caused the regression in the first place. Clearly, this technique is much too risky for important or mission-critical data, e.g. financial or medical data.

But, there’s one idea here that does work and is worth elaborating: devs are accountable to do some degree of E2E testing.

Why is E2E testing important for devs? Can’t they just toss code over the wall, after unit tests pass, and let testers deal with it? After all, that’s their job… but testers have better things to do, which is the topic of part 2 http://metaautomation.blogspot.com/2012/10/metaautomation-and-death-of-test-part-2.html 

Imagine a dev implements a feature, sees that unit tests pass, thinks “good enough,” and checks it in. Assume that E2E tests are not written for this feature, because hey, it’s a brand-new feature. The build of the product in the repository succeeds. (Assume the team is not doing branch development.) Whoever tests that build first finds some issues, writes bugs, and puts them on the developer’s plate. The dev eventually sees the bugs, theatrically slaps his/her own forehead, repros the bug and, with minimal research, fixes it. If the bug isn’t tended to for a week, this is even more expensive, because the code associated with the bug might no longer be so familiar to the dev, so it would require more research to fix the issue.

It would be MUCH better if the dev tested out the feature first with some E2E scenarios, before the checkin, or have the tester take the changeset (using Visual Studio’s TFS, this is a “shelf set”) and do some testing of the changes, to find the issues before checkin. Why better? Because a) the fix will be quicker to do b) no need to enter bugs for the record, and c) nobody need be hindered by the broken functionality of the issues, because they’re never checked in. Oh, and d) triage doesn’t have to look at the bugs, because there aren’t any reported bugs.

Another useful way to address this is to check in tests for the feature at the same time that the feature is checked in, which means that whoever wrote the E2E tests (probably a tester) combines that changeset with the product feature change. This can save a lot of churn, and the symmetry of checking in the combined feature and quality tests looks simple and robust. The problem comes if the feature automation is not ready when the feature is, and checkin of the feature would be held back. That might slow the dev down, and for a complex product, there are likely other stakeholders (dev, test, and PMs) waiting on the changes, so the cost of waiting must be compared to the value of doing a unified dev + test checkin.

Therefore, the dev should be expected by team convention to do some amount of E2E testing of a new feature. How much?

For simplicity of argument, assume for the moment that nobody from Test is involved before checkin.

Too little testing on the dev’s part, and the dev won’t find his/her obvious, blocking bugs. (“Blocking” means that functionality is broken and breaks a scenario or two around the new feature, so some testing and other use of the product is blocked.) Too much, and the feature is delayed, along with other features and other work that depends on the feature.

I asked this question – how much testing by the devs? – of James Whittaker, when he gave a talk last month at SASQAG in Bellevue, Washington.

(Inline references: James’ blog is here http://blogs.msdn.com/b/jw_on_tech/. SASQAG is here http://www.sasqag.org/. )

James’ answer was that it depends on the dev’s reputation for quality. Fair enough, but I’d prefer to start out with formal, uniform expectations and relax them for individuals as they earn the team’s trust:

First, have the test team define repeatable E2E test cases for the new feature being implemented. These test cases are to be used through the SDLC and beyond, so might as well write them earlier in the cycle than they normally are. Give the test cases sufficient detail that anybody who knows the product can run them, and precise enough that distinct bugs are always correlated with different test cases.

Then, have the devs execute the test cases when they think the feature is ready. If the feature is non-GUI (e.g. an SDK API) then maybe the E2E test can be implemented easily too, and the test run that way, before checkin and then afterwards for regression. If it’s a GUI feature e.g. in a web browser, probably the feature can’t be automated before implementation is complete.

I recommend a minimum of two happy-path test cases, one edge case if applicable, and one negative case. It’s expected at project outset that a) a tester writes the test cases before the feature is implemented, and b) the dev (and maybe the tester too) runs the test cases before checkin.
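For a non-GUI feature, that minimum set might look like this sketch (the `parse_port` function is a stand-in for whatever the feature actually is):

```python
# Stand-in feature: parse a TCP port number from a string.
def parse_port(text: str) -> int:
    port = int(text)  # raises ValueError for non-numeric input
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port

# Two happy-path cases
assert parse_port("80") == 80
assert parse_port("8080") == 8080

# One edge case: the top of the valid range
assert parse_port("65535") == 65535

# One negative case: invalid input must be rejected, not silently accepted
try:
    parse_port("not-a-port")
    raise AssertionError("expected ValueError for non-numeric input")
except ValueError:
    pass
```

For an SDK-style API like this, the dev can run the set before checkin; for a GUI feature, the same four cases would be manual steps until the implementation is complete enough to automate.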

This will save the whole team a lot of time, but especially the testers… for the good of the product, they should be extremely busy anyway, which is the topic of part 2 of this post. http://metaautomation.blogspot.com/2012/10/metaautomation-and-death-of-test-part-2.html 

Thursday, August 30, 2012

The MetaAutomation Meme


The word “Meme” was coined by British evolutionary biologist Richard Dawkins to describe the spread of ideas and cultural phenomena, including cultural patterns and technologies.

Metaautomation describes a set of techniques and technologies that enable a view of software quality that is both deeper and broader than is possible with traditional software automation alone, and given sufficient investment, this can be taken further to do smart automated test retries and even automated triage and pattern detection that wouldn’t be possible with traditional techniques.

For the more advanced metaautomation concepts, the investment and risk are greater, and the potential reward in terms of team productivity is much greater. So, I’m dividing the meme into two parts:

·         First-order metaautomation: making test failures actionable, and minimizing the chances that a debugging session is necessary to find out what happened
·         Second-order metaautomation: creating some degree of automated triage, automated failure resolution, and automated smart test retry
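As a toy illustration of the second-order idea (the retry policy and artifact format here are invented, not a prescription): a runner that retries a failed test and records an artifact for each attempt, so later analysis can distinguish flaky from broken.

```python
def run_with_retry(test_fn, max_attempts=3):
    """Run test_fn up to max_attempts times; record an artifact per attempt."""
    artifacts = []
    for attempt in range(1, max_attempts + 1):
        try:
            test_fn()
            artifacts.append({"attempt": attempt, "result": "pass"})
            return True, artifacts  # a single pass ends the retries
        except AssertionError as e:
            artifacts.append({"attempt": attempt, "result": "fail", "detail": str(e)})
    return False, artifacts

# Simulated flaky test: fails on the first call, then passes.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    assert calls["n"] >= 2, "transient failure"

passed, artifacts = run_with_retry(flaky)
```

A fail-then-pass artifact trail marks the test as flaky rather than broken, which is exactly the kind of automated triage signal second-order metaautomation is after.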

 

Metaautomation is an innovation analogous to the simple structural arch: before arches, the span and strength of bridges were limited by the tensile strength (resistance to bending) of the spanning material. A famous example of this technology is North Bridge in Concord, Massachusetts.


But with arches, the span and strength is limited by the compressive strength of the material used. This works well with another common building material – stone - so the technology allows much more impressive and long-lasting results, for example, the Alcantara Bridge in Spain.


The techniques of metaautomation did not originate with me, but in defining the term and establishing a meme for the pattern, I hope to make such techniques more understandable and easy to communicate, easier to cost and express benefits for the software process, and therefore more common.

The first order of metaautomation will become very commonly used as the value is more widely understood. The second order of metaautomation is good for large, distributed and long-lived projects, or where data has high impact e.g. health care or aviation systems.