Because the concept of system responsibility is so foundational to how I develop and test software, I want to expand on my earlier description. Recall that I defined a system responsibility as a system’s obligation to respond to each notification of a specified kind of event under specified circumstances by producing a specified set of planned results.
A system responsibility includes three parts:
A stimulus that triggers the system to respond to an event.
A context in which the system is required to respond to the stimulus.
A set of results that the system is obligated to realize in response to that stimulus in that context.
Stimulus. A stimulus is a message, sent by someone or something outside the boundary of the system, that informs the system of an event to which it is obligated to respond. The stimulus has a name, which may identify either the event that it represents or the planned response that the system must carry out. The stimulus may include additional information about the event.
Stimuli are delivered to a system through its interfaces. An interface defines a set of messages to which a system responds, and the mechanisms by which those messages are delivered. For GUI systems, the interface includes a suite of windows, forms, buttons, text fields, and other mechanisms that translate user gestures (mouse clicks, key presses) into messages. Web-based systems receive stimuli through HTTP requests and other interfaces. Smaller scale systems, such as objects inside a software application, expose Application Programming Interfaces (APIs) that define the set of methods to which internal objects and subsystems respond.
Result. A result is an effect that the system realizes in response to a specified stimulus in a specified context. A result may be either a message delivered to someone or something outside the boundary of the system or a change in the system’s internal state.
GUI systems deliver messages through forms, windows, screens, audio devices, and other output devices. Web-based systems deliver messages through HTTP responses and requests. An application’s internal objects and subsystems deliver messages through method calls and method return values.
In addition to delivering messages to external entities, systems also respond to events by recording information internally, and by making changes to that internal information. The information may be stored inside the running application, in a database, in files on the computer’s file system, or other storage mechanisms. The information that a system stores in order to guide its responses to future events makes up the system state.
Context. Sometimes a system’s planned response depends not just on information delivered through the stimulus, but on other information as well. The context for a given responsibility is all of the information, other than that delivered in the stimulus, that influences the results that the system is obligated to realize in response to an event. The context may include information about the state of the system itself, that is, information that the system previously recorded in its internal memory about prior events. The context may also include information that the system can observe across its boundary: information that the system must request from external entities in order to fulfill the responsibility.
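To make the three parts concrete, here is a minimal sketch in Python. The turnstile, the names Stimulus, Responsibility, and respond, and the message strings are all my own illustrative assumptions, not part of the definitions above. The sketch models context only as internal state; it leaves out information the system would have to request across its boundary.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Stimulus:
    name: str                                  # names the event (or the planned response)
    data: dict = field(default_factory=dict)   # additional information about the event


@dataclass
class Responsibility:
    stimulus_name: str                          # which stimulus triggers it
    context: Callable[[dict], bool]             # does the required context hold?
    results: Callable[[dict, Stimulus], list]   # changes state, returns outgoing messages


# Hypothetical results for a coin-operated turnstile.
def unlock(state, stimulus):
    state["locked"] = False    # result: a change in internal state
    return ["unlocked"]        # result: a message delivered outside the system


def refund(state, stimulus):
    return ["coin returned"]


def admit(state, stimulus):
    state["locked"] = True
    return ["admitted"]


def refuse(state, stimulus):
    return ["alarm"]


RESPONSIBILITIES = [
    Responsibility("coin", lambda s: s["locked"], unlock),
    Responsibility("coin", lambda s: not s["locked"], refund),
    Responsibility("push", lambda s: not s["locked"], admit),
    Responsibility("push", lambda s: s["locked"], refuse),
]


def respond(state, stimulus):
    """Find the responsibility matching this stimulus in this context and realize its results."""
    for r in RESPONSIBILITIES:
        if r.stimulus_name == stimulus.name and r.context(state):
            return r.results(state, stimulus)
    return []   # no responsibility was allocated for this stimulus in this context


state = {"locked": True}
print(respond(state, Stimulus("push")))   # locked context: the push is refused
print(respond(state, Stimulus("coin")))   # same system, different stimulus: unlocks
print(respond(state, Stimulus("push")))   # same stimulus, new context: admits
```

The same stimulus ("push") yields different results in different contexts, which is why the context belongs in the specification of a responsibility.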
I first learned about the idea of planned response systems from III, a colleague and friend of mine. I later read about the idea in depth in McMenamin and Palmer’s profound book Essential Systems Analysis.
The idea of planned response systems is fundamental to how I think about programming and testing. I’m posting my thoughts here so that I can refer to these terms and ideas in later blog posts. Until I write those posts, I encourage you to notice what happens when you think about software systems as planned response systems.
A planned response system is a system that responds in planned ways to events in its environment.
For example, a software system is a planned response system—it responds in planned ways to users’ actions.
In an object-oriented software system, each object is a planned response system: it responds in planned ways to messages sent by other objects.
Planned response systems produce two general kinds of results: They send messages to entities outside of the system boundary, and they make changes to the essential memory of the system.
An event is a significant change in the system’s environment. A change is significant to the system if the system is obligated to respond to the change in a planned way.
Events fall into two broad categories: changes initiated by entities in the system’s environment (e.g. users or other systems), and temporal events caused by the passage of time.
For example, an ATM is obligated to respond in a planned way to a user’s request to withdraw cash. The user’s request is an event.
A system responsibility is a system’s obligation to respond to each notification of a specified kind of event under specified circumstances by producing a specified set of planned results.
The specification of a system responsibility consists of three parts: A specification of a kind of event, a specification of a set of circumstances, and a specification of the set of planned results that the system is obligated to produce in response to being notified of an event of that kind under those circumstances.
A system becomes obligated to respond to an event when a system designer allocates that responsibility to the system.
The essence of a planned response system is the set of responsibilities allocated to the system, independent of the choice of technology used to implement the system.
The definition of a system’s essence makes no mention whatever of technology inside the system, because the system’s essential responsibilities would be the same whether it were implemented using software, magical fairies, a horde of trained monkeys, or my brothers Glenn and Gregg wielding pencils and stacks of index cards.
One way to identify the essence of a system is to indulge in The Fantasy of Perfect Technology. Imagine a system implemented using perfect technology. Then ask yourself some questions about the quality attributes of the system.
How fast would it respond? If it were made of perfect technology, of course it would respond instantly, with zero delay. How many users could use it at once? An infinite number of users. How much information could it store? An infinite amount. How often would it break? It would never break. How long would it take to start up? No time at all, because it’s always on and always available. How much energy would it use? It would use no energy; heck, it might even generate energy for free.
The one glaring flaw of perfect technology is that it does not exist. Real-world technology is imperfect. That’s what makes this exercise a fantasy. But it’s a useful fantasy, because it helps us to separate the system’s essential responsibilities from the temporary constraints of current technology.
Note that we apply the Fantasy of Perfect Technology only inside the boundary of the system. Even in our fantasy, the world outside of the system is made of real, imperfect stuff, with which the system will have to interact.
Now apply the fantasy to your own system. What responsibilities would your system have even if you could implement it using perfect technology? That set of responsibilities is your system’s essence.
The essential memory of a system is the set of data that the system must remember in order to fulfill its obligations—that is, in order to respond as planned to future events.
For example, an ATM must remember users’ account balances in order to determine whether to satisfy users’ requests to withdraw money.
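Here is a small sketch of the ATM example in Python. The class name Atm, the account name, and the message strings are hypothetical choices of mine for illustration. The dictionary of balances stands in for the essential memory, and a withdrawal produces both kinds of results: a message to the user and a change to that memory.

```python
class Atm:
    def __init__(self, balances):
        # Essential memory: what the ATM must remember in order to
        # respond as planned to future withdrawal requests.
        self.balances = dict(balances)

    def withdraw(self, account, amount):
        """Respond to a withdrawal request and return the message sent to the user."""
        if amount <= self.balances[account]:
            self.balances[account] -= amount   # result: change to essential memory
            return f"dispense {amount}"        # result: message across the boundary
        return "insufficient funds"            # memory is left unchanged


atm = Atm({"alice": 100})
print(atm.withdraw("alice", 60))   # enough money: the cash is dispensed
print(atm.withdraw("alice", 60))   # only 40 left: the request is declined
```

The second request is declined only because the ATM remembered the first one. Nothing in the sketch says how the balances are stored, which is the point of calling this memory essential.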
A few days ago I was poking around the web for ideas about how to test software, and I saw Scott Ambler’s article about “Full Life Cycle Object-Oriented Testing (FLOOT).” The article includes a list of common testing techniques. As I looked over the list, I noticed that there is a small set of key dimensions that distinguish one testing technique from another. For example, unit testing and system testing differ in the kind of component they test. Stress testing and usability testing differ in the quality attribute that they test for. Unit testing and acceptance testing differ in the nature of the decisions that are made based on the test results.
I love looking for patterns like that, so I spent an hour analyzing Scott’s list to identify the dimensions. Here are thirteen dimensions I found, and a few examples that show how different testing techniques vary along each.
Unit Under Test. What type of component is being tested?
In Class Testing or Unit Testing, the unit under test is a class.
In Method Testing, the unit under test is a method of a class.
In System Testing, the unit under test is the system.
Test Case Scope. What is the scope of the interaction tested by each test case?
In Use-Case Scenario Testing, the scope of the interaction tested by each test case is a user goal.
In Unit Testing, the scope of each test case is a method invocation.
In Integration Testing, the scope is a transaction.
Unit Coverage. What subset of the unit under test is exercised by the test suite?
In Coverage Testing, the subset being exercised by the test suite is code statements.
In Path Testing, the coverage is logic paths.
In Regression Testing, the coverage is code changes.
In Boundary-value Testing, the coverage is limits.
Behavioral Scope. What subset of the unit-under-test’s behavior is being tested?
Installation Testing tests the system’s installation procedure.
Functional Testing tests the system’s business functionality.
Integration Testing tests interactions among subsystems.
Unit Relationships. What are the relationships among the units whose interactions are being tested?
In Inheritance-regression Testing, the relationship between units is inheritance.
In Integration Testing, the relationship is collaboration or peers.
Quality Attribute. What type of quality attribute is being tested?
In Stress Testing or Volume Testing, the quality attribute being tested is throughput or latency or capacity.
In Usability Testing, the quality attribute being tested is usability.
Stakeholder. Whose interests are the focus of the testing?
Acceptance Testing focuses on the interests of users.
Operations Testing focuses on the interests of operators.
Support Testing focuses on the interests of support staff.
Liveness. How closely does the test environment mimic the operational environment? Or perhaps this dimension is better characterized as Safety: To what extent are the testers using the system to do the real work for which the system was intended?
In a Pilot, the test environment is the actual operational environment, perhaps limited in scope (e.g. a small subset of users, or for a limited time).
In Beta Testing, the environment is a fully operational environment, but perhaps used only for non-critical functions.
In Acceptance Testing, the environment is a non-operational environment similar to the operational environment.
Unit Testing is done in the development environment.
Visibility into Unit Under Test. To what extent does the tester exploit knowledge about the internals of the unit under test?
In Black-box Testing, the tester exploits no knowledge of the internals of the unit under test.
In White-box Testing, the tester exploits full knowledge of internals.
In Grey-box Testing, the tester exploits some knowledge of internals.
Tester. What is the relationship of the tester to the software under test?
For Acceptance Testing or User Testing, the tester is a user of the software.
For Unit Testing or Developer Testing, the tester is a developer of the software.
Processor. What type of “processor” will “execute” the “software” during the tests?
In most kinds of testing, a computer executes the software.
In Code Inspections and Design Reviews, developers “execute” the software.
In Prototype Walkthroughs, users “execute” the “software.”
Pre-Test Confidence. How confident are we about the software before we begin the testing?
Before Alpha Testing, our confidence in the software is lower (compared with Beta Testing).
Before Beta Testing, our confidence in the software is higher (compared with Alpha Testing).
Decision Scope. What kinds of decisions will we make based on the outcome of the test?
For Acceptance Testing, the key decision is whether to release the product.
For Integration Testing, the decision may be whether to begin system testing.
For Unit Testing, the decision is whether the current coding task is complete.
This list is based on only an hour’s work, and on my analysis of only a single list of testing techniques (Scott’s), so I don’t claim that it is anywhere near complete or correct. It might be useful, though, for people who want to expand their repertoire of testing techniques, or to locate a technique that fits a given purpose or context.
I wonder what would happen if we created a thirteen-dimensional matrix. What parts of the matrix would be crowded with testing techniques? What parts would be empty?
Thirteen dimensions is more than I can handle. So what would happen if we took two or three dimensions at a time and explored all of the values along those dimensions? Would that be interesting? Would it be useful? Would it help us to identify testing techniques that fit our specific situations? Might we notice holes in the matrix for which we want to invent useful techniques?
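As a rough sketch of what exploring two dimensions at a time could look like, here is a bit of Python. It encodes only the Tester and Liveness values I listed above for a handful of techniques. The shortened value labels and the names TECHNIQUES, values, matrix, and holes are assumptions made for illustration.

```python
from itertools import product

# Only the values stated in the lists above. A technique with no stated
# value for a dimension simply omits that dimension.
TECHNIQUES = {
    "Unit Testing": {"Tester": "developer", "Liveness": "development environment"},
    "Acceptance Testing": {"Tester": "user", "Liveness": "similar to operational"},
    "Pilot": {"Liveness": "operational"},
    "Beta Testing": {"Liveness": "operational, non-critical use"},
}


def values(dimension):
    """All known values along one dimension."""
    return sorted({t[dimension] for t in TECHNIQUES.values() if dimension in t})


def matrix(dim_a, dim_b):
    """Map each (value_a, value_b) cell to the techniques that fall in it."""
    cells = {pair: [] for pair in product(values(dim_a), values(dim_b))}
    for name, dims in TECHNIQUES.items():
        if dim_a in dims and dim_b in dims:
            cells[(dims[dim_a], dims[dim_b])].append(name)
    return cells


def holes(dim_a, dim_b):
    """Cells of the matrix that no known technique occupies."""
    return [pair for pair, names in matrix(dim_a, dim_b).items() if not names]


for pair in holes("Tester", "Liveness"):
    print(pair)
```

An empty cell here may only mean that I never wrote down a value for some technique (I gave no Tester for a Pilot, for example), so each hole is a prompt for a question rather than proof that no technique exists there.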