From 544a3f56f3f0b374899841bc2768ba0dd2dd7f9d Mon Sep 17 00:00:00 2001
From: Thomas Walker Lynch

This guide provides a general overview of testing concepts. It is not a
reference manual for the Mosaic Testbench itself. At the time of writing, no
such reference document exists, so developers and testers are advised to
consult the source code directly for implementation details. A small example
can be found in Test_MockClass.

A typical testing setup comprises three main components: the Testbench, the
test routines, and a collection of units under test (UUTs). Here, a UUT is any
individual software or hardware component intended for testing. Because this
guide focuses on software, we will generally speak of a routine under test
(RUT) rather than a UUT.
A test routine applies inputs to a RUT, examines the resulting outputs, and
determines whether the test passes or fails based on those values. A given
test routine might repeat this procedure for any number of test cases. The
final result from the test routine is then relayed to the Testbench. Testers
and developers write the test routines and place them into the Testbench.

Introduction
This guide serves developers and testers of projects that make use of Mosaic.
Mosaic is a Testbench. It serves as a structured environment for organizing
and executing test routines, and it provides a library of utility routines for
assisting the test writer. When run, the Testbench sequences through the set
of test routines, one by one, providing each test routine with an interface to
control and examine standard input and output. Each test routine, depending on
its design, might in turn sequence through any number of test cases.
The Mosaic tool assists testers in finding failures, but it does not directly help with identifying the underlying fault that led to the failure. Mosaic is a tool for testers. However, these two tasks, finding failures and locating faults, are not entirely separate. Knowing where a failure occurs can provide the developer with a good starting point for locating the fault and help narrow down possible causes. Additionally, once a developer claims to have fixed a fault, that claim can be verified through further testing.
The Mosaic Testbench is useful for any type of testing that can be formulated as test routines exercising RUTs. This certainly includes verification, regression, development, and exploratory testing. It also includes the portions of performance, compliance, security, compatibility, and acceptance testing that fit the model of test routines and RUTs. Only recently has it become imaginable that the Mosaic Testbench could be used for documentation testing; however, it is now possible to fit an AI API into a test routine and turn a document into a RUT. Usability testing often depends on other types of tests, and to that extent the Mosaic Testbench can play a role. However, usability testing is also in part feedback from users, so short of putting users in the Matrix, this portion of usability testing remains outside the domain of the Mosaic Testbench, though it could be used to reduce surveys to pass/fail results.
Each test objective leads to writing tests of a different nature.

In spot checking, the function under test is checked against one or two input
vectors. When using a black box approach, these are chosen at random.
Moving from zero to one is an infinite relative change; i.e., running a
program for the first time requires that many moving parts work together.
Such a first run is itself a test, and this kind of test is called a smoke
test.
A test routine will potentially run multiple test cases against a given RUT.
Structured testing is a form of white box testing, where the tester examines the code being tested and applies various techniques to it to increase the efficiency of the testing.
All types of black-box testing have a serious problem in that the search
space explodes with the size of the input.

A typical response from people when they see this is that they knew it went up
fast, but did not know it went up this fast. It is also important to note that
there is a one-to-one relationship between the percentage of time spent toward
achieving exhaustive coverage and the percentage of coverage obtained: half
the time gives 50 percent coverage. In the last row of the table, to have
reasonable test times, coverage would be on the order of 10^-18 percent. At
that level of coverage there is really no reason to test. Hence, this table is
not limited to speaking about exhaustive testing; rather it speaks to black
box testing in general.
In white box testing, we take the opposite approach to black box testing. The
test writer does look at the code implementation and must understand how to
read the code. Take our 64-bit adder example of the prior section. Here in
this section we will apply a white box technique known as Informed Spot
Checking.

This is the prior example as a black box:
    int64 sum(int64 a, int64 b){
      if( a == 5717710 && b == 27 ) return 5;
      else return a + b;
    }
When following the approach of Informed Spot Checking, the tester examines
the code and sees there is a special case for a = 5717710 and b = 27, which
becomes the first test case. There's also a special case for when the sum
exceeds the 64-bit integer range, both in the positive and negative
directions; these become two more test cases. Finally, the tester includes a
few additional cases that are not edge cases.
Thus, by using white box testing instead of black box testing, the tester
finds all the failures with just 4 or so test cases instead of an
astronomically large number of cases. Quite a savings, eh?
There are notorious edge cases in software, and these can often be seen by
looking at the RUT. Zeros, and inputs that lead to index values just off the
end of arrays, are common ones. Checking a middle value and edge cases is
often an effective approach for finding failures.
There is an underlying mechanism at play here. Note that it takes two points
to determine a line. In Fourier analysis, it takes two samples per period of
the highest frequency component to determine an entire waveform. Code also
has patterns, patterns that are disjoint at edge cases. Hence if a piece of
code runs without failures for both edge cases and spot check values in
between, it will often run without failures over an entire domain of values.
This effect explains why ad hoc testing has led to so much relatively
fail-free code.
Informed Spot Checking is especially valuable in early development, as it
provides useful insights with minimal investment. In the early development
stage, making a larger investment in test code is unwise because the code is
still in flux; test work is likely to get ripped up and replaced.
The idea of test work being ripped up and replaced highlights a drawback of
white box testing: analysis of code can become stale when implementations
change. However, due to the explosion in the size of the input space with
even a modest number of inputs, white box testing is necessary if there is to
be much commitment to producing reliable software or hardware.
Reconstructing a RUT to make it more testable can be a powerful method for
turning testing problems that are exponentially hard due to state variables,
or very difficult to debug due to random variables, into problems that are
only linearly hard. According to this method, the tester is encouraged to
examine the RUT to make the testing problem easier.

By reconstructing the RUT we mean that we refactor the code to bring any
random variables or state variables to the interface, where they are then
treated as inputs and outputs.

If placing state variables on the interface is adopted as a discipline by the
developers, reconstruction will not be needed in the test phase; or if it is
needed, white box testers will see this, and it will be a bug that has been
caught. Otherwise, reconstruction leads to two versions of a routine: one
that has been reconstructed and one that has not. The leverage gained on the
testing problem by reconstructing a routine typically more than outweighs the
extra verification problem of comparing the before and after routines.
As an example, consider our adder function with a random fault. As we know
from prior analysis, changing the fault to depend on a random number makes
testing harder, but perhaps more importantly, it makes the failure nearly
impossible to debug, as the tester cannot hand it to the developer and say,
'it fails in this case'.
    int64 sum(int64 a, int64 b){
      if( a == (5717710 * rand()) && b == (27 * rand()) ) return 5;
      else return a + b;
    }
The tester refactors this function as:

    int64 sum( int64 a, int64 b, int64 a0 = 5717710*rand(), int64 b0 = 27*rand() ){
      if( a == a0 && b == b0 ) return 5;
      else return a + b;
    }
Here a0 and b0 are added to the interface as optional arguments. During
testing their values are supplied; during production the defaults are used.
Thus, we have broken the one test problem into two: the question of whether
sum works, and the question of whether the random number generation works.
Failures in sum found during testing are now reproducible. If the tester
employs Informed Spot Checking, the failure will be found with few tests, and
the point in the input space where the failure occurs can be reported to
development and used for debugging.
Here is a function that keeps a state variable between calls.
    int state = 0;
    int call_count = 0;

    void state_machine(int input) {
      int choice = (input >> call_count) & 1;
      switch (state) {
        case 0:
          printf("State 0: Initializing...\n");
          state = choice ? 0 : 1;
          break;
        case 1:
          printf("State 1: Processing Path A...\n");
          state = choice ? 0 : 2;
          break;
        case 2:
          printf("State 2: Processing Path B...\n");
          state = choice ? 0 : 3;
          break;
      }
      call_count++;
    }