Existing spreadsheet testing approaches assume that users can define test cases, or correct input values. But, this can be challenging for users.
Thomas’s idea is to split a given spreadsheet in parts to make it easier for users to test parts. Therefore, they have to be easy to understand and loosely coupled. They start with an automatic fragmentation and the user can then give feedback and change fragments if he wants too.
A few key ideas:
- Collapse copy-equivalent structures
- For all copy-equivalent fragments only 1 test case needs to be created, and it can then be reused for all copies.
- But, copy-equivalent regions do not need to be semantically equivalent (a SUM of the above 3 rows can mean different things)
- Complexity of fragments needs to be low, so they are easy to understand.
The basic algorithm starts with single-cell fragments and then groups them. But then the question is, what is the optimal fragmentation, when do we need to stop grouping. A number of heuristics are currently used, including the number of output cells and the spanned area. If we cells are close the fragment will be easier to understand. The formula smells are also taken into account.
Fragments are visualized one by one, where input, interim and output cells are colored differently:
Minimizing the sum of all the above complexity metrics is an optimization problem, and potentially some other choices could be made. Ultimately Thomas wants to do a user study to better predict cognitive load of fragments.