The idea of this paper, presented by Martin Monperrus, is to abstract over source code so that it can be analyzed at a coarser granularity.
With this abstraction, we can determine how diversely a given API is used: is it always used in the same way?
The authors mined 9,022,262 type-usages, which refer to 382,774 Java classes. To their surprise, they observed a lot of diversity in API usage: 748 classes were used in more than 100 different ways. They speculate on the reasons for this, such as a correlation with reusability. According to the authors, “[diversity] can reflect the fact that client code was able to use the class in ways that were unanticipated by the class designer.”
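To make the counting concrete, here is a minimal sketch of how usage diversity could be measured. It assumes that a type-usage is modeled as a class name plus the unordered set of methods called on one variable of that class; that modeling, the sample observations, and the function name `usage_diversity` are assumptions of this sketch, not necessarily the authors' exact definition or tooling.

```python
from collections import defaultdict

# Hypothetical observations: (class name, methods called on one variable of that class).
observations = [
    ("java.lang.String", ["length", "charAt"]),
    ("java.lang.String", ["charAt", "length"]),  # same calls, different order
    ("java.lang.String", ["isEmpty"]),
    ("java.lang.StringBuilder", ["append", "toString"]),
    ("java.lang.StringBuilder", ["append", "toString"]),
]


def usage_diversity(observations):
    """Map each class to the number of distinct method-call sets observed for it."""
    distinct = defaultdict(set)
    for class_name, calls in observations:
        # frozenset ignores call order and repetition (an assumption of this sketch).
        distinct[class_name].add(frozenset(calls))
    return {class_name: len(usages) for class_name, usages in distinct.items()}


diversity = usage_diversity(observations)
print(diversity)
```

In this toy data, String is used in two different ways and StringBuilder in one. Under the same modeling, a class with a diversity above 100 would be one of the 748 classes the authors highlight.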
The paper ends with a lot of open questions on the impact of this finding, for instance:
- Should we support or encourage the diversity in object-oriented software?
- How to ensure that all possible type-usages are correct? Should there be one test per observed API usage? (This would mean 2,460 test cases for Java’s String.)
Interesting topic; I expect (and hope) that future SCAMs will feature papers that address these questions. Pre-print is here.