Over the past few years, many techniques have been proposed to extract topics from documents, among them Latent Dirichlet Allocation (LDA).
The authors propose to use LDA for traceability link recovery, feature location, and software artifact labeling. However, the standard version of LDA does not work well on source code. Fortunately, source code is structured differently from (and better than) natural language, and this structure can be exploited.
Therefore, the authors present LDA-GA: a tool that uses a Genetic Algorithm (GA) to determine a near-optimal configuration for LDA. Calibrating LDA on a specific dataset can improve its performance significantly.
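To make the idea of a GA searching for an LDA configuration more concrete, here is a minimal sketch in Python. It is not the authors' implementation. The search space (`BOUNDS`), the operator choices (uniform crossover, tournament selection, elitism), and especially `toy_fitness` are assumptions made for illustration. In a real setting the fitness function would train LDA on the dataset with the given configuration and return some quality score for the resulting topics.

```python
import random

# Hypothetical search space: (low, high) bounds for each LDA hyperparameter.
BOUNDS = {"num_topics": (2, 100), "alpha": (0.01, 1.0), "beta": (0.01, 1.0)}


def random_config(rng):
    """Draw one random LDA configuration within BOUNDS."""
    return {
        "num_topics": rng.randint(*BOUNDS["num_topics"]),
        "alpha": rng.uniform(*BOUNDS["alpha"]),
        "beta": rng.uniform(*BOUNDS["beta"]),
    }


def crossover(a, b, rng):
    """Uniform crossover: each gene is taken from one parent at random."""
    return {key: (a if rng.random() < 0.5 else b)[key] for key in BOUNDS}


def mutate(config, rng, rate=0.2):
    """Resample each gene with probability `rate`, staying within BOUNDS."""
    fresh = random_config(rng)
    return {key: fresh[key] if rng.random() < rate else config[key] for key in BOUNDS}


def tournament(scored, rng, size=3):
    """Return the fittest of `size` randomly chosen (score, config) pairs."""
    return max(rng.sample(scored, size), key=lambda pair: pair[0])[1]


def evolve(fitness, generations=30, population_size=20, seed=0):
    """Search for a configuration that maximizes `fitness`."""
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(population_size)]
    best_score, best = float("-inf"), None
    for _ in range(generations):
        scored = [(fitness(config), config) for config in population]
        top_score, top = max(scored, key=lambda pair: pair[0])
        if top_score > best_score:
            best_score, best = top_score, top
        # Elitism: the best configuration so far survives unchanged.
        population = [best] + [
            mutate(crossover(tournament(scored, rng), tournament(scored, rng), rng), rng)
            for _ in range(population_size - 1)
        ]
    return best, best_score


def toy_fitness(config):
    """Stand-in fitness (assumption): peaks at 25 topics, alpha = beta = 0.1."""
    return -(
        abs(config["num_topics"] - 25) / 100
        + abs(config["alpha"] - 0.1)
        + abs(config["beta"] - 0.1)
    )


best, score = evolve(toy_fitness)
print(best, score)
```

Because the fitness function is passed in as an argument, the same loop can be reused with any measure of topic quality. Each evaluation would then involve a full LDA run, which is why population size and generation count are kept small here.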