Abstract:
A code smell is a poor design or implementation choice in source code that can harm a software system's maintainability and clarity, increase its complexity, and make the code more change- and fault-prone. Researchers have developed a number of code smell detectors that use various sources of data to help developers discover design faults. Despite the high accuracy of these detectors, earlier research has identified three major drawbacks that could prevent them from being used in practice: (i) developers' subjective perceptions of the code smells that such tools report, (ii) little agreement across different detectors, and (iii) difficulty in determining appropriate detection thresholds. Machine learning techniques are becoming increasingly popular as a means of overcoming these limitations.
This study therefore performed a Systematic Literature Review to explore the code smells detected, the machine learning techniques deployed, and the datasets used. The review was performed on three online databases. It found that Long Method, Feature Envy, God Class and Data Class are the most widely studied code smells, and that Random Forest, Decision Tree, Naïve Bayes and SVM are the most widely used machine learning techniques. In the studies published from 2017 to 2020, the most commonly used datasets are the Qualitas Corpus, Xerces, and other project datasets that are not explicitly named. This research also conducted an experiment (a replication study) using four machine learning algorithms (J48, Random Forest, JRIP and Naïve Bayes). These algorithms were applied to two code smells, Long Method and Feature Envy, which were selected on the basis of the Systematic Literature Review. A total of 395 code smell samples were used. The four algorithms were chosen for their strong performance on multi-class datasets, as determined by the mapping study. The results show that all four algorithms achieved accuracy above 90%. On this dataset, Random Forest performed best on most performance metrics, while Naïve Bayes performed worst. However, the low prevalence of code smell instances in the dataset and the nature of the projects led to differences in the results that future studies will need to address. The research concludes that applying machine learning to the detection of these code smells can provide high accuracy.