TY - GEN
T1 - Improving the Accuracy of Vulnerability Report Classification Using Term Frequency-Inverse Gravity Moment
AU - Kudjo, Patrick Kwaku
AU - Chen, Jinfu
AU - Zhou, Minmin
AU - Mensah, Solomon
AU - Huang, Rubing
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
AB - Software vulnerability analysis is one of the critical issues in the software industry, and vulnerability classification plays a major role in this analysis. A typical vulnerability classification model usually involves a stage of term selection, in which the relevant terms are identified via feature selection. It also involves a stage of term weighting, in which document weights for the selected terms are computed, and a stage for classifier learning. Generally, the term frequency-inverse document frequency (TF-IDF) is the most widely used term-weighting method. However, empirical evidence shows that the TF-IDF is plagued with issues pertaining to its effectiveness. This paper introduces a new approach for vulnerability classification, which is based on term frequency and inverse gravity moment (TF-IGM). The proposed method is validated by empirical experiments using three machine learning algorithms on ten publicly available vulnerability datasets. The result shows that TF-IGM outperforms the benchmark method across the applications studied.
KW - Software vulnerability
KW - Classification
KW - Term weighting
KW - Text Mining
UR - http://www.scopus.com/inward/record.url?scp=85073790433&partnerID=8YFLogxK
U2 - 10.1109/QRS.2019.00041
DO - 10.1109/QRS.2019.00041
M3 - Conference contribution
AN - SCOPUS:85073790433
T3 - Proceedings - 19th IEEE International Conference on Software Quality, Reliability and Security, QRS 2019
SP - 248
EP - 259
BT - Proceedings - 19th IEEE International Conference on Software Quality, Reliability and Security, QRS 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th IEEE International Conference on Software Quality, Reliability and Security, QRS 2019
Y2 - 22 July 2019 through 26 July 2019
ER -