Data Clustering in Applied Informatics and Software Engineering: A Review of International Experience and Publications



Table 4. Error rates of the 25 classifiers for data sets JM1-8850 and KC2-520. An asterisk indicates that a cross-validation approach wasn't available for that technique.

Classifier | Symbol | JM1-8850 FPR (%) | JM1-8850 FNR (%) | KC2-520 FPR (%) | KC2-520 FNR (%)
---------- | ------ | ---------------- | ---------------- | --------------- | ---------------
Case-based reasoning [1] | CBR | 30.70 | 30.88 | 21.50 | 20.75
*Treedisc decision tree algorithm | TD | 30.78 | 29.16 | 18.60 | 16.04
*Logistic regression | LR | 34.23 | 33.97 | 20.77 | 21.70
*Lines of code | LOC | 34.85 | 34.08 | 20.53 | 19.81
*Genetic programming | GP | 34.71 | 32.66 | 18.36 | 16.98
*Artificial neural networks | ANN | 38.06 | 30.35 | 21.26 | 21.70
LogitBoost classifier | LBOOST | 34.72 | 32.72 | 22.22 | 20.75
Rule-based modeling [2] | RBM | 33.71 | 33.08 | 17.39 | 16.04
Bagging classifier | BAG | 30.59 | 30.76 | 21.50 | 20.75
*Rough sets-based classifier | RSET | 31.62 | 30.94 | 16.18 | 14.15
MetaCost classifier | MCOST | 33.67 | 33.61 | 23.43 | 21.70
AdaBoost classifier | ABOOST | 33.41 | 33.79 | 28.26 | 29.25
Decision table | DTABLE | 34.29 | 34.32 | 18.84 | 18.87
Alternating decision tree | ADT | 33.83 | 33.61 | 19.81 | 19.81
Sequential minimal optimization | SMO | 34.09 | 33.97 | 20.77 | 20.75
Instance-based (1 nearest neighbor) | IB1 | 34.73 | 34.74 | 23.67 | 24.53
Instance-based (k nearest neighbor) | IBK | 32.70 | 32.48 | 20.53 | 19.81
Partial decision trees | PART | 33.16 | 33.14 | 20.77 | 19.81
OneR algorithm (based on one most informative attribute) | ONER | 34.50 | 34.38 | 20.05 | 19.81
Repeated incremental pruning to produce error reduction | JRIP | 33.18 | 33.08 | 19.81 | 19.81
Ripple down rule algorithm | RDR | 33.94 | 34.02 | 18.84 | 19.81
C4.5 decision tree algorithm | J48 | 32.56 | 32.42 | 19.57 | 19.81
Naive Bayes | NBAYES | 34.12 | 33.97 | 21.26 | 21.70
Hyperpipes algorithm | HPIPES | 37.97 | 38.29 | 23.91 | 23.58
Locally weighted learning | LWLS | 33.59 | 33.61 | 20.05 | 19.81

until we got the desired balance between the FPR and FNR. Software-engineering practitioners often use the LOC metric as a rule of thumb to gauge a software product's quality. The underlying assumption is that a larger program module is likely to have more software faults than a smaller one.
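As an aside, the LOC rule of thumb and the FPR/FNR balancing mentioned above can be illustrated with a one-variable classifier. The sketch below is our own minimal example on synthetic data, not the procedure used in the study: it sweeps a hypothetical LOC threshold and keeps the value where the two error rates are closest.

```python
# Minimal sketch: a LOC-threshold classifier tuned so that the false-positive
# and false-negative rates are roughly balanced. The data are placeholders,
# not the JM1 or KC2 measurements.
import numpy as np

def error_rates(loc, labels, threshold):
    """Predict 'fault-prone' when LOC exceeds the threshold and
    return (false-positive rate, false-negative rate)."""
    pred = loc > threshold
    fp = np.sum(pred & (labels == 0))
    fn = np.sum(~pred & (labels == 1))
    fpr = fp / max(np.sum(labels == 0), 1)
    fnr = fn / max(np.sum(labels == 1), 1)
    return fpr, fnr

def balanced_threshold(loc, labels):
    """Sweep candidate thresholds and keep the one where FPR and FNR
    are closest, mirroring the balancing described in the text."""
    candidates = np.unique(loc)
    return min(candidates,
               key=lambda t: abs(np.subtract(*error_rates(loc, labels, t))))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    loc = rng.integers(10, 500, size=200)                        # hypothetical module sizes
    labels = (loc + rng.normal(0, 120, 200) > 300).astype(int)   # synthetic fault labels
    t = balanced_threshold(loc, labels)
    print("threshold:", t, "rates:", error_rates(loc, labels, t))
```

The same idea carries over to any single-parameter model: adjust the parameter until the two error rates converge, then report both.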
Table 4 shows the FPR and FNR of the 25 classifiers for the two case studies. For 19 of the 25 classifiers, the error rates are based on a tenfold cross-validation approach. For the six classification techniques marked with an asterisk in the table, a cross-validation feature wasn't available owing to limitations of the respective tools used. Table 5 presents descriptive statistics of the error rates for the 25 classification models. The error rates of the selected models are noticeably higher for JM1-8850 than for KC2-520. In addition, the FPR and FNR for a given classifier are similar, reflecting the effect of our model-selection strategy.
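For readers who want to reproduce this kind of table, the sketch below shows one way to obtain tenfold cross-validated FPR and FNR estimates for a single classifier. It uses scikit-learn and synthetic data purely for illustration; the article's experiments used different tools.

```python
# Sketch: tenfold cross-validated false-positive / false-negative rates
# for one classifier, pooled over the folds. Synthetic data only.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.tree import DecisionTreeClassifier  # a CART tree as an example learner

def cv_error_rates(X, y, clf, folds=10, seed=0):
    skf = StratifiedKFold(n_splits=folds, shuffle=True, random_state=seed)
    fp = fn = neg = pos = 0
    for train, test in skf.split(X, y):
        model = clf.fit(X[train], y[train])
        pred = model.predict(X[test])
        fp += np.sum((pred == 1) & (y[test] == 0))   # false positives this fold
        fn += np.sum((pred == 0) & (y[test] == 1))   # false negatives this fold
        neg += np.sum(y[test] == 0)
        pos += np.sum(y[test] == 1)
    return fp / neg, fn / pos    # (FPR, FNR) pooled over the ten folds

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(500, 8))                            # hypothetical software metrics
    y = (X[:, 0] + rng.normal(0, 1, 500) > 0).astype(int)    # synthetic fault labels
    print(cv_error_rates(X, y, DecisionTreeClassifier(max_depth=3)))
```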


Noise detection results
We compare the modules tagged as noise by the ensemble noise-filter approach with those mislabeled by the software-engineering expert in the clustering-based method. The results in Figure 3 show an interesting match between the two sets of modules.
Table 5. Descriptive statistics of error rates for the 25 classification models.

Statistic | JM1-8850 FPR (%) | JM1-8850 FNR (%) | KC2-520 FPR (%) | KC2-520 FNR (%)
--------- | ---------------- | ---------------- | --------------- | ---------------
Average | 33.75 | 33.12 | 20.71 | 20.30
Standard deviation | 1.80 | 1.80 | 2.41 | 2.93
Median | 33.83 | 33.61 | 20.53 | 19.81
Minimum | 30.59 | 29.16 | 16.18 | 14.15
Maximum | 38.06 | 38.29 | 28.26 | 29.25



Figure 3. Noise recall results for (a) JM1-8850 and (b) KC2-520.

The x-axis indicates the consensus level among the 25 classifiers used for noise filtering. For example, 13 means that a module is viewed as noise if 13 or more classifiers predict its label wrong. The y-axis shows the recall percentage of the modules considered noise by the ensemble; that is, how many of them are covered by the set of modules the expert mislabeled.
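To make the consensus-and-recall computation concrete, here is a minimal sketch under our own assumptions about the data layout (a boolean modules-by-classifiers misclassification matrix and a set of expert-mislabeled module indices); it is not the authors' implementation.

```python
# Sketch: consensus-based noise filtering and the recall measure of Figure 3.
# Assumed layout: wrong[i, j] is True if classifier j mispredicts module i's label;
# expert_mislabeled is the set of module indices the expert relabeled.
import numpy as np

def noise_recall(wrong, expert_mislabeled, consensus):
    """Tag a module as noise when at least `consensus` classifiers mispredict it,
    then report what fraction of those modules the expert also mislabeled."""
    votes = wrong.sum(axis=1)                    # mispredictions per module
    noisy = np.flatnonzero(votes >= consensus)   # the ensemble's noise candidates
    if noisy.size == 0:
        return 0.0
    covered = sum(1 for i in noisy if i in expert_mislabeled)
    return 100.0 * covered / noisy.size          # recall in percent

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    wrong = rng.random((100, 25)) < 0.3                        # synthetic misclassification matrix
    expert_mislabeled = set(rng.choice(100, size=20, replace=False))
    for consensus in (13, 17, 21, 25):                         # consensus levels as on the x-axis
        print(consensus, round(noise_recall(wrong, expert_mislabeled, consensus), 1))
```

Raising the consensus level shrinks the set of noise candidates, so the recall curve in Figure 3 traces how agreement among the classifiers relates to agreement with the expert.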


In JM1-8850, the noise recall performance of expert-based classification with clusters obtained by Neural-Gas was generally better than with clusters based on k-means. In KC2-520, however, the opposite was true. The absolute noise recall performance of the expert-based classification was generally better for the KC2-520 data set than for the JM1-8850 data set. This indicates that data characteristics, such as the extent of potential noise, among other factors, influence a classifier's performance.
It's interesting that a majority of the modules detected as noise were among the modules mislabeled by the clustering- and expert-based labeling method. Even though we don't yet know which one (the noise-filtering method or the clustering method) was more accurate for this case study, the matching results warrant future research on noise filtering with unsupervised clustering techniques.

This study reflects our initial research into clustering-based analysis for software quality-estimation problems. We plan to continue discussions with software engineers to better evaluate the benefits of clustering-based analysis. For this purpose, we must further interpret the quality-estimation and noise detection results.

It's possible to build a more interactive system that lets software engineers explore software metrics data, identify mislabeled software modules, and pinpoint deficient or inappropriate software metrics. Data analysts and software-engineering experts can then collaborate more closely to construct and collect more informative software metrics.
Data analysts can apply a clustering- and expert-based classification scheme to classification problems in other domains, such as medical research and computer-network-intrusion detection. In the future, they could consider additional clustering techniques and compare them with the techniques in this study. The impact of the number of clusters on classification accuracy also deserves more investigation.
Acknowledgments
We thank the anonymous reviewers for their constructive critique and suggestions, Vedang Joshi and Pierre Rebours for their assistance with experiments, and Kehan Gao for her patient reviews of the manuscript.

References


1. T.M. Khoshgoftaar and N. Seliya, "Analogy-Based Practical Classification Rules for Software Quality Estimation," Empirical Software Eng. J., vol. 8, no. 4, Dec. 2003, pp. 325–350.
2. T.M. Khoshgoftaar, L.A. Bullard, and K. Gao, "Detecting Outliers Using Rule-Based Modeling for Improving CBR-Based Software Quality Classification Models," Case-Based Reasoning Research and Development, LNAI 1689, Springer-Verlag, 2003, pp. 216–230.
3. C.E. Brodley and M.A. Friedl, "Identifying Mislabeled Training Data," J. Artificial Intelligence Research, vol. 11, Jul.–Dec. 1999, pp. 131–167.
4. C.M. Teng, "Correcting Noisy Data," Proc. 16th Int'l Conf. Machine Learning (ICML 99), Morgan Kaufmann, 1999, pp. 239–248.
5. S. Zhong and J. Ghosh, "A Unified Framework for Model-Based Clustering," J. Machine Learning Research, vol. 4, Dec. 2003, pp. 1001–1037.
6. T.M. Martinetz, S.G. Berkovich, and K.J. Schulten, "Neural-Gas Network for Vector Quantization and Its Application to Time-Series Prediction," IEEE Trans. Neural Networks, vol. 4, no. 4, July 1993, pp. 558–569.
7. W. Pedrycz et al., "Self-Organizing Maps as a Tool for Software Analysis," Proc. IEEE Canadian Conf. Electrical and Computer Eng. (CCECE 2001), IEEE Press, 2001, pp. 93–97.
8. L.C. Briand, W.L. Melo, and J. Wust, "Assessing the Applicability of Fault-Proneness Models across Object-Oriented Software Projects," IEEE Trans. Software Eng., vol. 28, no. 7, July 2002, pp. 706–720.
9. V. Joshi, Noise Elimination with Ensemble-Classifier Filtering: A Case Study in Software Quality Eng., master's thesis, Dept. of Computer Science and Eng., Florida Atlantic Univ., 2003.
10. I.H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, 1999.

To assess software quality, software-engineering practitioners usually build quality-classification or fault-prediction models using software metrics and fault data from a previous release of the system or from a similar software project.
Engineers then use these models to predict the likelihood of faults in the program modules under development.
However, building accurate quality-estimation models is a difficult task, because noisy data usually degrade the performance of the trained models. There are two general types of noise in software metrics and quality data. One concerns mislabeled software modules, caused by software engineers failing to detect, not reporting, or simply ignoring existing software faults. The other stems from deficiencies in some of the collected software metrics, which can cause two software modules that are identical with respect to the given metrics to have different fault labels. Removing such noisy instances can significantly improve the performance of calibrated software quality-estimation models. It is therefore desirable to pinpoint the problematic software modules accurately before calibrating any quality-estimation models.
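The second type of noise described above, identical metric vectors carrying different fault labels, can be detected mechanically. The following sketch is a minimal illustration under assumed data formats (a list of metric tuples and a parallel list of 0/1 fault labels), not a method taken from the article:

```python
# Sketch: flag inconsistent instances, i.e. modules that are identical with
# respect to the given metrics but carry different fault labels.
from collections import defaultdict

def inconsistent_modules(metrics, labels):
    """metrics: list of tuples (one metric vector per module);
    labels: list of 0/1 fault labels. Returns indices of modules whose
    metric vector also occurs with a different label."""
    by_vector = defaultdict(set)
    for i, vec in enumerate(metrics):
        by_vector[vec].add(labels[i])
    return [i for i, vec in enumerate(metrics) if len(by_vector[vec]) > 1]

if __name__ == "__main__":
    metrics = [(120, 4, 7), (120, 4, 7), (45, 2, 1), (300, 9, 12)]  # toy metric vectors
    labels = [0, 1, 0, 1]          # the first two modules conflict
    print(inconsistent_modules(metrics, labels))   # -> [0, 1]
```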
The other major problem is that in real-world software projects, software fault measurements (for example, fault labels) may not be available for training a quality-estimation model. This happens when an organization deals with a type of software project it has never handled before. It may also not have recorded or collected software fault data in a previous system release. So how can a quality-assurance team predict the quality of a software project without such collected measurement data? The team cannot use a supervised learning method without software quality indicators such as risk class or number of faults. The estimation task then falls to an analyst (expert), who must determine labels for each software module. Cluster analysis, a data-analysis tool, naturally addresses these two problems.
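As a rough illustration of how a clustering- and expert-based scheme can operate without fault data, the sketch below clusters modules with k-means (a stand-in here; the study also considered Neural-Gas) and simulates the expert with a placeholder function that labels whole clusters. The thresholding rule inside that placeholder is purely hypothetical; in practice a human expert inspects each cluster's metric profile.

```python
# Sketch: clustering- and expert-based labeling when fault data are unavailable.
# k-means groups the modules; the "expert" labels each cluster as a whole.
import numpy as np
from sklearn.cluster import KMeans

def expert_labels_for_clusters(centroids):
    """Placeholder for the human expert: here we simply call a cluster
    fault-prone when its mean metric values are comparatively high."""
    overall = centroids.mean()
    return {k: int(c.mean() > overall) for k, c in enumerate(centroids)}

def label_modules(X, n_clusters=10, seed=0):
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
    cluster_label = expert_labels_for_clusters(km.cluster_centers_)
    return np.array([cluster_label[c] for c in km.labels_])   # per-module labels

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    X = rng.normal(size=(500, 8))   # hypothetical software metrics, no fault labels
    y_hat = label_modules(X)
    print("fault-prone modules:", int(y_hat.sum()), "of", len(y_hat))
```

The key design choice is that labels are assigned at the cluster level, so the expert inspects a handful of clusters rather than hundreds of individual modules.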

