Review of algorithms for the detection of outliers

Authors

  • Cristina Mariuxi Flores Urgiles Universidad Católica de Cuenca, Ecuador
  • Martin Sebastián Ortiz Amoroso Universidad Católica de Cuenca, Ecuador

DOI:

https://doi.org/10.26871/killkana_tecnica.v2i1.287

Abstract

Outlier detection is an important task in a wide variety of application domains. Outliers are often removed to improve the accuracy of the data, but an outlier sometimes has a meaning or explanation that is lost when it is removed, and identifying it can lead to the discovery of unexpected knowledge. This is the case in areas such as criminal activity in electronic commerce, fraud detection, and even statistical performance analysis. This article is the result of a non-exhaustive literature review of the work of several authors who sought to determine the efficiency of methods and algorithms for outlier detection. First, a theoretical and conceptual study was carried out to understand the nature of an outlier and how outliers are classified; then the different techniques based on clustering, distance, and density were analyzed. For each of these outlier detection techniques, the review identifies the algorithms proposed by different authors and the efficiency each achieves in certain contexts.

Published

2018-06-22

How to Cite

Flores Urgiles CM, Ortiz Amoroso MS. Review of algorithms for the detection of outliers. tecnica [Internet]. 2018 Jun. 22 [cited 2024 Jul. 3];2(1):19-26. Available from: https://killkana.ucacue.edu.ec/index.php/killkana_tecnico/article/view/287

Issue

Section

Original research articles