A survey of clustering algorithms for big data: Taxonomy and empirical analysis

  • Authors: Vv.Aa.
  • IEEE Transactions on Emerging Topics in Computing
  • 2014
  • DOI: 10.1109/TETC.2014.2330519

Abstract

Clustering algorithms have emerged as a powerful alternative meta-learning tool for accurately analyzing the massive volume of data generated by modern applications. In particular, their main goal is to categorize data into clusters such that objects are grouped in the same cluster when they are similar according to specific metrics. There is a vast body of knowledge in the area of clustering, and there have been attempts to analyze and categorize these algorithms for a larger number of applications. However, one of the major issues in using clustering algorithms for big data, and one that causes confusion amongst practitioners, is the lack of consensus in the definition of their properties as well as the lack of a formal categorization. With the intention of alleviating these problems, this paper introduces concepts and algorithms related to clustering, presents a concise survey of existing clustering algorithms, and provides a comparison from both a theoretical and an empirical perspective. From a theoretical perspective, we developed a categorizing framework based on the main properties pointed out in previous studies. Empirically, we conducted extensive experiments in which we compared the most representative algorithm from each category using a large number of real (big) data sets. The effectiveness of the candidate clustering algorithms is measured through a number of internal and external validity metrics, as well as stability, runtime, and scalability tests. In addition, we highlighted the set of clustering algorithms that perform best for big data.
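
To make the abstract's vocabulary concrete, the following is a minimal sketch of one clustering run scored with one internal and one external validity metric. Everything in it is an illustrative assumption rather than the paper's method: the choice of plain k-means, the toy data, the function names (kmeans, wcss, rand_index), and the two metrics (within-cluster sum of squares and the Rand index) are not taken from the survey.

```python
import math
import random
from itertools import combinations


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed example algorithm); returns one label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct points as initial centers
    labels = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest center (Euclidean distance).
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def wcss(points, labels):
    """Internal metric: within-cluster sum of squared distances (lower is tighter)."""
    total = 0.0
    for c in set(labels):
        members = [p for p, l in zip(points, labels) if l == c]
        centroid = tuple(sum(x) / len(members) for x in zip(*members))
        total += sum(math.dist(p, centroid) ** 2 for p in members)
    return total


def rand_index(a, b):
    """External metric: fraction of point pairs on which two labelings agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


# Toy data: two well-separated groups with known ("ground truth") labels.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
truth = [0, 0, 0, 1, 1, 1]

labels = kmeans(points, k=2)
print("WCSS:", wcss(points, labels))
print("Rand index:", rand_index(labels, truth))
```

An internal metric such as WCSS needs only the data and the cluster assignment, while an external metric such as the Rand index also needs reference labels. That distinction is why an empirical comparison of the kind the abstract describes would report both families.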
