Introducing Clustering with a Focus in Marketing and Consumer Analysis

Chapter in: Business and Consumer Analytics: New Ideas

Abstract

Clustering has become an extremely popular methodology for consumer analysis, with many business applications. In particular, when a consumer market needs to be segmented, clustering methods are among the most common ways of doing so. Clustering, however, is a hugely heterogeneous field in itself, with advances and explanations coming from many different disciplines. It has been discussed in debates almost as heated as those about politics or religion, yet researchers and professionals still agree on the methodology’s usefulness in data analytics. This chapter provides the novice data scientist with a brief introduction to and review of the field, with links to previous large surveys and reviews recommended for further reading. The clustering contributions in this book focus largely on partitional clustering; hence, this type of clustering features more prominently in this chapter. Besides sparking the interest of business and marketing researchers and professionals in this ever-evolving methodological field, we aim to inspire dedicated computer scientists and data analysts to continue exploring the wide range of application domains in business and consumer analytics in which clustering and grouping are making great strides.
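
To illustrate what partitional clustering means in a segmentation setting, the following is a minimal sketch of a k-means style procedure, which alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its members. It is not taken from the chapter: the customer data, the two features (annual spend, visits per month) and the function name `kmeans` are hypothetical and chosen only for illustration.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Partition `points` (tuples of floats) into k clusters, k-means style.

    Returns (labels, centroids), where labels[i] is the cluster index of points[i].
    """
    rng = random.Random(seed)
    # Assumption: initial centroids are k data points drawn at random.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        # (squared Euclidean distance; ties go to the lower index).
        new_labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
        # Stop once the partition no longer changes.
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids


# Hypothetical customers: (annual spend, visits per month).
customers = [
    (100.0, 1.0), (120.0, 2.0), (110.0, 1.5),    # low-spend group
    (900.0, 10.0), (950.0, 12.0), (920.0, 11.0),  # high-spend group
]
labels, centroids = kmeans(customers, k=2)
for customer, label in zip(customers, labels):
    print(customer, "-> segment", label)
```

Each customer ends up in exactly one segment, which is the defining property of a partitional method. In practice the features would usually be rescaled first, since annual spend would otherwise dominate the distance computation.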

Notes

  1. http://intelligent-optimization.org/LIONbook/

Acknowledgements

We would like to thank Ademir Cristiano Gabardo, Luke Mathieson and Shannon Fenn for their technical and proofreading help with this chapter. Łukasz P. Olech is supported by “THELXINOE: Erasmus Euro-Oceanian Smart City Network”, Erasmus Mundus Action-2, Strand-2 (EMA2/S2), project funded by the European Union. Pablo Moscato acknowledges previous support from the Australian Research Council Future Fellowship FT120100060 and Australian Research Council Discovery Projects DP120102576 and DP140104183, and his academic partners in their funded project THELXINOE.

Corresponding author

Correspondence to Natalie Jane de Vries.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

de Vries, N.J., Olech, Ł.P., Moscato, P. (2019). Introducing Clustering with a Focus in Marketing and Consumer Analysis. In: Moscato, P., de Vries, N. (eds) Business and Consumer Analytics: New Ideas. Springer, Cham. https://doi.org/10.1007/978-3-030-06222-4_3

  • DOI: https://doi.org/10.1007/978-3-030-06222-4_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-06221-7

  • Online ISBN: 978-3-030-06222-4

  • eBook Packages: Computer Science, Computer Science (R0)
