About Cluster Analysis

Hello!

Today, we’re going to learn about cluster analysis.

Cluster analysis is a type of unsupervised learning: a technique for grouping data points with similar characteristics, without relying on predefined labels.

It is used to discover hidden structure in the data and to make that structure easier to understand by splitting the data into meaningful subsets.

Main concepts

Cluster

A cluster is a set of data points with similar properties. Points within the same cluster are similar to each other, while points in different clusters have different characteristics.

Method of measuring similarity

In cluster analysis, the way similarity between data points is measured matters, because it determines which points end up grouped together. Common choices include Euclidean distance and cosine similarity.
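To make the idea of measuring similarity concrete, here is a minimal sketch of two widely used measures. The function names are chosen for illustration only, and the sketch assumes each data point is a sequence of numbers of equal length.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors.

    Smaller values mean the two points are more similar.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors.

    Values close to 1 mean the vectors point in nearly the same direction.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


print(euclidean_distance((0, 0), (3, 4)))  # 5.0
print(cosine_similarity((1, 0), (0, 1)))   # 0.0
```

Which measure fits best depends on the data: Euclidean distance is sensitive to magnitude, while cosine similarity only looks at direction.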

Types of cluster analysis

Hierarchical clustering

It is a method of organizing data into a hierarchy by successively merging smaller clusters (agglomerative) or splitting larger ones (divisive). The result is usually visualized as a dendrogram.
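As a rough illustration of the merging variant, the sketch below repeatedly joins the two closest clusters and records each merge, which is the information a dendrogram is drawn from. It assumes single linkage (the distance between two clusters is the distance between their closest members) and Euclidean distance; both are choices made for this example, and the function name is hypothetical.

```python
import math


def single_linkage_merges(points):
    """Agglomerative clustering with single linkage.

    Returns the merge history as a list of
    (indices_in_cluster_a, indices_in_cluster_b, distance) tuples.
    """
    # Start with every point in its own cluster (stored as index lists).
    clusters = [[i] for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(
                    math.dist(points[a], points[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((sorted(clusters[i]), sorted(clusters[j]), d))
        # Merge cluster j into cluster i and drop cluster j.
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return merges


points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
for a, b, d in single_linkage_merges(points):
    print(a, "+", b, "at distance", round(d, 2))
```

This brute-force version is only meant for small inputs, since it recomputes all pairwise distances at every step.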

Non-hierarchical clustering

This approach groups data into a predetermined number of clusters without building a hierarchy. K-means clustering is a typical example.
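A minimal sketch of k-means may help here. It alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. Using the first k points as the initial centroids is an assumption made to keep the example reproducible; practical implementations usually pick the starting centroids randomly or with a smarter scheme. The function name and the `max_iter` parameter are also just illustrative.

```python
import math


def k_means(points, k, max_iter=100):
    """Group points into k clusters; returns (labels, centroids)."""
    # Assumption for this sketch: the first k points seed the centroids.
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
    return labels, centroids


points = [(0, 0), (10, 10), (0, 1), (10, 11)]
labels, centroids = k_means(points, k=2)
print(labels)     # [0, 1, 0, 1]
print(centroids)  # [(0.0, 0.5), (10.0, 10.5)]
```

Note that the number of clusters k has to be chosen in advance, which is exactly what distinguishes this approach from hierarchical clustering.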

Applications

Customer Segmentation

It is used to divide customers into groups with similar characteristics to establish marketing strategies for each group.

Outlier detection

It is used to detect abnormal data, such as points that do not fit well into any cluster.
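One simple way to do this, shown here only as a sketch, is to flag points that lie far from every cluster centroid. The function name and the `threshold` parameter are hypothetical, and the sketch assumes the centroids have already been computed by some clustering step.

```python
import math


def flag_outliers(points, centroids, threshold):
    """Return the points whose nearest centroid is farther than threshold."""
    return [
        p for p in points
        if min(math.dist(p, c) for c in centroids) > threshold
    ]


points = [(0, 0), (0, 1), (10, 10), (50, 50)]
centroids = [(0, 0.5), (10, 10)]
print(flag_outliers(points, centroids, threshold=5))  # [(50, 50)]
```

Choosing the threshold is a judgment call that depends on the scale and spread of the data.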

Natural language processing

It is used to classify documents or words into meaningful groups.

Evaluation

Intra-cluster cohesion

This evaluates how tightly grouped a cluster is by measuring the similarity between the data points inside it. The more similar the points within a cluster, the better the cohesion.

Separation between clusters

This evaluates how distinct the clusters are by measuring the distance between different clusters. The farther apart the clusters, the better they are separated.
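Both evaluation ideas can be sketched in a few lines. Here cohesion is measured as the sum of squared distances from each point to its cluster centroid (lower means tighter), and the distance between clusters is measured between their centroids (higher means farther apart). These particular measures and the function names are assumptions for illustration; other definitions are also in common use.

```python
import math


def centroid(cluster):
    """Mean point of a non-empty cluster of equal-length numeric tuples."""
    return tuple(sum(col) / len(cluster) for col in zip(*cluster))


def within_cluster_ss(cluster):
    """Cohesion: sum of squared distances from each point to the centroid."""
    c = centroid(cluster)
    return sum(math.dist(p, c) ** 2 for p in cluster)


def between_cluster_distance(cluster_a, cluster_b):
    """Distance between the centroids of two clusters."""
    return math.dist(centroid(cluster_a), centroid(cluster_b))


cluster_a = [(0, 0), (0, 2)]
cluster_b = [(4, 0), (4, 2)]
print(within_cluster_ss(cluster_a))                    # 2.0
print(between_cluster_distance(cluster_a, cluster_b))  # 4.0
```

A good clustering generally shows small within-cluster values together with large between-cluster distances.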

In closing

Cluster analysis is a useful method for identifying patterns in data and classifying the data into meaningful groups, and it is actively used in various fields.

Thank you!