K-Means Cluster Analysis

Chapter 3 PPDM Cl Class

Introduction to Data Mining

What is Cluster Analysis?
Finding groups of objects such that the objects in a group will be similar (or related) to one another and different from (or unrelated to) the objects in other groups
Intra-cluster distances are minimized; inter-cluster distances are maximized
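The two quantities named above can be made concrete with a small sketch. The data and the helper names (mean_intra_distance, mean_inter_distance) are hypothetical and chosen only for illustration; averaging pairwise Euclidean distances is one of several reasonable ways to measure them.

```python
from itertools import combinations
from math import dist

# Toy 2-D data: two hypothetical groups, chosen only for illustration.
clusters = {
    "a": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)],
    "b": [(5.0, 5.0), (5.0, 6.0), (6.0, 5.0)],
}

def mean_intra_distance(clusters):
    """Average distance between pairs of points in the same cluster."""
    pairs = [
        dist(p, q)
        for pts in clusters.values()
        for p, q in combinations(pts, 2)
    ]
    return sum(pairs) / len(pairs)

def mean_inter_distance(clusters):
    """Average distance between pairs of points in different clusters."""
    pairs = [
        dist(p, q)
        for pts_a, pts_b in combinations(clusters.values(), 2)
        for p in pts_a
        for q in pts_b
    ]
    return sum(pairs) / len(pairs)

intra = mean_intra_distance(clusters)
inter = mean_inter_distance(clusters)
print(f"mean intra-cluster distance: {intra:.3f}")
print(f"mean inter-cluster distance: {inter:.3f}")
```

For a good grouping, the first number should be small relative to the second.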

Applications of Cluster Analysis
Understanding
– Group related documents for browsing, group genes and proteins that have similar functionality, or group stocks with similar price fluctuations
clusters – Can represent multiple classes or ‘border’ points

Fuzzy versus non-fuzzy
– In fuzzy clustering, a point belongs to every cluster with some weight between 0 and 1
– Weights must sum to 1
– Probabilistic clustering has similar characteristics
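As an illustration of weights that lie between 0 and 1 and sum to 1, the sketch below uses the membership formula from fuzzy c-means. That choice is an assumption made here for the example, since the slides do not name a specific fuzzy algorithm; the function name, the centers, and the fuzzifier m = 2 are likewise hypothetical.

```python
from math import dist

def fuzzy_memberships(point, centers, m=2.0):
    """Return one weight per center; weights lie in [0, 1] and sum to 1.

    Uses the fuzzy c-means membership formula with fuzzifier m > 1
    (an assumed choice for this sketch).
    """
    d = [dist(point, c) for c in centers]
    # A point sitting exactly on a center belongs fully to that center.
    zeros = [i for i, di in enumerate(d) if di == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** exponent for dk in d) for di in d]

centers = [(0.0, 0.0), (4.0, 0.0)]  # hypothetical cluster centers
weights = fuzzy_memberships((1.0, 0.0), centers)
print(weights)       # roughly 0.9 for the near center, 0.1 for the far one
print(sum(weights))  # approximately 1
```

A non-fuzzy (hard) clustering corresponds to the special case where one weight is 1 and all others are 0.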

Partial versus complete
– In some cases, we only want to cluster some of the data

Heterogeneous versus homogeneous
– Clusters of widely different sizes, shapes, and densities

Types of Clusters
– Well-separated clusters
– Center-based clusters
– Contiguous clusters
– Density-based clusters
– Property or Conceptual
– Described by an Objective Function

Types of Clusters: Well-Separated
Well-Separated Clusters:
– A cluster is a set of points such that any point in a cluster is closer (or more similar) to every other point in the cluster than to any point not in the cluster.

3 well-separated clusters
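The well-separated definition can be checked directly on a labeled data set. The following is a minimal sketch, assuming Euclidean distance; is_well_separated and the toy points are hypothetical names and data introduced for the example.

```python
from math import dist

def is_well_separated(points, labels):
    """True if every point is closer to all points of its own cluster
    than to any point of another cluster."""
    for i, p in enumerate(points):
        same = [dist(p, q) for j, q in enumerate(points)
                if j != i and labels[j] == labels[i]]
        other = [dist(p, q) for j, q in enumerate(points)
                 if labels[j] != labels[i]]
        # Violated when the farthest same-cluster point is not strictly
        # closer than the nearest point of any other cluster.
        if same and other and max(same) >= min(other):
            return False
    return True

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good_labels = [0, 0, 0, 1, 1, 1]
bad_labels = [0, 0, 1, 1, 1, 1]   # (1, 0) is put in the wrong group

print(is_well_separated(points, good_labels))  # True
print(is_well_separated(points, bad_labels))   # False
```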

Types of Clusters: Center-Based
Center-based
– A cluster is a set of objects such that an object in a cluster is closer (more similar) to the “center” of its cluster than to the center of any other cluster
– The center of a cluster is often a centroid, the average of all the points in the cluster, or a medoid, the most “representative” point of a cluster

4 center-based clusters
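The difference between a centroid and a medoid shows up clearly when a cluster contains an outlier. The sketch below assumes Euclidean distance and takes "most representative" to mean the point with the smallest total distance to the others, which is a common reading but an assumption here; the function names and data are hypothetical.

```python
from math import dist

def centroid(points):
    """Coordinate-wise average; need not be an actual data point."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def medoid(points):
    """The data point with the smallest total distance to all others."""
    return min(points, key=lambda p: sum(dist(p, q) for q in points))

def assign(point, centers):
    """Index of the center nearest to the given point."""
    return min(range(len(centers)), key=lambda i: dist(point, centers[i]))

cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (10.0, 10.0)]
print(centroid(cluster))  # (2.4, 2.4), pulled toward the outlier
print(medoid(cluster))    # (1.0, 1.0), always one of the data points

centers = [(0.5, 0.5), (10.0, 10.0)]
print(assign((2.0, 2.0), centers))  # 0
```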

Types of Clusters: Contiguity-Based
Contiguous Cluster (Nearest neighbor or Transitive)
– A cluster is a set of points such that a point in a cluster is closer (or more similar) to one or more other points in the cluster than to any point not in the cluster.

8 contiguous clusters
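One way to sketch the transitive, nearest-neighbor idea is to link any two points that lie within some distance of each other and take the connected components as clusters. The threshold eps and the function name contiguous_clusters are assumptions made for this example; the slides do not fix a particular rule.

```python
from math import dist

def contiguous_clusters(points, eps):
    """Label points by connected component of the graph that links
    any two points at distance <= eps."""
    labels = [None] * len(points)
    current = 0
    for start in range(len(points)):
        if labels[start] is not None:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j, q in enumerate(points):
                if labels[j] is None and dist(points[i], q) <= eps:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# A chain of points: the two ends are far apart but linked step by step.
chain = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (11, 0)]
print(contiguous_clusters(chain, eps=1.5))  # [0, 0, 0, 0, 1, 1]
```

Note that (0, 0) and (3, 0) end up together even though they are farther apart than eps, because each is connected to the other through intermediate neighbors.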

Types of Clusters: Density-Based
Density-based
– A cluster is a dense region of points, separated from other high-density regions by low-density regions.
– Used when the clusters are irregular or intertwined, and when noise and outliers are present.

6 density-based clusters
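A simplified sketch in the style of DBSCAN shows how dense regions can be grown into clusters while isolated points are left as noise. DBSCAN is brought in here only as an illustration and is not named by the slides at this point; the parameters eps and min_pts, the NOISE marker, and the toy data are assumptions.

```python
from math import dist

NOISE = -1

def density_clusters(points, eps, min_pts):
    """Simplified DBSCAN-style labeling: clusters grow from core points,
    i.e. points with at least min_pts neighbors within eps (itself included)."""
    neighbors = [
        [j for j, q in enumerate(points) if dist(p, q) <= eps]
        for p in points
    ]
    is_core = [len(n) >= min_pts for n in neighbors]
    labels = [NOISE] * len(points)
    current = 0
    for start in range(len(points)):
        if not is_core[start] or labels[start] != NOISE:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in neighbors[i]:
                if labels[j] == NOISE:
                    labels[j] = current
                    if is_core[j]:  # only core points keep expanding
                        stack.append(j)
        current += 1
    return labels

blob_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
blob_b = [(8, 8), (8, 9), (9, 8), (9, 9)]
outlier = [(4, 4)]
result = density_clusters(blob_a + blob_b + outlier, eps=1.5, min_pts=3)
print(result)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

The lone point at (4, 4) has no dense neighborhood, so it stays labeled as noise instead of being forced into a cluster.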

Types of Clusters: Conceptual Clusters
Shared Property or Conceptual Clusters
– Finds clusters that share some common property or represent a particular concept.

2 Overlapping Circles

Types of Clusters: Objective Function
Clusters Defined by an Objective Function
– Finds clusters that minimize or maximize an objective function.
– Enumerate all possible ways of dividing the points into clusters and evaluate the 'goodness' of each potential set of clusters by using the given objective function (NP-hard).
– Can have global or local objectives.
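The enumerate-and-evaluate idea can be sketched for a tiny data set. The objective used below, the sum of squared errors (SSE) to each cluster centroid, is an assumed example of an objective function; the names sse and best_partition are hypothetical. With n points and k clusters the loop visits k**n labelings, which is why this approach is only feasible for very small inputs.

```python
from itertools import product

def sse(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        center = [sum(x) / len(members) for x in zip(*members)]
        total += sum(
            sum((a - b) ** 2 for a, b in zip(p, center)) for p in members
        )
    return total

def best_partition(points, k):
    """Try every labeling with k non-empty clusters and keep the lowest SSE."""
    best = None
    for labels in product(range(k), repeat=len(points)):
        if len(set(labels)) < k:
            continue  # skip labelings that leave a cluster empty
        score = sse(points, labels, k)
        if best is None or score < best[0]:
            best = (score, labels)
    return best

points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
score, labels = best_partition(points, k=2)
print(score, labels)  # 1.0 (0, 0, 1, 1)
```

The search also visits relabelings of the same partition, such as (1, 1, 0, 0); a more careful enumeration would skip those, but the result is the same.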
Hierarchical clustering algorithms typically have local

