In this project we implement the fuzzy c-means clustering algorithm in Java.
First we take a brief look at the steps of the algorithm, then go into the details of the methods we used.
For more details, see: https://home.deib.polimi.it/matteucc/Clustering/tutorial_html/cmeans.html
- initial membership
- calculate centroid values
- update membership values
- check convergence
Steps 2 to 4 are repeated in a loop; the algorithm stops after a specified number of iterations or once the error condition is satisfied.
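For reference, the standard fuzzy c-means formulas behind steps 2 to 4 can be written as below. The symbols are chosen here for illustration and are not taken from the source code: x_i is data point i, c_j the centroid of cluster j, u_ij the membership of x_i in cluster j, N the dataset size, C the number of clusters, m > 1 the fuzziness and epsilon the error threshold. The exact norm used in the stopping test is an assumption.

```latex
\begin{align*}
c_j &= \frac{\sum_{i=1}^{N} u_{ij}^{m}\, x_i}{\sum_{i=1}^{N} u_{ij}^{m}}
  && \text{(step 2: centroid values)} \\
u_{ij} &= \frac{1}{\sum_{k=1}^{C}
  \left( \dfrac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{m-1}}}
  && \text{(step 3: membership update)} \\
\lVert U^{(t+1)} - U^{(t)} \rVert &< \varepsilon
  && \text{(step 4: convergence check)}
\end{align*}
```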
The source code contains a class named "FuzzyClustering" with several fields and methods, which are briefly described below.
The fields of the FuzzyClustering.java class are:
- U matrix
- - matrix of membership values with n * c dimensions (n = dataset size, c = number of clusters)
- iteration
- - number of iterations the algorithm performs
- fuzziness
- - value of the parameter m in the c-means formula
- epsilon
- - error threshold on the difference between the current membership values and those of the previous step
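As a rough illustration, the fields above might be declared as in the sketch below. The class name FuzzyClusteringFields, the field types and the constructor are assumptions made for this example, not the actual contents of FuzzyClustering.java.

```java
// Hypothetical field layout; names and types are assumptions, not the project's code.
class FuzzyClusteringFields {
    double[][] u;      // U matrix: one row per data point, one column per cluster
    int iteration;     // number of iterations the algorithm performs
    double fuzziness;  // parameter m of the c-means formula, must be greater than 1
    double epsilon;    // error threshold used by the convergence check

    FuzzyClusteringFields(int dataSize, int clusters, int iteration,
                          double fuzziness, double epsilon) {
        if (fuzziness <= 1.0) {
            // the exponent 2 / (m - 1) is undefined for m = 1
            throw new IllegalArgumentException("fuzziness must be greater than 1");
        }
        this.u = new double[dataSize][clusters];
        this.iteration = iteration;
        this.fuzziness = fuzziness;
        this.epsilon = epsilon;
    }
}
```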
Next we describe the arguments and functionality of the methods:
- createRandomData
- - takes the dataset size, the min and max of the range, and the number of clusters, and generates random numbers with a Gaussian distribution
- assignInitialMembership
- - initializes the membership values of the data
- calculateClusterCenters
- - calculates the centroid values
- updateMembershipValues
- - updates the membership values based on the current centroid values
- checkConvergence
- - calculates the 2-norm of the difference between the current U matrix and the previous U matrix
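The methods above could fit together roughly as in the following sketch. It reuses the method names from the list, but the class name FcmSketch, the run method, all signatures and the double[][] data layout are assumptions made for illustration, and the 2-norm is interpreted here as the Frobenius norm of the difference between the two matrices.

```java
import java.util.Random;

// Illustrative sketch only; signatures are assumptions, not the project's actual code.
class FcmSketch {

    // Random initial memberships; each row is normalised so that it sums to 1.
    static double[][] assignInitialMembership(int n, int c, Random rnd) {
        double[][] u = new double[n][c];
        for (int i = 0; i < n; i++) {
            double sum = 0.0;
            for (int j = 0; j < c; j++) {
                u[i][j] = rnd.nextDouble() + 1e-9; // small offset avoids an all-zero row
                sum += u[i][j];
            }
            for (int j = 0; j < c; j++) u[i][j] /= sum;
        }
        return u;
    }

    // Centroid j is the mean of all points weighted by u[i][j]^m.
    static double[][] calculateClusterCenters(double[][] x, double[][] u, double m) {
        int n = x.length, d = x[0].length, c = u[0].length;
        double[][] centers = new double[c][d];
        for (int j = 0; j < c; j++) {
            double weightSum = 0.0;
            for (int i = 0; i < n; i++) {
                double w = Math.pow(u[i][j], m);
                weightSum += w;
                for (int k = 0; k < d; k++) centers[j][k] += w * x[i][k];
            }
            for (int k = 0; k < d; k++) centers[j][k] /= weightSum;
        }
        return centers;
    }

    // Euclidean distance between two points of equal dimension.
    static double distance(double[] a, double[] b) {
        double s = 0.0;
        for (int k = 0; k < a.length; k++) s += (a[k] - b[k]) * (a[k] - b[k]);
        return Math.sqrt(s);
    }

    // Recomputes the memberships from the current centroids.
    static double[][] updateMembershipValues(double[][] x, double[][] centers, double m) {
        int n = x.length, c = centers.length;
        double exponent = 2.0 / (m - 1.0);
        double[][] u = new double[n][c];
        for (int i = 0; i < n; i++) {
            double[] dist = new double[c];
            int hit = -1;
            for (int j = 0; j < c; j++) {
                dist[j] = distance(x[i], centers[j]);
                if (dist[j] == 0.0 && hit < 0) hit = j;
            }
            if (hit >= 0) { u[i][hit] = 1.0; continue; } // point lies exactly on a centroid
            for (int j = 0; j < c; j++) {
                double sum = 0.0;
                for (int k = 0; k < c; k++) sum += Math.pow(dist[j] / dist[k], exponent);
                u[i][j] = 1.0 / sum;
            }
        }
        return u;
    }

    // Frobenius norm of (current - previous).
    static double checkConvergence(double[][] current, double[][] previous) {
        double sum = 0.0;
        for (int i = 0; i < current.length; i++)
            for (int j = 0; j < current[i].length; j++) {
                double diff = current[i][j] - previous[i][j];
                sum += diff * diff;
            }
        return Math.sqrt(sum);
    }

    // Repeats steps 2 to 4 until the change drops below epsilon or maxIterations is reached.
    static double[][] run(double[][] x, int c, double m, double epsilon,
                          int maxIterations, long seed) {
        double[][] u = assignInitialMembership(x.length, c, new Random(seed));
        double[][] centers = calculateClusterCenters(x, u, m);
        for (int it = 0; it < maxIterations; it++) {
            centers = calculateClusterCenters(x, u, m);
            double[][] next = updateMembershipValues(x, centers, m);
            double change = checkConvergence(next, u);
            u = next;
            if (change < epsilon) break;
        }
        return centers;
    }
}
```

For example, FcmSketch.run(data, 2, 2.0, 1e-6, 100, 42L) would return one centroid row per cluster for a data array with one point per row.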
After the algorithm runs, two files are generated: "data_set.csv", which contains the random data, and "cluster_center.csv", which contains the calculated centroids.
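A minimal sketch of how the random data could be generated and written to a CSV file is shown below. The class name CsvOutputSketch, the 2-D point layout, the way Gaussian values are spread around randomly placed centres, and the one-row-per-line comma-separated format are all assumptions for this example; the real createRandomData and the real file layout may differ.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

// Illustrative sketch only; the generation scheme and CSV layout are assumptions.
class CsvOutputSketch {

    // Generates 2-D points: each cluster gets a random centre inside [min, max],
    // and points are drawn from a Gaussian around it, clamped to the range.
    static double[][] createRandomData(int size, double min, double max,
                                       int clusters, long seed) {
        Random rnd = new Random(seed);
        double spread = (max - min) / (10.0 * clusters); // assumed standard deviation
        double[][] centres = new double[clusters][2];
        for (double[] centre : centres)
            for (int k = 0; k < 2; k++) centre[k] = min + rnd.nextDouble() * (max - min);
        double[][] data = new double[size][2];
        for (int i = 0; i < size; i++) {
            double[] centre = centres[i % clusters];
            for (int k = 0; k < 2; k++) {
                double v = centre[k] + rnd.nextGaussian() * spread;
                data[i][k] = Math.max(min, Math.min(max, v)); // keep the value in range
            }
        }
        return data;
    }

    // Turns a matrix into CSV text, one row per line.
    static String toCsv(double[][] rows) {
        StringBuilder sb = new StringBuilder();
        for (double[] row : rows) {
            for (int k = 0; k < row.length; k++) {
                if (k > 0) sb.append(',');
                sb.append(row[k]);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    // Writes the CSV text to the given file, replacing any existing content.
    static void writeCsv(Path file, double[][] rows) {
        try {
            Files.write(file, toCsv(rows).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With these helpers, CsvOutputSketch.writeCsv(java.nio.file.Paths.get("data_set.csv"), data) would write the generated points, and the same call with the centroid matrix and "cluster_center.csv" would write the centroids.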