Learning Apache Mahout


Product type: Book
Published: Mar 2015
ISBN-13: 9781783555215
Pages: 250
Edition: 1st Edition

A Mahout Java example


We will now see how to use the clustering algorithms discussed earlier from Java code. Open the MahoutClusteringExample.java file from the chapter7.src package.

k-means

Define the distance measure to be used by the k-means clustering algorithm:

DistanceMeasure measure = new EuclideanDistanceMeasure();
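To make concrete what a Euclidean distance measure computes, here is a minimal standalone sketch in plain Java. It does not use Mahout; the class name EuclideanDistanceSketch and the method distance are hypothetical names chosen for illustration, and the code only shows the underlying formula, not the library's implementation.

```java
class EuclideanDistanceSketch {

    // Straight-line distance between two points of the same dimension:
    // the square root of the sum of squared coordinate differences.
    static double distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double sumOfSquares = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sumOfSquares += diff * diff;
        }
        return Math.sqrt(sumOfSquares);
    }

    public static void main(String[] args) {
        double[] p = {0.0, 0.0};
        double[] q = {3.0, 4.0};
        System.out.println(distance(p, q)); // prints 5.0
    }
}
```

k-means assigns each point to the cluster whose center is nearest under the chosen measure, so swapping the measure changes which points end up grouped together.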

Create a Path variable that points to the input sequence-file directory created in the preprocessing step:

Path inputSeq = new Path("clustering_seq");

The next step is to generate the random initial cluster seeds. We first create the output directory path where the initial cluster points will be saved. The two-argument Path constructor resolves its second argument as a child of the first, so the path below refers to a random-seeds folder inside the input directory. You could also use a separate directory for the initial clusters:

Path clusters = new Path(inputSeq, "random-seeds");
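The parent/child relationship can be illustrated with an analogy from the standard library. The sketch below uses java.nio.file.Path, not the Hadoop Path class used in the chapter, and the class name PathResolveSketch and the method childOf are hypothetical. It only shows that resolving a child name against a parent yields a nested path; building the path object does not touch the file system.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathResolveSketch {

    // Returns a path that names `child` inside `parent`.
    // Nothing is created on disk; this is pure path arithmetic.
    static Path childOf(Path parent, String child) {
        return parent.resolve(child);
    }

    public static void main(String[] args) {
        Path inputSeq = Paths.get("clustering_seq");
        Path clusters = childOf(inputSeq, "random-seeds");
        System.out.println(clusters.getParent());   // prints clustering_seq
        System.out.println(clusters.getFileName()); // prints random-seeds
    }
}
```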

The RandomSeedGenerator class provides the buildRandom() method for this. It takes as input the Configuration object, the input directory with the sequence...
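The idea behind random seeding can be sketched without Mahout: pick k of the input points at random and use them as the initial cluster centers. The code below is an illustration under that assumption only. The class RandomSeedSketch, the method chooseSeeds, and its randomSeed parameter are hypothetical and do not reflect the RandomSeedGenerator API or its internals.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class RandomSeedSketch {

    // Picks k distinct input points to serve as initial cluster centers.
    // The input list is copied first, so the caller's list is left unchanged.
    static List<double[]> chooseSeeds(List<double[]> points, int k, long randomSeed) {
        if (k < 1 || k > points.size()) {
            throw new IllegalArgumentException("k must be between 1 and the number of points");
        }
        List<double[]> shuffled = new ArrayList<>(points);
        Collections.shuffle(shuffled, new Random(randomSeed));
        return new ArrayList<>(shuffled.subList(0, k));
    }

    public static void main(String[] args) {
        List<double[]> points = new ArrayList<>();
        points.add(new double[]{1.0, 1.0});
        points.add(new double[]{1.5, 2.0});
        points.add(new double[]{8.0, 8.0});
        points.add(new double[]{9.0, 9.5});

        // Choose two initial centers; a fixed randomSeed makes the run repeatable.
        for (double[] seed : chooseSeeds(points, 2, 42L)) {
            System.out.println(seed[0] + ", " + seed[1]);
        }
    }
}
```

Because k-means converges to a local optimum, different initial seeds can lead to different final clusters, which is why a fixed random seed is convenient when comparing runs.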
