Document Clustering with Map Reduce using Hadoop Framework

M. Satish, M. Ramakrishna Murty

Abstract

Big data refers to data sets so large and complex that they are difficult to process and analyse with conventional database management tools or traditional data processing applications. Big data poses many challenges, chief among them storing data and retrieving it through search engines. Document data is also growing rapidly in the era of the internet, and analysing it is important for many applications. Document clustering is one of the important techniques for analysing document data. Its applications include organizing large document collections, finding similar documents, recommendation systems, duplicate content detection, and search optimization. This work is motivated by the recognition of the need for efficient retrieval of data from massive data repositories through search engines. It focuses on clustering collections of documents efficiently using MapReduce.
DOI: 10.17762/ijritcc2321-8169.150181
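
The abstract does not say which clustering algorithm or document representation the authors use. Purely as an illustration of how document clustering can be phrased as map and reduce steps, the sketch below runs one k-means-style iteration in plain Python. The k-means choice, the term-frequency vectors, the cosine similarity measure, the function names, and the sample documents are all assumptions made for this example and are not taken from the paper. No Hadoop API is used.

```python
import math
from collections import Counter, defaultdict


def tf_vector(text):
    """Turn a document into a sparse term-frequency vector (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def map_phase(docs, centroids):
    """Map: emit (index of most similar centroid, document vector) per document."""
    for text in docs:
        vec = tf_vector(text)
        best = max(range(len(centroids)), key=lambda i: cosine(vec, centroids[i]))
        yield best, vec


def reduce_phase(pairs, old_centroids):
    """Reduce: average the vectors that share a centroid index."""
    groups = defaultdict(list)
    for key, vec in pairs:  # grouping by key stands in for shuffle/sort
        groups[key].append(vec)
    new_centroids = list(old_centroids)  # an empty cluster keeps its old centroid
    for key, vecs in groups.items():
        total = Counter()
        for v in vecs:
            total.update(v)  # adds term counts element-wise
        new_centroids[key] = {t: w / len(vecs) for t, w in total.items()}
    sizes = {k: len(v) for k, v in groups.items()}
    return new_centroids, sizes


# Hypothetical toy corpus; the first and third documents seed the two centroids.
docs = [
    "hadoop mapreduce cluster job",
    "mapreduce job hadoop node",
    "recipe sugar flour cake",
    "cake flour butter recipe",
]
centroids = [tf_vector(docs[0]), tf_vector(docs[2])]
centroids, sizes = reduce_phase(map_phase(docs, centroids), centroids)
```

In a real Hadoop job the map and reduce functions would run on separate nodes, the framework would do the grouping by key, and the iteration would be repeated until the centroids stop changing. This sketch only shows the data flow of a single pass.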

Article Details

How to Cite
Satish, M., & Ramakrishna Murty, M. (2015). Document Clustering with Map Reduce using Hadoop Framework. International Journal on Recent and Innovation Trends in Computing and Communication, 3(1), 409–413. https://doi.org/10.17762/ijritcc.v3i1.3829