Implementation of Clever Crawler

Main Article Content

Anuja Lawankar, Nikhil Mangrulkar

Abstract

Duplicate documents on the World Wide Web degrade crawling, indexing, and relevance: search engines return large amounts of redundant data, which wastes storage and resources, lowers ranking quality, and is inconvenient for users. To address this limitation, normalization rules are used to transform all duplicate URLs into the same canonical form, and the result is further refined with the Jaccard similarity function, which compares the textual content of the URLs and applies a defined threshold value to reduce duplication.
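The two steps the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's exact rule set: the canonicalization rules shown (lowercasing, stripping default ports and fragments, sorting query parameters) and the threshold value of 0.8 are assumptions chosen for the example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def canonicalize(url):
    """Transform a URL into a canonical form so duplicate URLs compare equal.
    Example rules only; the paper's normalization rules may differ."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    # Drop default ports for http/https.
    if scheme == "http" and netloc.endswith(":80"):
        netloc = netloc[:-3]
    elif scheme == "https" and netloc.endswith(":443"):
        netloc = netloc[:-4]
    path = parts.path or "/"
    # Sort query parameters so parameter order does not create "new" URLs.
    query = urlencode(sorted(parse_qsl(parts.query)))
    # Fragments never reach the server, so they are discarded.
    return urlunsplit((scheme, netloc, path, query, ""))

def jaccard_similarity(text_a, text_b):
    """Jaccard similarity of two documents' word sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(text_a.lower().split()), set(text_b.lower().split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

THRESHOLD = 0.8  # hypothetical threshold; the paper defines its own value

def is_near_duplicate(text_a, text_b, threshold=THRESHOLD):
    """Flag a page pair as duplicate content when similarity meets the threshold."""
    return jaccard_similarity(text_a, text_b) >= threshold
```

A crawler would first canonicalize each discovered URL and skip exact matches, then apply the content-level Jaccard check to catch pages that differ in URL but carry near-identical text.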

Article Details

How to Cite
Lawankar, A., & Mangrulkar, N. (2016). Implementation of Clever Crawler. International Journal on Recent and Innovation Trends in Computing and Communication, 4(4), 337–339. https://doi.org/10.17762/ijritcc.v4i4.2014
Section
Articles