Implementation of Clever Crawler
Abstract
Nowadays, duplicate documents on the World Wide Web hamper crawling, indexing, and relevance ranking: search engines return large amounts of redundant data, which wastes storage and resources, degrades ranking quality, and is inconvenient for users. To mitigate this limitation, normalization rules are used to transform all duplicate URLs into the same canonical form, and the result is further refined with a Jaccard similarity function that measures overlap between the textual content of the URLs; the function applies a threshold value to decide when two pages are duplicates.
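The two stages the abstract describes can be sketched as follows. The paper does not specify its exact normalization rules or threshold, so the rules below (lowercasing, dropping default ports and fragments, sorting query parameters) and the threshold value are illustrative assumptions, not the authors' implementation:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def normalize_url(url):
    """Transform a URL into a canonical form.
    Assumed rule set (the paper does not list its rules): lowercase
    scheme and host, drop default ports, drop the fragment, sort
    query parameters, and strip a trailing slash."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    netloc = parts.netloc.lower()
    if (scheme, parts.port) in (("http", 80), ("https", 443)):
        netloc = parts.hostname.lower()  # drop the default port
    path = parts.path.rstrip("/") or "/"
    query = urlencode(sorted(parse_qsl(parts.query)))
    return urlunsplit((scheme, netloc, path, query, ""))  # "" drops fragment

def jaccard(text_a, text_b):
    """Jaccard similarity of two documents' word sets: |A∩B| / |A∪B|."""
    a, b = set(text_a.split()), set(text_b.split())
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

THRESHOLD = 0.7  # assumed value; the paper only says "some threshold value"

doc1 = "clever crawler removes duplicate urls from the web"
doc2 = "clever crawler removes duplicate urls on the web"
print(normalize_url("HTTP://Example.COM:80/a/?b=2&a=1#frag"))
# → http://example.com/a?a=1&b=2
print(jaccard(doc1, doc2) >= THRESHOLD)  # near-duplicate pair → True
```

Two URLs that normalize to the same string are exact duplicates; pages whose content similarity meets the threshold are treated as near-duplicates and only one copy is kept.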
Article Details
How to Cite
, A. L. N. M. (2016). Implementation of Clever Crawler. International Journal on Recent and Innovation Trends in Computing and Communication, 4(4), 337–339. https://doi.org/10.17762/ijritcc.v4i4.2014
Section
Articles