A DOM-Tree based Representation of Web Document Structure for Web Mining Applications

Manoj Kumar Sarma, Anjana Kakoti Mahanta

Abstract

Among the three broad areas of Web mining, Web Structure Mining is the discovery of structure information from either the web hyperlink structure or the structure of individual web pages. Applying data mining techniques to web pages requires an efficient representation that captures their actual hierarchical structure. The work presented here seeks a representation of web documents that can serve as input to different data mining techniques. It further aims to apply this representation to the clustering of web documents, where clustering is based not only on the content of a web page but also on its structural layout.
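
The abstract does not specify the representation itself, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes that a page's hierarchy can be summarized by its root-to-node tag paths, and that two pages can be compared by the Jaccard similarity of those path sets. The names `TagPathCollector`, `tag_paths`, and `structural_similarity` are hypothetical; only Python's standard `html.parser` module is used.

```python
from collections import Counter
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagPathCollector(HTMLParser):
    """Counts root-to-node tag paths seen while parsing an HTML page."""

    def __init__(self):
        super().__init__()
        self.stack = []          # tags currently open, root first
        self.paths = Counter()   # e.g. "html/body/p" -> number of occurrences

    def handle_starttag(self, tag, attrs):
        self.paths["/".join(self.stack + [tag])] += 1
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def tag_paths(html):
    """Return a Counter of tag paths for one HTML document."""
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths


def structural_similarity(a, b):
    """Jaccard similarity over the sets of tag paths in two pages (assumed measure)."""
    pa, pb = set(tag_paths(a)), set(tag_paths(b))
    if not pa and not pb:
        return 1.0
    return len(pa & pb) / len(pa | pb)


page_a = "<html><body><h1>T</h1><p>x</p><p>y</p></body></html>"
page_b = "<html><body><p>z</p><img src='a.png'></body></html>"
print(tag_paths(page_a))
print(structural_similarity(page_a, page_b))
```

A similarity of this kind could be combined with a content-based measure before clustering, which is the combination the abstract describes; how the two are weighted is left open here.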

Article Details

How to Cite
Sarma, M. K., & Mahanta, A. K. (2017). A DOM-Tree based Representation of Web Document Structure for Web Mining Applications. International Journal on Recent and Innovation Trends in Computing and Communication, 5(6), 1437 –. https://doi.org/10.17762/ijritcc.v5i6.971