Automatic Labelling and Document Clustering for Forensic Analysis

Ms. Raksha K.Mundhe, Prof. Ankush Maind

doi:10.17762/ijritcc.v2i9.3325

Published: Sep 30, 2014

Abstract

In computer forensic analysis, retrieved data is typically unstructured text, which is difficult for computer examiners to analyse. In the proposed approach, forensic analysis is carried out systematically: the retrieved unstructured data is given structure using well-known, high-quality clustering algorithms and an automatic cluster labelling method. Indexing is performed on txt, doc, and pdf files, the number of clusters is estimated automatically, and each cluster is labelled automatically. The proposed approach uses the DBSCAN and K-means algorithms, which make it easy to retrieve the most relevant information for forensic analysis; automated methods of analysis of this kind are of great interest. In particular, algorithms for clustering documents can facilitate the discovery of new and useful knowledge from the documents under analysis. Two methods are used in document clustering for forensic analysis. The first uses a χ² test of significance to detect different word usage across categories in the hierarchy; this test is well suited to testing dependencies when count data is available. The second selects words that both occur frequently in a cluster and effectively discriminate that cluster from the other clusters. Finally, we present and discuss several practical results that can be useful for researchers in forensic analysis.
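
As an illustration of the first method, the following is a minimal sketch of χ²-based cluster labelling, not the authors' implementation. It assumes a flat set of clusters rather than a hierarchy, whitespace tokenisation, and document-level word counts; the function names `chi2_2x2` and `label_clusters` and the `top_k` parameter are hypothetical. For each word and cluster it builds a 2x2 contingency table (word present or absent, document inside or outside the cluster) and keeps the words with the highest statistic among those that are over-represented in the cluster.

```python
from collections import Counter


def chi2_2x2(a, b, c, d):
    """Chi-square statistic for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # a degenerate table carries no evidence of dependence
    return n * (a * d - b * c) ** 2 / denom


def label_clusters(docs, assignments, top_k=2):
    """Return, for each cluster id, the top_k words most associated with it."""
    cluster_ids = sorted(set(assignments))
    size = Counter(assignments)  # number of documents per cluster
    # Document frequency of each word inside each cluster.
    df = {cid: Counter() for cid in cluster_ids}
    for text, cid in zip(docs, assignments):
        df[cid].update(set(text.lower().split()))
    # Document frequency of each word over the whole collection.
    total_df = Counter()
    for cid in cluster_ids:
        total_df.update(df[cid])
    n_docs = len(docs)

    labels = {}
    for cid in cluster_ids:
        scored = []
        for word, a in df[cid].items():
            b = size[cid] - a                  # in cluster, word absent
            c_out = total_df[word] - a         # outside cluster, word present
            d = (n_docs - size[cid]) - c_out   # outside cluster, word absent
            # Keep only words that are over-represented in this cluster.
            if a * d > b * c_out:
                scored.append((chi2_2x2(a, b, c_out, d), word))
        # Highest statistic first; ties are broken alphabetically.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        labels[cid] = [word for _, word in scored[:top_k]]
    return labels


docs = [
    "invoice payment bank",
    "bank transfer invoice",
    "meeting schedule room",
    "room booking meeting",
]
labels = label_clusters(docs, [0, 0, 1, 1])
print(labels)
```

The cluster assignments passed in here are hard-coded for the example; in practice they would come from a clustering step such as K-means or DBSCAN. Words that appear in every document of one cluster and in none of the others receive the highest score, which also matches the intuition behind the second method (frequent within the cluster and discriminative against the rest).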

How to Cite

Mundhe, R. K., & Maind, A. (2014). Automatic Labelling and Document Clustering for Forensic Analysis. International Journal on Recent and Innovation Trends in Computing and Communication, 2(9), 2934–2941. https://doi.org/10.17762/ijritcc.v2i9.3325

Issue

Vol. 2 No. 9 (2014): September (2014) Issue

Section

Articles
