A Systematic and Comparative Analysis of Semantic Search Algorithms

Priya Shelke
Chaitali Shewale
Riddhi Mirajkar
Suruchi Dedgoankar
Pawan Wawage
Riddhi Pawar

Abstract

Users often struggle to find the information they need online because of the massive volume of data that is already available and that continues to be generated every day. Traditional keyword-based search engines match query terms literally, so they may handle complex queries poorly and return irrelevant or incomplete results. Semantic search addresses this problem by using machine learning and natural language processing to interpret the meaning and context of a user's query. In this paper we analyze the BM25 algorithm, the Mean of Word Vectors approach, the Universal Sentence Encoder model, and the Sentence-BERT (SBERT) model on the CISI dataset for the semantic search task. The results indicate that the fine-tuned SBERT model performs best.
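To illustrate the keyword baseline named in the abstract, the following is a minimal sketch of Okapi BM25 scoring in plain Python. It is not the authors' implementation: the whitespace tokenization, the parameter values `k1=1.5` and `b=0.75`, the function name `bm25_scores`, and the three toy documents are all assumptions chosen for illustration, and the CISI dataset is not used here.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25.

    Tokenization is a plain lowercase whitespace split (an assumption);
    k1 and b are common default values, not values taken from the paper.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(d) for d in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in tokenized for term in set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # terms absent from the document contribute nothing
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical toy corpus, for illustration only.
docs = [
    "information retrieval systems rank documents",
    "library science and cataloguing practice",
    "neural sentence embeddings for retrieval",
]
scores = bm25_scores("retrieval", docs)
no_overlap = bm25_scores("finding papers", docs)
```

With this toy corpus, only the documents that literally contain "retrieval" receive a non-zero score, and a query such as "finding papers" that shares no terms with any document scores zero everywhere. That lexical gap is the limitation the embedding-based models in the comparison are meant to address.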

Article Details

How to Cite
Shelke, P., Shewale, C., Mirajkar, R., Dedgoankar, S., Wawage, P., & Pawar, R. (2023). A Systematic and Comparative Analysis of Semantic Search Algorithms. International Journal on Recent and Innovation Trends in Computing and Communication, 11(11s), 222–229. https://doi.org/10.17762/ijritcc.v11i11s.8094

