Optimizing Storage Formats for Data Warehousing Efficiency

Sivananda Reddy Julakanti, Naga Satya Kiranmayee Sattiraju, Rajeswari Julakanti

Abstract

Data warehousing has become a critical part of modern business intelligence and data management. As organizations accumulate vast amounts of structured and unstructured data, the storage format used in a data warehouse directly affects processing speed, scalability, and cost. This research explores and evaluates storage formats used in data warehousing, focusing on optimization techniques that improve storage efficiency, reduce query time, and lower operational costs. By examining the characteristics of row-based, column-based, and hybrid formats, the paper provides insight into selection criteria for different use cases. We run benchmark tests and performance evaluations on a large-scale dataset, comparing the common storage formats Parquet, ORC, and Avro. The research emphasizes the importance of understanding data access patterns, compression algorithms, and query processing techniques when optimizing storage formats. The findings indicate that storage strategies tailored to the nature and usage of the data can substantially improve performance in both analytical and transactional workloads. The results provide a framework that organizations can use to optimize their data warehousing systems.
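
The role of data access patterns in choosing between row-based and column-based formats can be illustrated with a minimal sketch. It is not the benchmark used in the paper: the records, field names, and helper functions below are hypothetical, and plain JSON stands in for a real format such as Parquet, ORC, or Avro, which add their own encoding and compression. The sketch only shows that an aggregate over one field reads every whole record in a row layout, but a single column in a columnar layout.

```python
import json

# Hypothetical sample records; the field names are illustrative only.
REGIONS = ["north", "south", "east", "west"]
records = [
    {"order_id": i, "region": REGIONS[i % 4], "amount": 10.0 + i * 0.5}
    for i in range(1000)
]


def to_row_layout(rows):
    """Serialize each record as one unit, as a row-based format would."""
    return [json.dumps(r).encode("utf-8") for r in rows]


def to_column_layout(rows):
    """Serialize each field's values contiguously, as a column-based format would."""
    columns = {name: [r[name] for r in rows] for name in rows[0]}
    return {name: json.dumps(vals).encode("utf-8") for name, vals in columns.items()}


def bytes_scanned_row(row_blobs):
    """An aggregate over one field still has to read every whole row."""
    return sum(len(blob) for blob in row_blobs)


def bytes_scanned_column(column_blobs, name):
    """With a columnar layout only the requested column is read."""
    return len(column_blobs[name])


def sum_amount_rows(row_blobs):
    """Analytical query on the row layout: decode every record, pick one field."""
    return sum(json.loads(blob)["amount"] for blob in row_blobs)


def sum_amount_columns(column_blobs):
    """Same query on the columnar layout: decode only the 'amount' column."""
    return sum(json.loads(column_blobs["amount"]))


row_blobs = to_row_layout(records)
column_blobs = to_column_layout(records)

if __name__ == "__main__":
    print("row layout bytes scanned:   ", bytes_scanned_row(row_blobs))
    print("column layout bytes scanned:", bytes_scanned_column(column_blobs, "amount"))
    print("results match:", sum_amount_rows(row_blobs) == sum_amount_columns(column_blobs))
```

The opposite access pattern, fetching or updating one complete record, favors the row layout, because the columnar layout must touch every column to reassemble the record. This is one reason the choice of format depends on whether the workload is analytical or transactional.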

Article Details

How to Cite
Sivananda Reddy Julakanti. (2021). Optimizing Storage Formats for Data Warehousing Efficiency. International Journal on Recent and Innovation Trends in Computing and Communication, 9(5), 71–78. Retrieved from https://ijritcc.org/index.php/ijritcc/article/view/11291