Distributed Query Optimization for Petabyte-Scale Databases

Milavkumar Shah, Anila Gogineni

Abstract

Minimizing suboptimal query execution is critical in petabyte-scale distributed database systems. This research examines a range of techniques for improving query optimization: cost-based optimization, execution in partitioned environments, predicate push-down, dynamic resource management, exploitation of data locality, and parallelism. Practical examples of each technique are discussed, together with results that illustrate efficiency in terms of execution time, resource usage, and cost. Key findings include a 90% reduction in data scanned where partition pruning was effective and a tenfold speedup in execution from parallelism. The study demonstrates that practitioners who apply these techniques systematically can achieve efficiency gains in distributed database environments. The findings highlight the need for agile, reactive optimization to address the current challenges of large-scale big data systems.

Article Details

How to Cite
Milavkumar Shah, Anila Gogineni. (2022). Distributed Query Optimization for Petabyte-Scale Databases. International Journal on Recent and Innovation Trends in Computing and Communication, 10(10), 223–231. Retrieved from https://ijritcc.org/index.php/ijritcc/article/view/11436
Section
Articles