Cassandra File System Over Hadoop Distributed File System

Mr. Ashish A. Mutha, Miss. Vaishali M. Deshmukh

Abstract

Cassandra is an open source distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure. Cassandra offers robust support for clusters spanning multiple data centers, with asynchronous masterless replication that allows low-latency operations for all clients. NoSQL data stores target unstructured data, which is dynamic in nature and a key focus area for "Big Data" research. New-generation data can prove costly and impractical to administer with SQL databases because of its lack of structure and its need for high scalability and elasticity. NoSQL data stores such as MongoDB and Cassandra provide a desirable platform for fast and efficient data queries. The Hadoop Distributed File System (Hadoop-DFS) is one of many components and projects within the community Hadoop ecosystem. The Apache Hadoop project defines Hadoop-DFS as "the primary storage system which is used by Hadoop applications" that enables "reliable, extremely rapid computations". This paper provides a high-level overview of how Hadoop-style analytics (MapReduce, Pig, Mahout and Hive) can be run on data contained in Apache Cassandra without the need for Hadoop-DFS.

How to Cite
Mutha, A. A., & Deshmukh, V. M. (2014). Cassandra File System Over Hadoop Distributed File System. International Journal on Recent and Innovation Trends in Computing and Communication, 2(3), 634–637. https://doi.org/10.17762/ijritcc.v2i3.3025