In our last post on Future DBA, we discussed the Hadoop HDFS system. As we know, managing HDFS by hand is quite difficult, so vendors such as Cloudera, Hortonworks, and MapR integrate the tools and utilities into a GUI, which lets us manage the cluster easily and efficiently.
The data in HDFS can be queried and loaded using Hive, which gives us SQL-like access to it: we can create tables over the data and access it with statements that look much like ordinary SQL queries.
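To give a feel for what "SQL-like" means here, below is a minimal sketch in Python that holds two HiveQL statements and sends them through a DB-API style cursor. The table name `employees`, its columns, and the HDFS path `/data/employees` are hypothetical, and `RecordingCursor` is only a stand-in so the example runs without a Hive server.

```python
# HiveQL statements of the kind described above. The table, columns, and
# HDFS location are hypothetical and chosen only for illustration.
CREATE_TABLE = """
CREATE EXTERNAL TABLE IF NOT EXISTS employees (
    id INT,
    name STRING,
    dept STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/data/employees'
"""

COUNT_BY_DEPT = """
SELECT dept, COUNT(*) AS headcount
FROM employees
GROUP BY dept
"""


def run_statements(cursor, statements):
    """Execute each statement, in order, on a DB-API style cursor."""
    for statement in statements:
        cursor.execute(statement.strip())


class RecordingCursor:
    """Stand-in cursor: records statements instead of contacting Hive."""

    def __init__(self):
        self.executed = []

    def execute(self, statement):
        self.executed.append(statement)


if __name__ == "__main__":
    cursor = RecordingCursor()
    run_statements(cursor, [CREATE_TABLE, COUNT_BY_DEPT])
    for statement in cursor.executed:
        print(statement + ";\n")
```

Against a real cluster, the cursor would come from a Hive client library (PyHive is one option) rather than from the stand-in class used here.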
Hive requires a metastore, which can be a supported open-source RDBMS such as MySQL or PostgreSQL. The metastore holds Hive's metadata (the table definitions), while the actual data is stored in HDFS.
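The split between metadata and data can be pictured with a toy model. In this sketch an in-memory SQLite database plays the metastore and a CSV file in a temporary directory plays HDFS. The `table_defs` layout, the `employees` table, and the function names are all assumptions made for illustration; this is not Hive's real metastore schema.

```python
import csv
import os
import sqlite3
import tempfile


def build_demo(data_dir):
    """Create a toy metastore plus one data file; return the metastore."""
    # The "metastore": an RDBMS that holds only table definitions, never rows.
    metastore = sqlite3.connect(":memory:")
    metastore.execute(
        "CREATE TABLE table_defs ("
        "table_name TEXT PRIMARY KEY, column_names TEXT, location TEXT)"
    )
    location = os.path.join(data_dir, "employees.csv")
    metastore.execute(
        "INSERT INTO table_defs VALUES (?, ?, ?)",
        ("employees", "id,name,dept", location),
    )
    # The "HDFS" side: a plain delimited file that carries no schema itself.
    with open(location, "w", newline="") as handle:
        csv.writer(handle).writerows([
            [1, "asha", "sales"],
            [2, "ben", "ops"],
        ])
    return metastore


def read_table(metastore, table_name):
    """Look up schema and location in the metastore, then read the file."""
    row = metastore.execute(
        "SELECT column_names, location FROM table_defs WHERE table_name = ?",
        (table_name,),
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown table: {table_name}")
    column_names, location = row[0].split(","), row[1]
    with open(location, newline="") as handle:
        # The schema is applied only now, at read time.
        return [dict(zip(column_names, record)) for record in csv.reader(handle)]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        metastore = build_demo(data_dir)
        for record in read_table(metastore, "employees"):
            print(record)
```

The point of the sketch is that losing the metastore would leave the data file intact but unreadable as a table, which is why the metastore database deserves the same backup care as any other RDBMS we look after.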
Under the hood, Hive translates queries into MapReduce jobs that retrieve the data from HDFS.
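As a rough picture of what that means, here is a small Python sketch of how a query like `SELECT dept, COUNT(*) FROM employees GROUP BY dept` could be broken into map, shuffle, and reduce steps. It runs in a single process on hypothetical rows and only illustrates the idea; it is not how Hive actually plans or executes a job.

```python
from collections import defaultdict

# Hypothetical rows, as if parsed from a delimited file in HDFS.
ROWS = [
    {"id": 1, "name": "asha", "dept": "sales"},
    {"id": 2, "name": "ben", "dept": "ops"},
    {"id": 3, "name": "chen", "dept": "sales"},
]


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the GROUP BY column."""
    for row in rows:
        yield row["dept"], 1


def shuffle_phase(pairs):
    """Group emitted values by key, as happens between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_dept(rows):
    """Chain the three phases to answer the GROUP BY query."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_dept(ROWS))
```

On a real cluster the map and reduce steps run in parallel across many nodes, which is what makes this approach practical for data too large for a single server.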
So, as DBAs, we can work with HDFS data efficiently, much as we do with our RDBMS.