Hadoop Collaborative platform for beginners to professionals

HBase Vs Cassandra

In this new world of evolving Big Data, it is important to understand the latest technology. Both Cassandra and HBase are NoSQL databases adapted from the original Bigtable design. Both guard against data loss when a cluster node fails, and both are distributed databases. Both are designed i...

Data Import in HBase from MapReduce

This article series is meant to help beginners get started with HBase. When I started to learn HBase, I just wanted to play with some small programs that would enable me to perform basic HBase operations such as creating a table, putting data into and pulling data out of (scanning) an HBase table, some sor...

Interaction with HDFS during Hive Table creation

Hive itself provides the RDBMS-like feature of creating a table structure and loading data into tables. There is, however, a difference between loading data in an RDBMS and in Hive: Hive follows schema on read, while an RDBMS follows schema on write. In short, Hive does not validate the schema at the time of data loadi...
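The schema-on-read idea described above can be illustrated with a small Python sketch. This is not Hive code: the load and read functions, the in-memory store, and the comma delimiter are all hypothetical stand-ins. It only shows that loading accepts any line unchecked, and that the schema is applied when rows are read, with values that do not fit becoming None (similar to the NULL that Hive returns for fields that do not match the column type).

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

# A schema here is an ordered list of (column name, cast function) pairs.
# This representation is an assumption made for the sketch.
Schema = Sequence[Tuple[str, Callable[[str], object]]]


def load(store: List[str], raw_lines: Sequence[str]) -> None:
    """Loading only appends the raw lines; nothing is validated."""
    store.extend(raw_lines)


def read(store: Sequence[str], schema: Schema,
         delimiter: str = ",") -> List[Dict[str, Optional[object]]]:
    """Apply the schema at read time; bad or missing fields become None."""
    rows = []
    for line in store:
        fields = line.split(delimiter)
        row: Dict[str, Optional[object]] = {}
        for index, (name, cast) in enumerate(schema):
            try:
                row[name] = cast(fields[index])
            except (IndexError, ValueError):
                row[name] = None
        rows.append(row)
    return rows


store: List[str] = []
# The second and third lines do not fit the schema, yet loading succeeds.
load(store, ["1,alice", "oops,bob", "3"])
rows = read(store, [("id", int), ("name", str)])
for row in rows:
    print(row)
```

A schema-on-write system would instead reject the second and third lines inside load, before they ever reached the store.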

How to check HDFS health?

Hadoop provides a utility to check the health of the HDFS file system. The tool scans datanodes for all blocks and prepares a report like the one detailed below. For example, the command hadoop fsck /user/ looks into the blocks of the files under the user directory. If no location is specified, it will start looking file f...
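As a rough sketch of how the health check above could be scripted, the Python helper below assembles an fsck command line and runs it only if the hdfs launcher is found on the PATH. The function names build_fsck_command and run_fsck are hypothetical. The sketch uses the hdfs fsck form of the command together with the -files, -blocks and -locations options; since each of those options refines the previous one, the helper adds the ones it depends on.

```python
import shutil
import subprocess
from typing import List, Optional


def build_fsck_command(path: str = "/", files: bool = False,
                       blocks: bool = False,
                       locations: bool = False) -> List[str]:
    """Build the argument list for an fsck run on the given HDFS path."""
    command = ["hdfs", "fsck", path]
    # -locations refines -blocks, which in turn refines -files.
    if files or blocks or locations:
        command.append("-files")
    if blocks or locations:
        command.append("-blocks")
    if locations:
        command.append("-locations")
    return command


def run_fsck(path: str = "/", **options: bool) -> Optional[str]:
    """Run fsck and return its report, or None if hdfs is not installed."""
    if shutil.which("hdfs") is None:
        return None
    result = subprocess.run(build_fsck_command(path, **options),
                            capture_output=True, text=True, check=False)
    return result.stdout


# Show the command that would be run for the user directory.
print(" ".join(build_fsck_command("/user/", blocks=True)))
```

Calling run_fsck("/user/") on a machine with a configured Hadoop client would return the text of the report; elsewhere it returns None rather than failing.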

What is block scanner?

Block scanning is a process performed by data nodes to verify the integrity of the data stored in blocks. Each DataNode runs this process, which scans all the block replicas available on it and verifies them against the stored checksums of the data blocks. Checksums are stored in separate files when a block is created. Whenever a ...
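The verification step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the DataNode implementation: zlib.crc32 stands in for whatever checksum HDFS is configured to use, the 512-byte chunk size is an assumption, and the function names are hypothetical.

```python
import zlib
from typing import List

CHUNK_SIZE = 512  # bytes covered by one checksum (assumed for this sketch)


def compute_checksums(data: bytes, chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Return one CRC32 value per fixed-size chunk of the block data."""
    return [zlib.crc32(data[start:start + chunk_size])
            for start in range(0, len(data), chunk_size)]


def find_corrupt_chunks(data: bytes, stored: List[int],
                        chunk_size: int = CHUNK_SIZE) -> List[int]:
    """Recompute checksums and return the indexes of chunks that differ."""
    current = compute_checksums(data, chunk_size)
    corrupt = []
    for index in range(max(len(current), len(stored))):
        # A missing chunk or a missing checksum also counts as corruption.
        if (index >= len(current) or index >= len(stored)
                or current[index] != stored[index]):
            corrupt.append(index)
    return corrupt


# Checksums are computed once, when the block is written.
block = bytes(range(256)) * 8          # 2048 bytes, i.e. four chunks
stored_checksums = compute_checksums(block)

# Later, a scan recomputes them; flip one byte to simulate disk corruption.
damaged = bytearray(block)
damaged[600] ^= 0xFF                   # byte 600 falls in chunk index 1
print(find_corrupt_chunks(block, stored_checksums))
print(find_corrupt_chunks(bytes(damaged), stored_checksums))
```

Keeping one checksum per small chunk, rather than one per block, lets a scan point at the damaged region instead of only reporting that the block as a whole is bad.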