Sometimes System.out.println feels like a long statement to type just to print something. A static import can shorten it a bit, but it is not recommended because it hurts readability. I am using this situation only to explain static import; avoid using it in the scenario below.
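A minimal sketch of what the static import looks like in this situation. The class name StaticImportDemo and the greeting helper are hypothetical and exist only to give the example something to print.

```java
import static java.lang.System.out; // brings the static field System.out into scope as "out"

class StaticImportDemo {
    // Hypothetical helper, used only so the example has something to print.
    static String greeting(String name) {
        return "Hello, " + name;
    }

    public static void main(String[] args) {
        // Without the static import this line would read System.out.println(...).
        out.println(greeting("world"));
    }
}
```

The call is shorter, but a reader who sees a bare out.println has to look at the imports to learn where out comes from, which is the readability cost mentioned above.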

The ‘out’ object can be customized. It is initialized by the Java runtime environment at startup, and it can be changed by the developer during execution. By default, when you run a program from the command line, output goes to standard output and is printed in the same command window. We can change …
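One way to change where ‘out’ points is System.setOut. The sketch below is an illustration under assumptions, not necessarily the approach the full post takes: the class RedirectOutDemo and its capture helper are hypothetical names, and an in-memory buffer stands in for whatever destination you actually need.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

class RedirectOutDemo {
    // Runs the action with System.out temporarily pointed at an in-memory buffer
    // and returns whatever the action printed.
    static String capture(Runnable action) {
        PrintStream original = System.out;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream replacement = new PrintStream(buffer, true);
        System.setOut(replacement);
        try {
            action.run();
        } finally {
            // Always restore the original stream, even if the action throws.
            System.setOut(original);
            replacement.flush();
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String captured = capture(() -> System.out.println("sent to the buffer"));
        System.out.print("captured: " + captured);
    }
}
```

Restoring the original stream in a finally block keeps later output going to the command window as usual.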

System.out.println is a Java statement that prints the argument passed to it to System.out, which is generally stdout. System is a final class in the java.lang package. As per the Javadoc, “…Among the facilities provided by the System class are standard input, standard output, and error output streams; access to externally defined …

Apache HBase is an open-source, distributed, non-relational database modeled after Google’s Bigtable and written in Java. It provides capabilities similar to Bigtable on top of Hadoop and HDFS (Hadoop Distributed File System); that is, it provides a fault-tolerant way of storing large quantities of sparse data, which are common in many big …

Table of Contents
1 Introduction to NoSQL Database
2 Why NoSQL?
3 Benefits of NoSQL
4 NoSQL Database Categories
5 NoSQL Vs SQL Comparison
6 ACID Property
Introduction to NoSQL Database
NoSQL, also known as a “Not only SQL” database, provides a mechanism for storage and retrieval of data and is the next-generation database. Most of …

A combiner is an optional intermediary function that runs in the map phase, right after the mapper finishes. There are two primary benefits to using a combiner: combiners can be used to reduce the amount of data sent to the reducer, which increases …
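To make the reduction in data concrete, here is a plain-Java simulation of local aggregation for a word count. It deliberately does not use the Hadoop API; the class CombinerSketch and its map and combine methods are hypothetical stand-ins for a mapper and a combiner.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CombinerSketch {
    // "Map" step: emit one (word, 1) pair for every word in the input split.
    static List<Map.Entry<String, Integer>> map(String split) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : split.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // "Combine" step: sum the counts per key locally, before anything is sent on.
    static Map<String, Integer> combine(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            combined.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return combined;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapped = map("to be or not to be");
        Map<String, Integer> combined = combine(mapped);
        System.out.println("records without combiner: " + mapped.size());
        System.out.println("records with combiner: " + combined.size());
    }
}
```

For this input the mapper emits six pairs, while the combined output holds four, one per distinct word. Summing works here because addition gives the same result no matter how the counts are grouped.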

Problem: My Hive table has a date in the format ‘yyyy-MM-dd HH:mm:ss’, which was obtained from a database table through Sqoop. I need it in my query in the format ‘yyyy-MM-dd’. Solution: Use a regular expression to extract the required value. When we Sqoop the date value into Hive …
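The Hive query itself is not shown in this excerpt, so the following is only a Java illustration of the kind of pattern involved, not the post's actual solution. The class DatePartSketch, the datePart method, and the sample timestamp are all hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DatePartSketch {
    // Matches the leading 'yyyy-MM-dd' part of a 'yyyy-MM-dd HH:mm:ss' string.
    private static final Pattern DATE_PART = Pattern.compile("^(\\d{4}-\\d{2}-\\d{2})");

    // Returns the date part, or an empty string when the input does not start with a date.
    static String datePart(String timestamp) {
        Matcher matcher = DATE_PART.matcher(timestamp);
        return matcher.find() ? matcher.group(1) : "";
    }

    public static void main(String[] args) {
        System.out.println(datePart("2014-03-21 10:15:30")); // prints 2014-03-21
    }
}
```

The pattern only checks the shape of the value (four digits, two digits, two digits); it does not validate that the result is a real calendar date.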

Deciding when to use dynamic partitioning can be very challenging. We recommend using dynamic partitioning in these common scenarios: Loading from an existing table that is not partitioned: In this scenario, the user doesn’t employ partitioning initially because the table in question is expected to remain relatively small. However, over …

1. Use larger HDFS blocks for better performance
If smaller HDFS blocks are used, more time is spent seeking records on disk. This is a massive overhead when we deal with large files.
2. Always use a combiner if possible for local aggregation
Shuffle and Sort is a really …