Big Data Frameworks every programmer should know

Table of Contents
1 Introduction
1.1 Connection between Big Data and Analytics
1.2 Why Big Data is essential for programmers
1.3 Big Data Frameworks every programmer should know
1.4 Hadoop
1.4.1 Spark
1.4.2 Mahout
1.4.3 HBase
1.4.4 Hive
1.4.5 Pig
1.4.6 Logstash

Introduction

Big Data is a major buzzword at the current technological forefront. Big Data technologies have helped bring cutting-edge research into practical applications. Machine...

Hive Optimization Techniques in Hadoop 2.x

Enable the following properties in Hive SQL for large volumes of data:

SET hive.execution.engine=tez;
SET mapreduce.framework.name=yarn-tez;
SET tez.queue.name=SIU;
SET hive.vectorized.execution.enabled=true;
SET hive.auto.convert.join=true;
SET hive.compute.query.using.stats=true;
SET hive.stats.fetch.column.stats=true;
SET hive.stats.fetch.partition.stats=true;
SET hive.cbo.enable=true;
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.parallel=true;
SET hive.exec.mode.local.auto=true;
SET hive.exec.reducers.bytes.per.reducer=1000000000;

(Depends on your total size of all tables...
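
To see why hive.exec.reducers.bytes.per.reducer depends on data size, here is a rough sketch of the classic reducer estimate: input size divided by bytes per reducer, rounded up and capped by hive.exec.reducers.max. The function name, the cap of 1009, and the sample input sizes are assumptions for illustration only; on Tez the actual reducer count can be adjusted further at runtime, so treat this as a back-of-the-envelope guide rather than what Hive will report.

```python
def estimate_reducers(input_bytes, bytes_per_reducer=1_000_000_000, max_reducers=1009):
    """Rough estimate of the reducer count for a Hive job.

    bytes_per_reducer mirrors hive.exec.reducers.bytes.per.reducer.
    max_reducers mirrors hive.exec.reducers.max (1009 is an assumed value;
    check your cluster's setting).
    """
    if input_bytes <= 0:
        return 1
    # Integer ceiling division avoids floating-point rounding on large sizes.
    needed = -(-input_bytes // bytes_per_reducer)
    return max(1, min(max_reducers, needed))


if __name__ == "__main__":
    # Hypothetical input sizes, in bytes.
    for size in (500 * 10**6, 50 * 10**9, 5 * 10**12):
        print(size, "bytes ->", estimate_reducers(size), "reducers")
```

Under this estimate, a smaller bytes-per-reducer value yields more, smaller reducers, and a larger value yields fewer, heavier ones, which is why the setting should be tuned against the total size of the tables involved.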