Category: Hadoop

Hive Optimization Techniques in Hadoop 2.x

Enable the following properties in Hive SQL when working with large volumes of data:

SET hive.execution.engine=tez;
SET mapreduce.framework.name=yarn-tez;
SET tez.queue.name=SIU;
SET hive.vectorized.execution.enabled=true;
SET hive.auto.convert.join=true;
SET hive.compute.query.using.stats=true;
SET hive.stats.fetch.column.stats=true;
SET hive.stats.fetch.partition.stats=true;
SET hive.cbo.enable=true;
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.exec.parallel=true;
SET hive.exec.mode.local.auto=true;
SET hive.exec.reducers.bytes.per.reducer=1000000000; (Depends on your total size of all tables...
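
The statistics and cost-based optimizer settings above only help when statistics have actually been collected, and the dynamic partition settings only matter for inserts that leave the partition value unspecified. The sketch below shows one way these might be exercised together. The table names sales and sales_staging and their columns are hypothetical, and the exact ANALYZE syntax accepted can vary between Hive versions, so treat this as an illustration rather than a tested recipe.

```sql
-- Hypothetical tables: sales_staging (unpartitioned source) and
-- sales (target, partitioned by sale_date). Names are assumptions.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Dynamic-partition insert: the partition column comes last in the SELECT
-- list, and Hive creates one partition per distinct sale_date value.
INSERT OVERWRITE TABLE sales PARTITION (sale_date)
SELECT order_id, amount, sale_date
FROM sales_staging;

-- Collect table/partition-level and column-level statistics so that
-- hive.cbo.enable and hive.compute.query.using.stats have data to work with.
ANALYZE TABLE sales PARTITION (sale_date) COMPUTE STATISTICS;
ANALYZE TABLE sales PARTITION (sale_date) COMPUTE STATISTICS FOR COLUMNS;
```

Once statistics are present, prefixing a query with EXPLAIN is a simple way to check whether the plan changes.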