Category: Hadoop

Handling “Hive Metastore not working – Syntax error ‘OPTION SQL_SELECT_LIMIT=DEFAULT’ at line 1”

Problem Description

While dropping a Hive table, the following exception was encountered in the Hive shell:

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ‘OPTION SQL_SELECT_LIMIT=DEFAULT’ at line 1)

Hive version – 2.3.3 mysql-java-connector version –...

Aggregation on Text Fields in the MapReduce Example

Introduction

The following is a simple MapReduce example that differs slightly from the standard “Word Count” example: it takes tab-delimited text and counts the occurrences of values in a chosen field. Details of the implementation are described below.

package com.npntraining.hadoop.mapreduce;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;
import java.io.IOException;
import java.util.Iterator;
...
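Since the listing above is cut off, here is a minimal plain-Java sketch of the same idea: extract one field from each tab-delimited line (the "map" step) and sum the occurrences of each value (the "reduce" step). It deliberately uses no Hadoop classes, and the names FieldCountSketch, extractField, and countField are hypothetical, chosen only for illustration; the original post's mapper and reducer may be structured differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of the map and reduce steps; no Hadoop classes are used.
// All names here are illustrative assumptions, not taken from the original post.
class FieldCountSketch {

    // "Map" step: pick one field out of a tab-delimited line.
    // Returns null when the line has too few fields.
    static String extractField(String line, int fieldIndex) {
        String[] fields = line.split("\t", -1);
        return fieldIndex < fields.length ? fields[fieldIndex] : null;
    }

    // "Reduce" step: add 1 to the running count for every occurrence of a value.
    static Map<String, Integer> countField(List<String> lines, int fieldIndex) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String value = extractField(line, fieldIndex);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "1\talice\tNY",
                "2\tbob\tCA",
                "3\tcarol\tNY");
        // Count occurrences of the third field (index 2).
        System.out.println(countField(lines, 2));
    }
}
```

Run on the three sample lines, the sketch prints {CA=1, NY=2}. In an actual job, the extraction would live in the mapper, which emits (value, 1) pairs, and the summing would live in the reducer.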

Changing the Output Delimiter in t

Introduction

Hadoop’s default output delimiter (the character separating the output key and value) is a tab (“\t”). This post explains how to change the default Hadoop output delimiter.

Output Delimiter Configuration Property

The output delimiter of a Hadoop job can be changed by setting the mapred.textoutputformat.separator configuration property. This property can be set from the code itself or from the...
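To illustrate the effect of the property, here is a small plain-Java sketch in which a Map stands in for the Hadoop job configuration. It models only the behaviour described above (tab by default, the configured separator otherwise); the class SeparatorSketch and the method formatRecord are hypothetical and are not part of Hadoop. In a real job, the value would instead be set on the job's Configuration object with its set(String, String) method.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of how a text output format could apply the separator.
// The Map is an assumed stand-in for Hadoop's Configuration; in a real job
// the equivalent call would be:
//   conf.set("mapred.textoutputformat.separator", ",");
class SeparatorSketch {
    static final String SEPARATOR_KEY = "mapred.textoutputformat.separator";

    // Builds one output line, falling back to tab when the property is unset.
    static String formatRecord(Map<String, String> conf, String key, String value) {
        String separator = conf.getOrDefault(SEPARATOR_KEY, "\t");
        return key + separator + value;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(formatRecord(conf, "NY", "2")); // default: tab-separated
        conf.put(SEPARATOR_KEY, ",");
        System.out.println(formatRecord(conf, "NY", "2")); // custom: comma-separated
    }
}
```

With the property unset, the record is written as the key, a tab, and the value; after setting it to a comma, the same record becomes NY,2.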