Extract date in required formats from hive tables

Problem:
My hive table has date in the format ‘yyyy-MM-dd HH:mm:ss’ which was obtained from a database table through SQOOP. I need it in my query of the format ‘yyyy-MM-dd ’.

Solution:
Use a regex expression to extract the required value.
When we sqoop in the date value to hive from rdbms, the data type hive uses to store that date is String. You can use a substring functions to achieve the same, the most easiest way would be to use the regexp_extract() function in hive.

To get the date alone in ‘yyyy-MM-dd ’ format we can go in for
regexp_extract(column_datetime,'(.*\-.*\-.*)\\s(.*)’,1)
You can use it as Select *, regexp_extract(column_datetime,'(.*\-.*\-.*)\\s(.*)’,1) from sample_table;

**column_datetime is a column of type string that stores date in the format ‘yyyy-MM-dd HH:mm:ss’ which is part of hive table sample_table

Some possible cases
Case: 01
If you want the timestamp alone regexp_extract(column_datetime,'(.*)\\s(.*)’,2)

Case: 02
If you want the year and month alone of the format ‘yyyy-MM’ then useregexp_extract(column_datetime,'(.*\-.*)\-.*’,1)

Case: 03
All these were mere String operations done using regular expressions now if I want my date in the format ‘dd-MM- yyyy’ it’d be better to go in for a combination date functions and string concatenation functions. We can achieve our goal using this combination of functions

concat(year(column_datetime),'-',month(column_datetime),'-',day(column_datetime) )

 

So we can use the same in our query like

Select *, concat(day(column_datetime),'-',month(column_datetime),'-', year(column_datetime))

 

from sample_table;

But there would be issues in using this that if dates and month values are less than 10 it would be represented with a single character, we can overcome the same with the use of case functionality as

concat(CASE WHEN day(install_datetime) < 10 THEN concat('0',day(install_datetime)) ELSE 
trim(day(install_datetime)) END,'-',CASE WHEN month(install_datetime) < 10 THEN 
concat('0',month(install_datetime)) ELSE trim(month(install_datetime)) END,'-',
year(install_datetime))

 

A sample usage in Hive QL would be as

select *,concat(CASE WHEN day(install_datetime) < 10 THEN concat('0',day(install_datetime)) 
ELSE trim(day(install_datetime)) END,'-',CASE WHEN month(install_datetime) < 10 THEN 
concat('0',month(install_datetime)) ELSE trim(month(install_datetime)) END,'-',
year(install_datetime)) <br>from sample_table;