Hive Components:

The figure below shows the major components of Hive and its interactions with Hadoop.

The main components of Hive are:

  1. Shell: Allows interactive queries through a command-line interface (CLI), much like the MySQL shell connected to a database. Hive also supports web and JDBC clients.
  2. Thrift Server: Exposes a simple client API for executing HiveQL statements.
  3. Metastore: The metastore is the system catalog. All other components of Hive interact with the metastore.
  4. Driver: Manages the life cycle of a HiveQL statement during compilation, optimization and execution. On receiving a HiveQL statement from the Thrift server or another interface, it creates a session handle, which is later used to keep track of statistics such as execution time and number of output rows.
  5. Query Compiler: The driver invokes the compiler when it receives a HiveQL statement. The compiler translates the statement into a plan that consists of a DAG (directed acyclic graph) of map-reduce jobs.
  6. Execution Engine: The driver submits the individual map-reduce jobs from the DAG to the execution engine in topological order, so that each job runs only after the jobs it depends on. Hive currently uses Hadoop as its execution engine.
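
To make the idea of submitting jobs "in topological order" concrete, here is a minimal Python sketch. It is not Hive code: the plan, the job names and the run() helper are all invented for illustration, and the only real API used is TopologicalSorter from Python's standard graphlib module (Python 3.9+). It only shows that a job is submitted after every job it depends on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical plan: each job maps to the set of jobs it depends on.
# The job names are invented for illustration; they are not Hive output.
plan = {
    "scan_orders": set(),
    "scan_customers": set(),
    "join": {"scan_orders", "scan_customers"},
    "aggregate": {"join"},
}


def submission_order(dag):
    """Return the jobs in an order where every job follows its dependencies."""
    return list(TopologicalSorter(dag).static_order())


def run(dag, execute):
    """Toy driver loop: hand each job to `execute` in topological order."""
    finished = []
    for job in submission_order(dag):
        # Sanity check: all inputs of this job have already been produced.
        if not dag[job] <= set(finished):
            raise RuntimeError(f"{job} submitted before its dependencies")
        execute(job)
        finished.append(job)
    return finished


# Stand-in for the execution engine: it just prints the job name.
order = run(plan, execute=lambda job: print("submitting", job))
```

In this toy plan the two scan jobs may be submitted in either order, but "join" always comes after both of them and "aggregate" always comes last.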

Naveen P.N

I have 12+ years of experience in IT, including extensive experience executing complex projects using Java, microservices, big data and cloud platforms. I founded NPN Training Pvt Ltd, an India-based startup, to provide high-quality training for IT professionals. I have trained more than 3,000 IT professionals and helped them succeed in their careers across different technologies. I am very passionate about technology and training. I spent 12 years at Siemens, Yahoo, Amazon and Cisco, developing and managing technology.