The figure below shows the major components of Hive and its interactions with Hadoop.
The main components of Hive are:
- Shell: Allows interactive queries through a command-line interface (CLI), much like a MySQL shell connected to a database. Web and JDBC clients are also supported.
- Thrift Server: This server exposes a very simple client API to execute HiveQL statements.
- Metastore: The metastore is the system catalog. All other components of Hive interact with the metastore.
- Driver: Manages the life cycle of a HiveQL statement during compilation, optimization, and execution. On receiving a HiveQL statement from the Thrift server or another interface, it creates a session handle, which is later used to keep track of statistics such as execution time and number of output rows.
- Query Compiler: The driver invokes the compiler when it receives a HiveQL statement. The compiler translates the statement into a plan consisting of a DAG (directed acyclic graph) of map-reduce jobs.
- Execution Engine: The driver submits the individual map-reduce jobs from the DAG to the execution engine in topological order, so that each job runs only after the jobs it depends on. Hive currently uses Hadoop as its execution engine.
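
To make "topological order" concrete, here is a minimal Python sketch of how a driver could walk a plan DAG and hand jobs to an execution engine. It illustrates the idea only and is not Hive's actual code: the job names, the `plan` dictionary, and the `topological_order` and `run_plan` helpers are all hypothetical, and the ordering uses Kahn's algorithm, which is one common way to do it.

```python
from collections import deque


def topological_order(dag):
    """Return job names so that every job appears after all jobs it depends on.

    `dag` maps each job name to the list of jobs that consume its output.
    """
    # Count how many upstream jobs each job is still waiting for.
    indegree = {job: 0 for job in dag}
    for downstream in dag.values():
        for job in downstream:
            indegree[job] = indegree.get(job, 0) + 1

    # Jobs with no pending dependencies can be submitted immediately.
    ready = deque(sorted(job for job, count in indegree.items() if count == 0))
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for nxt in dag.get(job, []):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(indegree):
        raise ValueError("plan contains a cycle, so it is not a DAG")
    return order


def run_plan(dag, submit):
    """Submit each job to the engine (any callable) in dependency order."""
    for job in topological_order(dag):
        submit(job)


# Hypothetical plan: two scans feed a join, which feeds an aggregation.
plan = {
    "scan_orders": ["join"],
    "scan_users": ["join"],
    "join": ["aggregate"],
    "aggregate": [],
}

submitted = []
run_plan(plan, submitted.append)  # stand-in for a real job submission
print(submitted)
```

In this sketch `submit` is just a callable that records the job name. In a real system it would launch a map-reduce job and wait for it to finish before dependent jobs are started.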