GitHub - youngwookim/hive-fdw-for-postgresql: [OBSOLETE]

Hive FDW for PostgreSQL

This Python module implements the multicorn.ForeignDataWrapper interface, which lets you create foreign tables in PostgreSQL 9.1+ that query tables in Apache Hive.

Pre-requisites

Installation

  1. Install Multicorn
  2. Install hive-thrift-py
  3. Build the FDW module:
 $ cd hive-fdw-for-postgresql  
 $ python setup.py sdist  
 $ sudo python setup.py install  

or, with easy_install:

 $ cd hive-fdw-for-postgresql  
 $ sudo easy_install .  
  4. In the PostgreSQL client, create the extension and a foreign server:
 CREATE EXTENSION multicorn;  
   
 CREATE SERVER multicorn_hive FOREIGN DATA WRAPPER multicorn  
 OPTIONS (  
     wrapper 'hivefdw.HiveForeignDataWrapper'  
 );  
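
The wrapper option above names a Python class that Multicorn instantiates. As a rough illustration of what such a class looks like, here is a minimal sketch. It is not the code of hivefdw.HiveForeignDataWrapper: the class name StaticRowsWrapper and the rows it returns are hypothetical, and the fallback base class exists only so the sketch can run outside PostgreSQL.

```python
try:
    # Available when the code runs inside PostgreSQL with Multicorn installed.
    from multicorn import ForeignDataWrapper
except ImportError:
    # Stand-in base class (assumption) so the sketch also runs on its own.
    class ForeignDataWrapper(object):
        def __init__(self, options, columns):
            pass


class StaticRowsWrapper(ForeignDataWrapper):
    """Hypothetical wrapper that returns fixed rows instead of querying Hive."""

    def __init__(self, options, columns):
        super(StaticRowsWrapper, self).__init__(options, columns)
        self.options = options  # values given in the OPTIONS (...) clauses
        self.columns = columns  # column definitions of the foreign table

    def execute(self, quals, columns):
        # Called for each scan of the foreign table; every yielded dict is one row.
        for i in range(2):
            yield dict((name, "%s%d" % (name, i)) for name in columns)


if __name__ == "__main__":
    for row in StaticRowsWrapper({"host": "tb081"}, {}).execute([], ["a", "b"]):
        print(row)
```

A real Hive wrapper would replace the loop in execute() with a call to Hive and yield the rows it returns.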

Examples

  1. You can run simple SELECT statements against a remote Hive table:
 CREATE FOREIGN TABLE hive (  
     a varchar,  
     b varchar,  
     c varchar,  
     d varchar  
 ) SERVER multicorn_hive OPTIONS (  
     host 'tb081',  
     port '10000',  
     table 'test'  
 );  
 SELECT * FROM hive;  
  2. You can also define a foreign table that is backed by a Hive query:
 CREATE FOREIGN TABLE hive_query (  
     x varchar,  
     y varchar,  
     z varchar  
 ) SERVER multicorn_hive OPTIONS (  
     host 'tb081',  
     port '10000',  
     query 'SELECT x,y,z from src'  
 );  
   
 SELECT * from hive_query;
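
The two examples differ only in whether the foreign table sets the table option or the query option. One plausible way a wrapper could turn those options into the statement it sends to Hive is sketched below. The helper build_hive_query is hypothetical and is not taken from this module, and the actual module may build its query differently.

```python
def build_hive_query(options, columns):
    """Return the HiveQL text for a foreign table's options (hypothetical helper)."""
    if "query" in options:
        # A user-supplied query is passed through unchanged.
        return options["query"]
    if "table" in options:
        # Otherwise select the declared columns from the named Hive table.
        return "SELECT %s FROM %s" % (", ".join(columns), options["table"])
    raise ValueError("either 'table' or 'query' must be set")


if __name__ == "__main__":
    # Options as in the two foreign tables defined above.
    print(build_hive_query(
        {"host": "tb081", "port": "10000", "table": "test"},
        ["a", "b", "c", "d"]))
    print(build_hive_query(
        {"host": "tb081", "port": "10000", "query": "SELECT x,y,z from src"},
        ["x", "y", "z"]))
```

In this sketch the query option takes precedence when both are present, which is a design assumption rather than documented behavior.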