Using the Hive Server and the Hive Python Client
2 Hive

3 Apache Thrift

4 Hive Server

The Hive server runs as a Thrift server.

Starting the server
{{{
$ hive --service hiveserver &
[1] 9818
$ Starting Hive Thrift Server
09/12/17 16:59:39 INFO service.HiveServer: Starting hive server on port 10000
.
.
$
}}}
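Before running a client, it can be useful to confirm that something is actually listening on the Thrift port. The following is a minimal sketch using only the Python standard library, assuming the server runs on localhost and listens on port 10000 as reported in the log above. The helper name `wait_for_port` and its timeout values are hypothetical and not part of Hive or Thrift.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds,
    or False if no connection could be made before the timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError while nothing is listening
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

A call such as `wait_for_port("localhost", 10000)` returns `True` as soon as the port accepts a connection. Note that this only shows the port is open; it does not verify that the listener is the Hive Thrift server.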
5 Hive Python Client

Installing and verifying Hadoop & Hive (Hadoop 0.20.1 & Hive 0.4.0)
{{{
$ rpm -qa | grep hadoop-0.20
hadoop-0.20-jobtracker-0.20.1+133-1
hadoop-0.20-libhdfs-0.20.1+133-1
hadoop-0.20-tasktracker-0.20.1+133-1
hadoop-0.20-0.20.1+133-1
hadoop-0.20-datanode-0.20.1+133-1
hadoop-0.20-secondarynamenode-0.20.1+133-1
hadoop-0.20-conf-pseudo-0.20.1+133-1
hadoop-0.20-pipes-0.20.1+133-1
hadoop-0.20-namenode-0.20.1+133-1
hadoop-0.20-native-0.20.1+133-1
hadoop-0.20-docs-0.20.1+133-1
$
$ rpm -qa | grep hive
hadoop-hive-webinterface-0.4.0+14-1
hadoop-hive-0.4.0+14-1
}}}

Sample data
{{{
$ cat /tmp/r.txt
a 1 1.0
b 2 2.0
c 3 3.0
$
}}}

Setting PYTHONPATH (Hive Python library)
{{{
$ export PYTHONPATH="/usr/lib/hive/lib/py"
$ env | grep PYTHONPATH
PYTHONPATH=/usr/lib/hive/lib/py
}}}

Code
{{{
import sys

from hive_service import ThriftHive
from hive_service.ttypes import HiveServerException
from thrift import Thrift
from thrift.transport import TSocket
from thrift.transport import TTransport
from thrift.protocol import TBinaryProtocol

try:
    # Connect to the Hive Thrift server started above (localhost:10000)
    transport = TSocket.TSocket('localhost', 10000)
    transport = TTransport.TBufferedTransport(transport)
    protocol = TBinaryProtocol.TBinaryProtocol(transport)

    client = ThriftHive.Client(protocol)
    transport.open()

    # Create a tab-delimited text table and load the sample file into it
    client.execute("CREATE TABLE r(a STRING, b INT, c DOUBLE) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' STORED AS TEXTFILE")
    client.execute("LOAD DATA LOCAL INPATH '/tmp/r.txt' OVERWRITE INTO TABLE r")

    # Run the query and print every returned row
    client.execute("SELECT * FROM r")
    for row in client.fetchAll():
        print(row)

    transport.close()
except Thrift.TException as tx:
    print('%s' % (tx.message))
}}}

Run
{{{
$ python hive_py.py
a 1 1.0
b 2 2.0
c 3 3.0
}}}
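The rows printed above are plain strings. Below is a small sketch of converting them to typed Python values, assuming each row returned by `fetchAll()` is a single tab-delimited string in the column order of table `r` (STRING, INT, DOUBLE). The helper `parse_row` is hypothetical and not part of the Hive Python library, and the `rows` list is a stand-in for the real query result.

```python
def parse_row(row):
    """Split one tab-delimited row of table r into (str, int, float)."""
    a, b, c = row.rstrip("\n").split("\t")
    return a, int(b), float(c)


# Stand-in for client.fetchAll(), built from the sample data in /tmp/r.txt
# (assumption: columns are separated by a single tab character).
rows = ["a\t1\t1.0", "b\t2\t2.0", "c\t3\t3.0"]

parsed = [parse_row(r) for r in rows]
print(parsed)  # [('a', 1, 1.0), ('b', 2, 2.0), ('c', 3, 3.0)]
```

Values that are not valid numbers, or rows with a different number of columns, would raise `ValueError` here and need extra handling.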