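I created a table in Hive and loaded data into it from an external CSV file. When I try to print the data from Python, I get output like ['\x00"\x00m\x00e\x00s\x00s\x00a\x00g\x00e\x00"\x00']. When I query the table from the Hive GUI, the results are correct. How can I get the same results from a Python program?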
My Python code:
import pyhs2

with pyhs2.connect(host='192.168.56.101',
                   port=10000,
                   authMechanism='PLAIN',
                   user='hiveuser',
                   password='password',
                   database='anuvrat') as conn:
    with conn.cursor() as cur:
        cur.execute('SELECT message FROM ABC_NEWS LIMIT 5')
        print cur.fetchone()
The output is:
/usr/bin/python2.7 /home/anuvrattiku/SPRING_2017/CMPE239/Facebook_Fake_news_detection/code_fake_news/code.py
['\x00"\x00m\x00e\x00s\x00s\x00a\x00g\x00e\x00"\x00']
Process finished with exit code 0
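One possible reading of this output, not confirmed against the actual file: a NUL byte next to every character is what UTF-16 text looks like when it is handled as single-byte data. The sketch below only inspects the byte string printed above. The helper names `strip_nul_bytes` and `decode_utf16_field` are made up for illustration, and the UTF-16-LE guess is an assumption.

```python
# The raw value printed above, written out as a byte string.
raw = b'\x00"\x00m\x00e\x00s\x00s\x00a\x00g\x00e\x00"\x00'

def strip_nul_bytes(value):
    """Drop every NUL byte and decode the rest as ASCII.

    Only safe when the text is pure ASCII; anything else would be mangled.
    """
    return value.replace(b'\x00', b'').decode('ascii')

def decode_utf16_field(value):
    """Decode a field under the ASSUMPTION that the file is UTF-16-LE and
    was split on the single byte ','. In that case the first byte of the
    field is the leftover high byte of the preceding comma, so it is skipped.
    """
    return value[1:].decode('utf-16-le')

print(strip_nul_bytes(raw))
print(decode_utf16_field(raw))
```

If both calls return the same readable text, that is consistent with the CSV having been saved as UTF-16 rather than UTF-8. The value here also looks like the header cell of the `message` column rather than a data row.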
When I query the same table in Hive, I get the following output:
This is how I created the table:
CREATE TABLE ABC_NEWS(
    ID STRING,
    PAGE_ID INT,
    NAME STRING,
    MESSAGE STRING,
    DESCRIPTION STRING,
    CAPTION STRING,
    POST_TYPE STRING,
    STATUS_TYPE STRING,
    LIKES_COUNT SMALLINT,
    COMMENTS SMALLINT,
    SHARES_COUNT SMALLINT,
    LOVE_COUNT SMALLINT,
    WOW_COUNT SMALLINT,
    HAHA_COUNT SMALLINT,
    SAD_COUNT SMALLINT,
    THANKFUL_COUNT SMALLINT,
    ANGRY_COUNT SMALLINT,
    LINK STRING,
    IMAGE_LINK STRING,
    POSTED_AT STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY "," ESCAPED BY '\\';
The CSV file used to load the table is available at: https://www.dropbox.com/s/fiwygyqt8u9eo5s/abc-news-86680728811.csv?dl=0
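If the file does turn out to be UTF-16 (an assumption, not something verified here), one option would be to re-encode it as UTF-8 before loading it into the table. A minimal sketch follows. The function name `convert_to_utf8` and the file names in the usage comment are hypothetical, and the default source encoding is a guess.

```python
import io

def convert_to_utf8(src_path, dst_path, src_encoding='utf-16'):
    """Re-encode a text file to UTF-8, line by line.

    src_encoding is an assumption; 'utf-16' relies on a byte-order mark
    at the start of the file to pick the byte order.
    newline='' keeps the original line endings untouched.
    """
    with io.open(src_path, 'r', encoding=src_encoding, newline='') as src:
        with io.open(dst_path, 'w', encoding='utf-8', newline='') as dst:
            for line in src:
                dst.write(line)

# Hypothetical usage (file names are placeholders):
# convert_to_utf8('abc-news.csv', 'abc-news-utf8.csv')
```

The converted file would then be the one loaded into ABC_NEWS. Quoted fields that contain commas would still need separate handling, since the table splits rows on every comma.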