I have a table with this structure:
user_id | message_id | content
1 | 1 | "I like cats"
1 | 1 | "I like dogs"
I also have a list of valid words in dictionary.txt (or in an external Hive table), for example:
I,like,dogs,cats,lemurs
My goal is to generate a word-count table with one row per user:
user_id | "I" | "like" | "dogs" | "cats" | "lemurs"
1 | 2 | 2 | 1 | 1 | 0
SELECT user_id, word, COUNT(*)
FROM messages LATERAL VIEW explode(split(content, ' ')) lTable as word
GROUP BY user_id,word;
Answer 0 (score: 1)
I'm not very familiar with doing a pivot in Hive, but it can be done in Pig.
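DEFINE GET_WORDCOUNTS com.stackoverflow.pig.GetWordCounts('$dictionary_path');
A = LOAD .... AS (user_id, message_id, content);
-- collect all of a user's messages into one bag
C = GROUP A BY (user_id);
-- the UDF returns one count per dictionary word; FLATTEN turns them into columns
D = FOREACH C GENERATE group, FLATTEN(GET_WORDCOUNTS(A.content));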
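You will have to write a simple UDF, GetWordCounts, which tokenizes the input content of each grouped record and checks the tokens against the input dictionary.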
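To illustrate the logic such a UDF would need, here is a rough sketch in Python rather than an actual Pig UDF. The function name `get_word_counts` and its arguments are made up for this example, and it assumes that words are separated by whitespace and matched case-sensitively.

```python
from collections import Counter


def get_word_counts(contents, dictionary):
    """Count dictionary words across one user's messages.

    contents:   iterable of message strings for a single user (one group)
    dictionary: ordered list of valid words; the output follows this order
    """
    valid = set(dictionary)
    # Tokenize every message on whitespace and keep only dictionary words.
    counts = Counter(
        word
        for content in contents
        for word in content.split()
        if word in valid
    )
    # One count per dictionary word, 0 for words the user never used.
    return [counts[word] for word in dictionary]


# Sample data taken from the question.
dictionary = "I,like,dogs,cats,lemurs".split(",")
messages = ["I like cats", "I like dogs"]
print(get_word_counts(messages, dictionary))  # [2, 2, 1, 1, 0]
```

Returning the counts in dictionary order is what makes the result usable as fixed columns: every user gets the same number of fields in the same positions, including zeros for unused words.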
Answer 1 (score: 1)
Check this:
function pageback() {
//assuming names of section to be currentsection & previoussection
currentsection.hide();
previoussection.show();
}
Otherwise, you could define a variable (your search string) and put it in 'A', 'W', etc.