I can't get this script to work:
raw = LOAD 's3://xxxxxxxxx/*' AS (name:chararray, year:float, occurrences:float, books:float);
B = GROUP raw BY name;
C = FOREACH B GENERATE B.name, (SUM(B.occurrences) / SUM(B.books)) AS average;
D = ORDER C BY average DESC;
E = LIMIT D 10;
STORE E INTO 's3://xxxxxx';
Answer 0 (score: 0)
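Statement C is incorrect: you cannot access the fields name, occurrences, and books through relation B. After the GROUP, B contains only two fields: group (the grouping key) and raw (a bag holding the original tuples), so those fields are reachable only through relation raw. Can you change statement C to something like this?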
C = FOREACH B GENERATE group, SUM(raw.occurrences)/SUM(raw.books) AS average;
Here, group refers to the grouping key, which is the value of the field name.
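For reference, here is a sketch of the whole script with that change applied. The DESCRIBE line and the AS name alias are my additions, not part of the original script, and the schema shown in the comment is only approximately what Pig prints. The S3 paths are left as placeholders, exactly as in the question.

```pig
raw = LOAD 's3://xxxxxxxxx/*' AS (name:chararray, year:float, occurrences:float, books:float);

-- Grouping produces one tuple per distinct name.
B = GROUP raw BY name;

-- Optional: inspect the schema of B. It should look roughly like
--   B: {group: chararray, raw: {(name, year, occurrences, books)}}
DESCRIBE B;

-- 'group' is the grouping key; the original fields live inside the bag 'raw'.
-- The alias 'AS name' is optional and only renames the key in the output.
C = FOREACH B GENERATE group AS name, SUM(raw.occurrences) / SUM(raw.books) AS average;

D = ORDER C BY average DESC;
E = LIMIT D 10;
STORE E INTO 's3://xxxxxx';
```

Running DESCRIBE on a relation right after a GROUP is a quick way to see which field names are valid in the following FOREACH.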
If you run into any other problems, please paste your error log.