FileNotFoundException: The file file:/path/to/file/in.txt does not exist or the user running Flink has insufficient permissions to access it

Time: 2017-05-13 00:59:37

Tags: python apache-flink

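I am trying to run the classic WordCount example with Flink and the Python batch API. My problem is that an error occurs once I change the data source from env.from_elements() to env.read_text() (to handle a larger test case). The code below shows my implementation.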

[...]
if __name__ == "__main__":
    env = get_environment()
    # Default input, used when no paths are given on the command line
    input_file = 'file:///workfile.txt/'

    if len(sys.argv) != 1 and len(sys.argv) != 3:
        sys.exit("Usage: ./bin/pyflink.sh WordCount[ - <text path> <result path>]")

    if len(sys.argv) == 3:
        data = env.read_text(sys.argv[1])
    else:
        #data = env.from_elements("hello","world","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello","car","tree","data","hello")
        data = env.read_text(input_file)

    # Split lines into words, group by word and sum the counts
    result = data \
        .flat_map(Tokenizer()) \
        .group_by(1) \
        .reduce_group(Adder(), combinable=True)

    if len(sys.argv) == 3:
        result.write_csv(sys.argv[2])
    else:
        result.output()
[...]

Running the code above throws a file permission error. More specifically, I get the following message:

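Caused by: org.apache.flink.runtime.JobException: Creating the input splits caused an error: File file:/workfile.txt does not exist or the user running Flink ('user') has insufficient permissions to access it.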

PS: I searched for a solution but could not find anything. If this question has already been answered, I would appreciate a pointer to it.

1 Answer:

Answer 0: (score: 2)

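I think "workfile.txt" is meant to be a relative path. However, you cannot combine a relative path with a scheme ("file:///"): everything after the scheme is read as an absolute path, which is why the error reports file:/workfile.txt, a file in the filesystem root.

Please provide the full absolute path and it should work.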
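A minimal sketch of building such an absolute path, assuming Python 3.4 or later (for pathlib) and that the file sits in the directory the job is launched from. The helper name to_file_uri is hypothetical and is not part of the Flink API:

```python
import os
from pathlib import Path


def to_file_uri(path):
    """Turn a (possibly relative) local path into an absolute file:// URI.

    Hypothetical helper for illustration; it only uses the standard library.
    """
    # abspath resolves the path against the current working directory
    absolute = os.path.abspath(path)
    # as_uri() requires an absolute path and adds the file:// scheme
    return Path(absolute).as_uri()


input_file = to_file_uri("workfile.txt")
print(input_file)
```

The resulting string could then be passed to env.read_text(input_file) in place of the hard-coded 'file:///workfile.txt/'. The file still has to exist at that location and be readable by the user running Flink.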

Note that relative paths generally do not work with the Python API, because the script is executed from a temporary location.
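To illustrate why this matters, the sketch below resolves the same relative name against two different directories. The temporary directory is only a stand-in created with the standard library; it is not the location Flink actually uses, and resolve_from is a hypothetical helper:

```python
import os
import tempfile


def resolve_from(cwd, relative_path):
    """Return the absolute path that relative_path refers to when run from cwd."""
    return os.path.normpath(os.path.join(cwd, relative_path))


launch_dir = os.getcwd()
with tempfile.TemporaryDirectory() as tmp_dir:
    # Same relative name, two different working directories
    seen_from_launch = resolve_from(launch_dir, "workfile.txt")
    seen_from_tmp = resolve_from(tmp_dir, "workfile.txt")

print(seen_from_launch)
print(seen_from_tmp)
```

The two printed paths differ, so a file that exists next to the script is not found once the script runs from somewhere else. An absolute path avoids this ambiguity.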