This is part of the original WordCount.java code.
public static void main(String[] args) throws Exception {
    // set up the execution environment
    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    // get input data
    DataSet<String> text = env.fromElements(
        "To be, or not to be,--that is the question:--",
        "Whether 'tis nobler in the mind to suffer",
        "The slings and arrows of outrageous fortune",
        "Or to take arms against a sea of troubles,"
    );
    //DataSet<String> text = env.readTextFile("file:///home/jypark2/data3.txt");

    DataSet<Tuple2<String, Integer>> counts =
        // split up the lines in pairs (2-tuples) containing: (word,1)
        text.flatMap(new LineSplitter())
        // group by the tuple field "0" and sum up tuple field "1"
        .groupBy(0)
        .sum(1);

    // execute and print result
    counts.print();
}
I want to read the input from a text file instead, so I changed the code as follows.
public static void main(String[] args) throws Exception {
    // set up the execution environment
    final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

    // get input data
    DataSet<String> text = env.readTextFile("file:///home/jypark2/data3.txt");

    DataSet<Tuple2<String, Integer>> counts =
        // split up the lines in pairs (2-tuples) containing: (word,1)
        text.flatMap(new LineSplitter())
        // group by the tuple field "0" and sum up tuple field "1"
        .groupBy(0)
        .sum(1);

    // execute and print result
    counts.print();
}
However, this version fails with a runtime error that I have not been able to resolve.
Why does this happen, and how can I fix it?
Answer 0 (score: 0)
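If you run Flink in a massively parallel setting (100+ parallel threads), you need to adapt the number of network buffers via the config parameter taskmanager.network.numberOfBuffers. As a rule of thumb, the number of buffers should be at least 4 * numberOfTaskManagers * numberOfSlotsPerTaskManager^2. See the configuration reference for details.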
From the Flink FAQ: https://flink.apache.org/faq.html#i-get-an-error-message-saying-that-not-enough-buffers-are-available-how-do-i-fix-this
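To see what the rule of thumb quoted above works out to, here is a minimal sketch of the arithmetic. The class name NetworkBufferEstimate, the method minNetworkBuffers, and the cluster size (4 TaskManagers with 8 slots each) are assumptions made up for illustration; none of them are part of Flink. Only the formula and the parameter name taskmanager.network.numberOfBuffers come from the FAQ.

```java
// Hypothetical helper, not part of Flink: evaluates the FAQ rule of thumb
// 4 * numberOfTaskManagers * numberOfSlotsPerTaskManager^2.
public class NetworkBufferEstimate {

    // Returns the suggested lower bound for taskmanager.network.numberOfBuffers.
    public static long minNetworkBuffers(int numberOfTaskManagers,
                                         int numberOfSlotsPerTaskManager) {
        // Widen to long before squaring to avoid int overflow on large clusters.
        long slots = numberOfSlotsPerTaskManager;
        return 4L * numberOfTaskManagers * slots * slots;
    }

    public static void main(String[] args) {
        // Assumed cluster size, for illustration only.
        int taskManagers = 4;
        int slotsPerTaskManager = 8;

        long buffers = minNetworkBuffers(taskManagers, slotsPerTaskManager);

        // Prints the value in the "key: value" form used by the Flink config file.
        System.out.println("taskmanager.network.numberOfBuffers: " + buffers);
    }
}
```

For the assumed 4 x 8 cluster this gives 4 * 4 * 64 = 1024 buffers. The result is a lower bound per the rule of thumb, so substitute your own TaskManager and slot counts before putting the value in the Flink configuration.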