Reading a text file in Flink

Time: 2019-04-01 20:15:12

Tags: apache-flink

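Hi, I am new to Flink. I am trying to read a text file, and when I print it, the lines come out in a different order than in the file.

Is that normal? Why is the order not preserved? Is there a way to get the lines in order?

The original text file is: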

01-06-2018,June,Category5,Bat,12
01-06-2108,June,Category4,Perfume,10
13-07-2018,July,Category1,Television,50
24-06-2018,June,Category4,Shirt,38
18-06-2018,June,Category5,Bat,41
01-08-2018,August,Category5,PC,32
11-06-2018,June,Category2,Laptop,39
04-06-2018,June,Category1,PC,14
26-08-2018,August,Category4,Pendrive,42
12-06-2018,June,Category2,Tablet,41
25-08-2018,August,Category1,Shirt,34
17-07-2018,July,Category5,Steamer,27
....

My code:

import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment
val dataset = env.readTextFile("text.txt")
dataset.print()

Each time I run it, I get the lines in a different order, for example:

31-08-2018,August,Category2,Jewelry,38
13-08-2018,August,Category2,Mouse,35
02-07-2018,July,Category3,PC,34
04-08-2018,August,Category2,Bottle,38
04-06-2018,June,Category1,Pendrive,30
24-08-2018,August,Category1,Phone,43
11-06-2018,June,Category4,Jeans,14
28-08-2018,August,Category3,Jeans,36
14-06-2018,June,Category1,Bottle,49
....

Any help understanding this, and getting the output in order if possible, would be appreciated.

Thanks!

1 answer:

Answer 0 (score: 0)

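Try setting the parallelism to 1. In local mode, Flink sets the default parallelism to the number of CPU cores, so the file is read and printed by several parallel tasks, and the lines come out in an order that can change from run to run.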
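A minimal sketch of that suggestion, assuming the Flink DataSet Scala API and the same text.txt as in the question. The object name ReadTextFileInOrder and the helper dateKey are hypothetical, introduced only for illustration. With a parallelism of 1 a single task reads and prints the lines, which in practice tends to keep the file order, although that is an observation rather than a documented guarantee. If a defined order is required, sorting explicitly (here by the leading dd-MM-yyyy date) is the safer route.

```scala
import org.apache.flink.api.common.operators.Order
import org.apache.flink.api.scala._

object ReadTextFileInOrder {

  // Hypothetical helper: maps "dd-MM-yyyy,..." to a sortable "yyyyMMdd" string.
  // Assumes every line starts with a well-formed date; a malformed line
  // would fail here with a MatchError.
  def dateKey(line: String): String = {
    val Array(day, month, year) = line.takeWhile(_ != ',').split("-")
    s"$year$month$day"
  }

  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Run every operator with a single parallel task instead of one per CPU core.
    env.setParallelism(1)

    val dataset = env.readTextFile("text.txt")

    // Printed by one task, so the output is no longer interleaved across tasks.
    dataset.print()

    // Optional: impose an explicit order. sortPartition sorts within each
    // partition; with parallelism 1 there is only one partition, so the
    // result is sorted as a whole.
    dataset
      .sortPartition(dateKey _, Order.ASCENDING)
      .print()
  }
}
```

Setting the parallelism on the environment affects the whole job. For larger inputs it may be preferable to keep the default parallelism for the heavy operators and call setParallelism(1) only on the final sort.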