My Dataflow Java code stopped compiling/running when executed from Maven.
Answer 0 (score: 1)
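The build errors show that Maven can no longer find the com.google.cloud.dataflow.sdk packages:
[ERROR] symbol: class Pipeline
[ERROR] location: package com.google.cloud.dataflow.sdk
[ERROR] ... package com.google.cloud.dataflow.sdk.io does not exist
[ERROR] ... package com.google.cloud.dataflow.sdk.options does not exist
[ERROR] ... package com.google.cloud.dataflow.sdk.transforms does not exist
Your pom.xml probably declares the Dataflow SDK dependency with a version range that has no upper bound.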
This picked up the breaking change in 2.0-beta, in which the Google Dataflow package names changed to org.apache.beam.
For now, change your pom.xml setting to:
<dependency>
  <groupId>com.google.cloud.dataflow</groupId>
  <artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
  <version>[1.6.0, 2.0.0)</version>
</dependency>
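When you are ready, follow the instructions at
https://cloud.google.com/dataflow/release-notes/release-notes-java-2
to update your Java code. It takes more than changing package names: your runner names will change, DoFns will need the @ProcessElement annotation, and there are other changes as well.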
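As a rough starting point for that migration, here is a small sketch that lists the imports in a source file that still point at the 1.x package. The class name LegacyImportFinder and its methods are made up for illustration, and the replacement prefix org.apache.beam.sdk is an assumption based on the package rename mentioned above. A plain prefix swap will not be right for every class (runners, for example, are renamed), so treat the suggestions as hints and check each one against the release notes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Dataflow or Beam SDK.
class LegacyImportFinder {
    // Package prefix used by the 1.x SDK, taken from the compiler errors.
    static final String OLD_PREFIX = "com.google.cloud.dataflow.sdk";
    // Assumed 2.x prefix; verify individual classes against the release notes.
    static final String NEW_PREFIX = "org.apache.beam.sdk";

    /** Returns the (trimmed) import lines that still reference the 1.x package. */
    static List<String> findLegacyImports(List<String> sourceLines) {
        List<String> hits = new ArrayList<>();
        for (String line : sourceLines) {
            String trimmed = line.trim();
            if (trimmed.startsWith("import ") && trimmed.contains(OLD_PREFIX)) {
                hits.add(trimmed);
            }
        }
        return hits;
    }

    /** Suggests a renamed import. This is only a textual prefix swap. */
    static String suggestReplacement(String importLine) {
        return importLine.replace(OLD_PREFIX, NEW_PREFIX);
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
            "import com.google.cloud.dataflow.sdk.Pipeline;",
            "import com.google.cloud.dataflow.sdk.options.PipelineOptions;",
            "import java.util.List;"
        );
        // Print each legacy import next to its suggested replacement.
        for (String legacy : findLegacyImports(source)) {
            System.out.println(legacy + "  ->  " + suggestReplacement(legacy));
        }
    }
}
```

Running it over your own sources (read with Files.readAllLines, for instance) gives a quick checklist of files to touch. It does not find DoFn methods that need @ProcessElement or runner names that changed, so those still have to be fixed by hand.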