Package not found: com.google.cloud.dataflow.sdk

Time: 2017-01-10 17:40:07

Tags: google-cloud-dataflow apache-beam

My Dataflow Java code stopped compiling and running when executed from Maven:

[ERROR] symbol:   class Pipeline
[ERROR] location: package com.google.cloud.dataflow.sdk
[ERROR] ... package com.google.cloud.dataflow.sdk.io does not exist
[ERROR] ... package com.google.cloud.dataflow.sdk.options does not exist
[ERROR] ... package com.google.cloud.dataflow.sdk.transforms does not exist

1 answer:

Answer 0 (score: 1)

Your pom.xml probably contains the following lines:

<dependency>
  <groupId>com.google.cloud.dataflow</groupId>
  <artifactId>google-cloud-dataflow-java-sdk-all</artifactId>
  <version>[1.6.0, 2.0.0)</version>
</dependency>

This range picked up the breaking 2.0-beta release, in which the Google Dataflow package names were changed to org.apache.beam. Maven sorts a pre-release such as a 2.0.0 beta before 2.0.0 itself, so the exclusive upper bound 2.0.0 does not keep it out.

For now, change the version range in pom.xml so that its upper bound falls below the 2.0 betas and the build stays on the 1.x SDK.

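When you are ready, follow the instructions at https://cloud.google.com/dataflow/release-notes/release-notes-java-2 to update your Java code. It is more than a package rename: runner names change, DoFns need the @ProcessElement annotation, and there are other changes as well.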
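As a rough aid for that migration, the sketch below scans source lines for a few 1.x names and prints a hint for each. The class `MigrationScan`, its `scan` method, and the choice of which names to flag are all hypothetical and are not part of either SDK. It covers only the package and runner renames and does not detect DoFns that still need @ProcessElement.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Dataflow or Beam SDKs.
final class MigrationScan {
    // 1.x names to look for, most specific first, each with a 2.x hint.
    static final Map<String, String> HINTS = new LinkedHashMap<>();
    static {
        HINTS.put("BlockingDataflowPipelineRunner",
                "removed in 2.x; use DataflowRunner and call waitUntilFinish() on the PipelineResult");
        HINTS.put("DataflowPipelineRunner", "renamed to DataflowRunner");
        HINTS.put("DirectPipelineRunner", "renamed to DirectRunner");
        HINTS.put("com.google.cloud.dataflow.sdk",
                "1.x package; 2.x classes live under org.apache.beam");
    }

    // Returns one finding per matching line, reporting only the first
    // (most specific) name found on that line.
    static List<String> scan(List<String> lines) {
        List<String> findings = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            for (Map.Entry<String, String> hint : HINTS.entrySet()) {
                if (lines.get(i).contains(hint.getKey())) {
                    findings.add("line " + (i + 1) + ": " + hint.getKey()
                            + " -> " + hint.getValue());
                    break; // one finding per line
                }
            }
        }
        return findings;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "import com.google.cloud.dataflow.sdk.Pipeline;",
                "options.setRunner(BlockingDataflowPipelineRunner.class);",
                "import org.apache.beam.sdk.Pipeline;");
        for (String finding : scan(source)) {
            System.out.println(finding);
        }
    }
}
```

A scan like this only points at places to review. The release notes linked above remain the authoritative list of changes.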