从谷歌数据流输出到谷歌云端防火墙

时间:2018-03-04 15:57:05

标签: java json firebase google-cloud-firestore google-cloud-dataflow

美好的一天,在A-many努力想要掌握这项技术(数据流)之后,我设法让管道100%工作。

它的作用是将一堆CSV文件加载到管道中(来自谷歌云存储),将它们转换为"域"对象然后以JSON格式将它们保存到文件中。

我想做的是将JSON对象直接推送到数据库(google cloud firestore)。

我在此阶段应用于我的数据的最终转换是:

.apply(DatastoreIO.v1().write().withProjectId("____"));

据我所知,调用需要先前的转换才能返回一个我无法创建的实体对象

public Entity toEntity() {
    Datastore datastore = DatastoreOptions.getDefaultInstance().getService();
    Key taskKey = datastore.newKeyFactory().setKind("Task").newKey("Test");
    Entity e = Entity.newBuilder(taskKey).set("Domain", domain)
            .set("LocationOnsite", locOnSite)
            .set("Company", company).build();

    return e;
}

这将返回com.google.cloud.datastore.Entity,而不是所需的com.google.datastore.v1.Entity

我认为值得注意的是" Domain" object还包含一些其他对象的ArrayLists,例如" Emails"需要包含在数据库中。

以下是我目前拥有的示例JSON输出:

{
   "Vertical": "Business And Industrial",
   "Zip": "35229",
   "Company": "Alabama Association of Nonprofits",
   "QuantCast": "229219",
   "Twitter": "",
   "Vimeo": "",
   "LocationOnSite": "",
   "LastIndexed": "2018-02-01",
   "Pinterest": "",
   "Youtube": "",
   "TechSpend": "$250+",
   "Emails": [
      {
         "Email": "shannon@alabamanonprofits.org"
      },
      {
         "Email": "support@alabamanonprofits.org"
      },
      {
         "Email": "carla@alabamanonprofits.org"
      },
      {
         "Email": "kellie@alabamanonprofits.org"
      },
      {
         "Email": "ashley@alabamanonprofits.org"
      },
      {
         "Email": "Unknown"
      }
   ],
   "Facebook": "",
   "Google+": "",
   "Alexa": "",
   "Github": "",
   "FirstIndexed": "2011-01-03",
   "People": [
      {
         "Email": "Unknown",
         "Name": "Joshua Cirulnick"
      },
      {
         "Email": "Unknown",
         "Position": "Other",
         "Name": " Elaine Lin"
      },
      {
         "Email": "Unknown",
         "Position": "Other",
         "Name": " Terry Burkle"
      },
      {
         "Email": "Unknown",
         "Position": "Director",
         "Name": " Ashley Gilbert"
      },
      {
         "Email": "Unknown",
         "Position": "President",
         "Name": " Carol Weisman"
      },
      {
         "Email": "Unknown",
         "Position": "Csuite",
         "Name": " Shannon Ammons"
      },
      {
         "Email": "Unknown",
         "Position": "Founder",
         "Name": " Kelly McDonald"
      }
   ],
   "City": "Birmingham",
   "Telephone#s": [
      {
         "Telephone#": "+1-205-879-4712"
      },
      {
         "Telephone#": "+1-205-871-7740"
      }
   ],
   "FirstDetected": "N/A",
   "LinkedIn": "",
   "VK": "",
   "State": "AL",
   "Instagram": "",
   "Country": "US",
   "Domain": "alabamanonprofits.org",
   "LastFound": "N/A"
}

如果有人能够指出我如何有效地将这些对象带入谷歌云端火狐数据库,我会非常高兴!

1 个答案:

答案 0 :(得分:4)

您可以将数据写入Cloud Pub / Sub,这可以触发将数据写入Cloud Firestore的功能。在Google I / O 2017上有一个很好的例子,它可以做同样的事情,但使用实时数据库。

您可以在此处观看:Data Pipelines with Firebase and Google Cloud (Google I/O '17)