Question

I am currently developing a Python project that uses TensorFlow, and I need to preprocess my data.

The data I want to use is stored in a sqlite3 database with the following columns:

timestamp|dev|event
10:00    |01 | on
11:00    |02 | off
11:15    |01 | off
11:30    |02 | on

I want to export the data to a (.csv) file that looks like this:

Timestamp|01 |02 |...
10:00    |on |0  |...
11:00    |on |off|...
11:15    |off|off|...
11:30    |off|on |...

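Each row should always hold the latest known state of every device as of that timestamp: at each new timestamp the old values are carried over, and a value changes only if there is an update for that device. The number of devices does not change, and I can find it with: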

SELECT COUNT(DISTINCT dev) FROM table01;

Currently that number is 38 different devices, with 10000 entries in total.

Is there a way to do this computation in sqlite3, or do I have to write a Python program to process the data? I am new to both topics.

~Fabian

Answer 1

You can do this in SQLite.

Basically, you are pivoting the table.

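The challenge is that the pivot needs to be dynamic, i.e. the device list is not fixed. You need to query the device list first and then build the pivot query from it, e.g. generate one CASE WHEN ... ELSE expression per device.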
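
As a rough illustration of building the pivot dynamically, here is a minimal Python sketch using the standard `sqlite3` module. It assumes the table is called `table01` with columns `timestamp`, `dev` and `event`, as in the question; the helper names `quote_ident` and `build_pivot_query` are made up for this example. It is one possible way to do it, not necessarily the best one.

```python
import sqlite3


def quote_ident(name):
    """Quote a value so it can be used as an SQL identifier (column alias)."""
    return '"' + str(name).replace('"', '""') + '"'


def build_pivot_query(conn, table="table01"):
    """Return (devices, sql) with one CASE WHEN column per distinct device.

    The device values are bound as parameters (one '?' per device), so the
    returned list must be passed to execute() together with the SQL text.
    """
    devices = [row[0] for row in conn.execute(
        f"SELECT DISTINCT dev FROM {table} ORDER BY dev")]
    columns = ",\n       ".join(
        f"MAX(CASE WHEN dev = ? THEN event ELSE NULL END) AS {quote_ident(dev)}"
        for dev in devices)
    sql = (f"SELECT timestamp,\n       {columns}\n"
           f"FROM {table}\nGROUP BY timestamp\nORDER BY timestamp")
    return devices, sql


# Small in-memory demo with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table01 (timestamp TEXT, dev TEXT, event TEXT)")
conn.executemany("INSERT INTO table01 VALUES (?, ?, ?)", [
    ("10:00", "01", "on"),
    ("11:00", "02", "off"),
    ("11:15", "01", "off"),
    ("11:30", "02", "on"),
])

devices, sql = build_pivot_query(conn)
rows = conn.execute(sql, devices).fetchall()
for row in rows:
    print(row)
```

Devices that have no event at a given timestamp come out as NULL (`None` in Python) here, so the result still needs the old values carried forward to match the layout asked for in the question.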

In addition, you usually need to group by timestamp, because for a single timestamp the states of different devices sit in different rows.
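
If you end up doing part of the work in Python, the grouping and the "carry the old value forward" requirement from the question can be combined in one pass. The sketch below is only one possible approach: it assumes the same `table01` layout, assumes timestamps sort correctly as text, uses `0` for devices that have not reported yet (as in the question's sample output), and the function name `export_latest_states` is hypothetical.

```python
import csv
import io
import sqlite3
from itertools import groupby


def export_latest_states(conn, out, table="table01", missing="0"):
    """Write one CSV row per timestamp with the latest state of every device."""
    devices = [row[0] for row in conn.execute(
        f"SELECT DISTINCT dev FROM {table} ORDER BY dev")]
    state = {dev: missing for dev in devices}  # last known state per device

    writer = csv.writer(out)
    writer.writerow(["Timestamp", *devices])

    # rowid is used as a tie-breaker; this assumes an ordinary rowid table.
    events = conn.execute(
        f"SELECT timestamp, dev, event FROM {table} ORDER BY timestamp, rowid")
    # Group the rows by timestamp so each timestamp yields exactly one line.
    for timestamp, group in groupby(events, key=lambda row: row[0]):
        for _, dev, event in group:
            state[dev] = event  # only devices with an update change
        writer.writerow([timestamp, *(state[dev] for dev in devices)])


# Small in-memory demo with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table01 (timestamp TEXT, dev TEXT, event TEXT)")
conn.executemany("INSERT INTO table01 VALUES (?, ?, ?)", [
    ("10:00", "01", "on"),
    ("11:00", "02", "off"),
    ("11:15", "01", "off"),
    ("11:30", "02", "on"),
])

buffer = io.StringIO()
export_latest_states(conn, buffer)
lines = buffer.getvalue().splitlines()
print("\n".join(lines))
```

To write a real file instead of an in-memory buffer, pass a file opened with `open("out.csv", "w", newline="")` as `out` (the file name is just an example).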

If {timestamp, device} is not unique, you need to make it unique.
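
How to make the pair unique depends on your data. As a sketch, the snippet below first lists duplicated {timestamp, device} pairs and then keeps only the most recently inserted row for each pair. Keeping the row with the highest `rowid` is an assumption made for this example (it also assumes `table01` is an ordinary rowid table); you may prefer a different rule.

```python
import sqlite3

# Pairs that occur more than once.
FIND_DUPLICATES_SQL = """
SELECT timestamp, dev, COUNT(*)
FROM table01
GROUP BY timestamp, dev
HAVING COUNT(*) > 1
"""

# One row per {timestamp, dev}: the one inserted last (highest rowid).
DEDUPLICATED_SQL = """
SELECT timestamp, dev, event
FROM table01
WHERE rowid IN (SELECT MAX(rowid) FROM table01 GROUP BY timestamp, dev)
ORDER BY timestamp, dev
"""

# Small in-memory demo: device 01 reports twice at 10:00.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table01 (timestamp TEXT, dev TEXT, event TEXT)")
conn.executemany("INSERT INTO table01 VALUES (?, ?, ?)", [
    ("10:00", "01", "on"),
    ("10:00", "01", "off"),
    ("11:00", "02", "off"),
])

duplicates = conn.execute(FIND_DUPLICATES_SQL).fetchall()
unique_rows = conn.execute(DEDUPLICATED_SQL).fetchall()
print(duplicates)
print(unique_rows)
```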
