GAE bulkloader CSV delimiter error

Time: 2013-07-02 04:36:29

Tags: google-app-engine google-cloud-datastore yaml bulkloader

I have a file in the following format:

name- stuffinside -description
name-stuffinside -description
name- stuffinside -description

I have the following as my bulkloader code:

transformers:
- kind: storeItem
  connector: csv
  connector_options:
    encoding: utf-8
    column_list: [name, stuffinside, description]
  property_map:
    - property: __key__
      external_name: name
      export_transform: transform.key_id_or_name_as_string

    - property: name
      external_name: name

    - property: stuffinside
      external_name: stuffinside
      import_transform: "lambda x: x.split('-')"

    - property: description
      external_name: description

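The problem I'm having is that I can't get the file to be read by splitting each line on the "-" symbol into its 3 separate parts. I want it to be like:

name = x[0]
stuffinside = x[1]
description = x[2]

I have no problem reading the contents from the file in general, but I don't know how to do this in the App Engine bulkloader format. Any ideas about what I'm doing wrong?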

1 answer:

Answer 0: (score: 0)

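Are you sure "-" is a good symbol to use as a field separator? It seems quite likely that the content itself will contain that symbol and mess up your fields.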
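To illustrate the concern, here is a minimal Python sketch. The sample rows and the helper name split_fields are made up for illustration and are not from the question's data. It shows that a plain split on "-" yields an extra field as soon as a value contains a hyphen, and that passing a maxsplit argument is one possible way to cap the number of fields.

```python
# Hypothetical sample rows; the second one has a hyphen inside its description.
rows = [
    "widget- blue paint -a small widget",
    "gadget- steel -a heavy-duty gadget",
]


def split_fields(line, sep="-", maxsplit=-1):
    """Split a line on sep and strip whitespace around each part."""
    return [part.strip() for part in line.split(sep, maxsplit)]


# A naive split: the hyphen in "heavy-duty" produces a fourth field.
naive_counts = [len(split_fields(row)) for row in rows]

# Capping the split at two separators keeps exactly three fields,
# leaving any later hyphens inside the last field.
capped = [split_fields(row, maxsplit=2) for row in rows]
```

Capping the split only helps when the extra hyphens can appear in the last column, which is an assumption about the data; a separator that never occurs in the content avoids the problem entirely.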

First convert the "-" separators to tabs, and then follow this sample:

From http://bulkloadersample.appspot.com/showfile/bulkloader.yaml

# A sample using a TSV file with no header, specifying the columns here.
- model: models.Visit
  connector: csv
  connector_options:
    encoding: windows-1252
    # TSV is specified using an extra parameter of the Python csv module.
    import_options:
      dialect: excel-tab
    export_options:
      dialect: excel-tab
    # Columns here are a sequence in YAML, so can be specified in either block
    # or flow style. This is short enough that I'll use flow style.
    column_list: [visitid, customer, date, score, activities]
  property_map:
    - property: __key__
      external_name: visitid
      export_transform: datastore.Key.name
    - property: customer
      external_name: customer
      import_transform: transform.create_foreign_key('Customer')
      export_transform: datastore.Key.name
    - property: visit_date
      external_name: date
      import_transform: transform.import_date_time('%m/%d/%Y')
      export_transform: transform.export_date_time('%m/%d/%Y')
    - property: score
      external_name: score
      import_transform: float
    - property: activities
      external_name: activities
      # This is a CSV list of strings inside the TSV file.
      import_transform: "lambda x: x.split(',')"
      export_transform: "','.join"
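The conversion step suggested above could be sketched roughly as follows, as a standalone Python 3 script run before the bulkloader. The function name hyphens_to_tsv and the sample rows are hypothetical, and the code assumes that only the last column may contain extra hyphens. It writes tab-separated rows with the csv module's excel-tab dialect (the same dialect the sample passes in import_options) and then reads them back with the question's column list to check that the three fields line up.

```python
import csv
import io

# column_list taken from the question's bulkloader.yaml
COLUMNS = ["name", "stuffinside", "description"]


def hyphens_to_tsv(src, dst, sep="-", n_fields=len(COLUMNS)):
    """Rewrite sep-delimited lines from src as tab-separated rows in dst."""
    writer = csv.writer(dst, dialect="excel-tab")
    for line in src:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Limit the number of splits so later separators stay in the last field.
        parts = [part.strip() for part in line.split(sep, n_fields - 1)]
        writer.writerow(parts)


# Hypothetical input; in practice src and dst would be open file objects.
source = io.StringIO(
    "widget- blue paint -a small widget\n"
    "gadget- steel -a heavy-duty gadget\n"
)
target = io.StringIO()
hyphens_to_tsv(source, target)

# Read the TSV back using the excel-tab dialect and the question's columns.
target.seek(0)
rows = list(csv.DictReader(target, fieldnames=COLUMNS, dialect="excel-tab"))
```

With a file converted this way, the storeItem transformer would presumably need the same import_options dialect setting as the sample, and the split('-') import_transform on stuffinside would no longer be needed for separating the columns.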