Question

I have an input file whose first lines look like this:

AdditionalCookout.create!([
  {day_id: 275, cookout_id: 71, description: "Sample text, that, is ,driving , me, crazy"},
  {day_id: 275, cookout_id: 87, description: nil},
  {day_id: 276, cookout_id: 71, description: nil},
  {day_id: 276, cookout_id: 87, description: nil},
  {day_id: 277, cookout_id: 92, description: nil},
  {day_id: 277, cookout_id: 71, description: nil},

I am trying to parse each line into its own object. However, I can't simply split on commas, because some of the descriptions contain commas.

I tried these two regular expressions, taken from StackOverflow posts I could find:

re.split(r', (?=(?:"[^"]*?(?: [^"]*)*))|, (?=[^",]+(?:,|$))', content[x])

and

[y.strip() for y in content[x].split(''',(?=(?:[^'"]|'[^']*'|"[^"]*")*$)''')]

However, they both output

['{day_id: 275', 'cookout_id: 71, description: "Feeling ambitious? If you really want to exhaust yourself today, consider adding some additional stationary cardio."},']

Turns into:
day_id: 275
cookout_id: 71, description: "Feeling ambitious? If you really want to exhaust yourself today, consider adding some additional stationary cardio.",

Any ideas on how I can fix this so that each line is correctly split into three separate parts instead of just two? Thanks.

Answer 1

Try parsing it with PyYAML (https://pypi.python.org/pypi/PyYAML). It worked for me on your example, and it spares you the regex headache.

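import yaml
# safe_load is used here because plain yaml.load requires an explicit Loader argument in PyYAML 6+
yaml.safe_load('{day_id: 275, cookout_id: 71, description: "Sample text, that, is,driving , me, crazy"}')
{'cookout_id': 71,
 'day_id': 275,
 'description': 'Sample text, that, is,driving , me, crazy'}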
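
If pulling in PyYAML is not an option, something along these lines might also work with only the standard library. This is a rough sketch, not a tested solution for your whole file: `PAIR` and `parse_line` are names I made up, and it assumes every value is either a double-quoted string without escaped quotes, the bare word nil, or an integer. Instead of splitting on commas, it matches each `key: value` pair directly, so commas inside the quotes never come into play.

```python
import re

# One `key: value` pair. The value is either a double-quoted string
# (commas allowed inside) or a bare token that ends at the next comma or brace.
# Assumption: quoted strings contain no escaped double quotes.
PAIR = re.compile(r'(\w+):\s*("[^"]*"|[^,}]*)')


def parse_line(line):
    """Turn one `{key: value, ...},` line into a dict (hypothetical helper)."""
    record = {}
    for key, raw in PAIR.findall(line):
        raw = raw.strip()
        if raw.startswith('"'):
            record[key] = raw[1:-1]      # drop the surrounding quotes
        elif raw == "nil":
            record[key] = None           # Ruby nil becomes Python None
        else:
            record[key] = int(raw)       # assumption: bare values are integers
    return record


sample = '  {day_id: 275, cookout_id: 71, description: "Sample text, that, is ,driving , me, crazy"},'
parsed = parse_line(sample)
print(parsed["description"])
```

For the real file you would probably loop over the lines and only call `parse_line` on those whose stripped form starts with `{`, so that the `AdditionalCookout.create!([` header is skipped.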
