Question

I am trying to split the following string:

a = "2147486448, 'node[082, 101-107]', 8"

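Right now I am using str(a).strip('[]').split(","), and the output is [2147486448, 'node[082', '101-107]', 8], which is not what I want. What I expect is [2147486448, 'node[082, 101-107]', 8].

As you can see, the second item itself contains a ',', so what should I do so that it is kept as a single item instead of being split at that ','?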

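I have read the post "How to count occurrences of separator in string excluding those in quotes", but I still don't know what to do in my case. Any help is much appreciated; if you think this is a duplicate, feel free to delete this post.

Update:

Thanks @Cyphase, the code works, but it fails when I try to read a txt file line by line and apply it to each line: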

a = []
f = open(txt_file) 
for row in f: 
    a.append(ast.literal_eval(row))

A snippet of the txt file:

423, 0, 0, 'default', 8, 8, 0, NULL, 1, 'sacimport', 2990, NULL, 286, 232, 0, 0, 1486, 576, -1, 98304, 'node581', 1, '476', 'batch', 4294901555, 6, 60, 1403219907, 1403219907, 1403219908, 1403223513, 0, '', '', '', '', 0

424, 0, 0, 'default', 16, 16, 0, NULL, 0, 'B35planar-2.com', 2828, NULL, 287, 130, 0, 0, 24691, 16508, 24691, 16384, 'node582', 1, '477', 'batch', 4294901554, 4, 3600, 1403219914, 1403219914, 1403219915, 1403220421, 0, '', '', '', '', 0

425, 0, 0, 'default', 2, 2, 0, NULL, 0, 'EC', 704, NULL, 288, 248, 0, 0, 1798, 702, 1798, 2147486448, 'node514', 1, '409', 'sandy-batch', 4294901553, 4, 390, 1403220027, 1403220027, 1403220027, 1403220117, 0, '', '', '', '', 0

It says ValueError: malformed string, but each line represents a string, right?

Answer 1

Use the csv module; it handles this kind of thing.
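As a rough illustration of that suggestion, here is a minimal sketch that parses the single string from the question. It assumes the values are quoted with single quotes and that each comma is followed by a space, as in the example; the variable names are only illustrative.

```python
import csv

a = "2147486448, 'node[082, 101-107]', 8"

# csv.reader accepts any iterable of lines, so a one-element list works.
# quotechar="'" treats single quotes as field quotes;
# skipinitialspace=True ignores the space after each comma.
reader = csv.reader([a], quotechar="'", skipinitialspace=True)
fields = next(reader)
print(fields)
```

Every field comes back as a string, and the comma inside the quoted value no longer splits it.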

Answer 2

You can use ast.literal_eval(); it is a safe version of eval() that only evaluates Python literals:

>>> import ast
>>> raw = "2147486448, 'node[082, 101-107]', 8"
>>> ast.literal_eval(raw)
(2147486448L, 'node[082, 101-107]', 8)
>>>
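For the file shown in the update, literal_eval would reject the bare NULL tokens (NULL is not a Python literal) as well as the blank lines. If you still want to stay with literal_eval, one possible workaround is sketched below. It assumes the text NULL never occurs inside a quoted value; parse_row is a hypothetical helper, not part of the ast module.

```python
import ast


def parse_row(line):
    """Parse one comma-separated line of Python-style literals into a tuple.

    Returns None for blank lines.
    """
    line = line.strip()
    if not line:
        return None  # blank line: nothing to parse
    # Assumption: NULL only appears as a bare token, never inside quotes,
    # so a plain text replacement is safe for this data.
    return ast.literal_eval(line.replace("NULL", "None"))


sample = "423, 0, 'default', NULL, 1, 'sacimport'\n"
print(parse_row(sample))
```

Rows come back as tuples with integers, strings, and None where the file had NULL.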

Answer 3

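ast will fail on every line: your lines are not wrapped in quotes and you have blank lines. What you have is a csv file, which should be parsed with the csv module. You can use quotechar="'" to strip the single quotes, and you definitely need skipinitialspace=True.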

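import csv

with open("in.txt") as f:
    # quotechar="'" strips the single quotes around quoted fields;
    # skipinitialspace=True ignores the space after each comma.
    r = csv.reader(f, quotechar="'", skipinitialspace=True)
    for row in r:
        print(row)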

Output, with 2147486448, 'node[082, 101-107]', 8 appended to the file as its last line:

['423', '0', '0', 'default', '8', '8', '0', 'NULL', '1', 'sacimport', '2990', 'NULL', '286', '232', '0', '0', '1486', '576', '-1', '98304', 'node581', '1', '476', 'batch', '4294901555', '6', '60', '1403219907', '1403219907', '1403219908', '1403223513', '0', '', '', '', '', '0']
[]
['424', '0', '0', 'default', '16', '16', '0', 'NULL', '0', 'B35planar-2.com', '2828', 'NULL', '287', '130', '0', '0', '24691', '16508', '24691', '16384', 'node582', '1', '477', 'batch', '4294901554', '4', '3600', '1403219914', '1403219914', '1403219915', '1403220421', '0', '', '', '', '', '0']
[]
['425', '0', '0', 'default', '2', '2', '0', 'NULL', '0', 'EC', '704', 'NULL', '288', '248', '0', '0', '1798', '702', '1798', '2147486448', 'node514', '1', '409', 'sandy-batch', '4294901553', '4', '390', '1403220027', '1403220027', '1403220027', '1403220117', '0', '', '', '', '', '0']
[]
['2147486448', 'node[082, 101-107]', '8']

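If you don't mind the single quotes staying in the values, skipinitialspace=True on its own is enough. Note that without quotechar="'" a comma inside a single-quoted field such as 'node[082, 101-107]' will split that field:

with open("in.txt") as f:
    r = csv.reader(f, skipinitialspace=True)
    for row in r:
        print(row)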
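The reader yields every field as a string and returns an empty list for each blank line. If typed values are wanted, a conversion step along the following lines might help. This is only a sketch: convert and read_rows are hypothetical helpers, and treating NULL as None is an assumption about what the data means.

```python
import csv


def convert(field):
    """Turn a csv field into None, an int, or leave it as a string."""
    if field == "NULL":
        return None
    try:
        return int(field)
    except ValueError:
        return field  # not a number (including the empty string)


def read_rows(lines):
    """Parse lines with csv.reader, skip blank rows, convert each field."""
    reader = csv.reader(lines, quotechar="'", skipinitialspace=True)
    return [[convert(f) for f in row] for row in reader if row]


lines = [
    "425, 0, 'default', NULL, 'node514', ''",
    "",
    "2147486448, 'node[082, 101-107]', 8",
]
for row in read_rows(lines):
    print(row)
```

One trade-off to keep in mind: because csv strips the quotes before conversion, a quoted number such as '409' would also become an integer here.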

Answer 4

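You have to:

Create a counter, say count = 0.
Iterate over your list.
Test whether the current item contains a quote.
If it does, walk the string from the first occurrence of the quote to the second one. (In practice, whether an item contains two quotes or only one tells you what to do.)

Your answer will then be the value held in your counter variable.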
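One possible reading of these steps is to walk the string character by character, track whether the current position is inside quotes, and only treat a comma as a separator when it is outside them. The sketch below follows that interpretation rather than the answer's exact wording; split_outside_quotes is a hypothetical helper, and it assumes quotes are never escaped or nested.

```python
def split_outside_quotes(text, sep=",", quote="'"):
    """Split text on sep, ignoring separators that appear inside quotes."""
    items = []
    current = []
    in_quotes = False
    for ch in text:
        if ch == quote:
            in_quotes = not in_quotes  # entering or leaving a quoted run
            current.append(ch)
        elif ch == sep and not in_quotes:
            items.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    items.append("".join(current).strip())  # last item has no trailing sep
    return items


a = "2147486448, 'node[082, 101-107]', 8"
items = split_outside_quotes(a)
count = len(items)
print(count, items)
```

Here count plays the role of the counter from the steps above, and the quoted value stays together as one item (with its quotes kept).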

How to count everything inside quotes as a single item
