How can I make a for-each loop in Python read a quoted string containing commas as a single value?

Date: 2018-12-11 07:55:43

Tags: python list csv foreach reader

The CSV file contains the following data:

  

"1111"","2222"2222","3333, 33, 33","444",""

The csv reader reads the data as:

  

['"1111""', '"2222"2222"', '"3333', ' 33', ' 33"', '"444"', '""']

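When I convert this reader object to a list and iterate over it with a for-each loop, the loop treats "3333, 33, 33" as three separate values. I need it to be read as a single string.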

Code:

import csv

# csv_file is assumed to be an already-open text file object
reader = csv.reader(csv_file, delimiter=',', quotechar="'", escapechar="'")
for row in reader:
    colValues = list(row)
    print(colValues)
    for each in colValues:
        print(each)

Current output:

"1111""
"2222"2222"
"3333
 33
 33"
"444"
""

Desired output:

"1111""
"2222"2222"
"3333, 33, 33"
"444"
""

2 Answers:

Answer 0: (score: 0)

A workaround without the csv library, given this input string:

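line = '"1111"","2222"2222","3333, 33, 33","444",""'

This returns the desired output:

# Splitting on ," removes the opening quote of every field but the first
res = line.split(',"')
for i, e in enumerate(res):
    # Restore the opening quote that the split consumed
    if (len(e) > 1 and e[0] != '"') or len(e) == 1:
        res[i] = '"' + e

for e in res:
    print(e)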

# "1111""
# "2222"2222"
# "3333, 33, 33"
# "444"
# ""

However, I don't know whether it works for every line of the file.
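
One way to probe that doubt is to wrap the same logic in a function and try it on other lines. The sketch below is only an illustration: `split_keep_quotes` is a made-up name, and the second test line is a hypothetical input, not data from the question. It suggests the approach can lose a quote when a field's content itself begins with a double quote.

```python
def split_keep_quotes(line):
    """Same logic as above: split on ," and restore the opening quotes."""
    parts = line.split(',"')
    for i, part in enumerate(parts):
        if (len(part) > 1 and part[0] != '"') or len(part) == 1:
            parts[i] = '"' + part
    return parts


# The line from the question is split as intended.
question_line = '"1111"","2222"2222","3333, 33, 33","444",""'
print(split_keep_quotes(question_line))

# Hypothetical line whose second field's content starts with a quote.
# The field comes back as "b" instead of ""b", so one quote is lost.
tricky_line = '"a",""b"'
print(split_keep_quotes(tricky_line))
```

A field whose content contains the sequence `,"` would also be split in the wrong place, so the workaround is only safe if the file never contains such fields.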

Answer 1: (score: 0)

I don't think the csv module can handle this kind of irregular format.
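
As a rough illustration of that claim, the sketch below feeds the line from the question to `csv.reader` twice: once with `quotechar="'"` as in the question (leaving out `escapechar`), and once with the default dialect. The variable names are made up for this example, and `io.StringIO` stands in for the real file.

```python
import csv
import io

line = '"1111"","2222"2222","3333, 33, 33","444",""'

# With quotechar="'", double quotes are ordinary characters,
# so every comma ends a field and the third column is broken up.
single_quote_row = next(csv.reader(io.StringIO(line), delimiter=',', quotechar="'"))
print(len(single_quote_row))

# The default dialect treats '"' as the quote character. The third column
# now stays together, but the stray quotes in the first two columns
# confuse the parser and the column count is still not five.
default_row = next(csv.reader(io.StringIO(line)))
print(default_row)
```

Neither setting yields the five columns the file is meant to contain, which is why the rest of this answer splits the line by hand.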

You can split on "," (quote, comma, quote) to get the correct columns. You also need to strip the first and last quote of the line.

>>> row = '"1111"","2222"2222","3333, 33, 33","444",""'
>>> row = row[1:-1]
>>> print(row)
1111"","2222"2222","3333, 33, 33","444","

>>> row.split('","')
['1111"', '2222"2222', '3333, 33, 33', '444', '']

Putting it together:

with open(csv_file) as lines:
    for line in lines:
        line = line.rstrip()  # need to get rid of newline
        for element in line[1:-1].split('","'):
            print(element)

Output:

1111"
2222"2222
3333, 33, 33
444
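
If the surrounding quotes should be kept, as in the desired output of the question, the same split can be wrapped in a small helper that puts them back. This is a minimal sketch: `split_quoted_line` is a hypothetical name, and it assumes every line starts and ends with a double quote and that no field contains the sequence `","`.

```python
def split_quoted_line(line):
    """Split one line on '","' and re-wrap each field in double quotes."""
    line = line.rstrip("\r\n")  # drop the trailing newline, if any
    # line[1:-1] removes the first and last quote of the whole line
    return ['"' + field + '"' for field in line[1:-1].split('","')]


sample = '"1111"","2222"2222","3333, 33, 33","444",""\n'
for value in split_quoted_line(sample):
    print(value)
```

To process a file, the helper could be called once per line inside the `for line in lines:` loop shown above.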