Splitting a string with a condition

Time: 2017-02-12 15:43:44

Tags: python regex

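I am trying to parse my CSV file with Python. Each line has four elements separated by commas. Each element is a string, but it may itself contain commas; if it does, the element is enclosed in double quotes. The following example shows both cases, with and without quotes: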

http://data.europa.eu/esco/skill/CTC_43028,"use data extraction, transformation and loading tools","ETL|extract, transform, load","<div>Integrate information from multiple applications, created and maintained by various organisations, into one consistent and transparent data structure.</div>"
http://data.europa.eu/esco/skill/SCG.TS.1.4.m.2,support company plan,follow industry guidelines|follow organisation's vision|monitor policy implementation|support company mission,<div>Act within one&#39;s work role to advance the goals and vision of the organisation.</div>

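What I want is to split each line into its four elements. I tried Python's split function, but without success. I think I have to use a regular expression, but I am not familiar with them. Can you help? Thank you very much.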

1 answer:

Answer 0: (score: 2)

The csv module is what you want:

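import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('file.csv', newline='') as f:
    r = csv.reader(f)
    for row in r:
        print(row)  # each row is a list of four strings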

['http...', 'use data ...', 'ETL|ext ...', '<div>Integrate ...']
['http:...', 'support ...', 'follow ...', '<div>Act ...']

',' is the default delimiter, and '"' is the default quotechar.
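To make the difference from a plain split concrete, here is a minimal sketch that runs csv.reader over shortened copies of the two sample rows from the question. The helper name parse_rows and the in-memory sample string are assumptions made for illustration; the <div> fields are abbreviated and are not the original data.

```python
import csv
import io

# Shortened copies of the two rows from the question (the <div> text is abbreviated).
sample = (
    'http://data.europa.eu/esco/skill/CTC_43028,'
    '"use data extraction, transformation and loading tools",'
    '"ETL|extract, transform, load",'
    '"<div>Integrate information.</div>"\n'
    'http://data.europa.eu/esco/skill/SCG.TS.1.4.m.2,'
    'support company plan,'
    'follow industry guidelines|support company mission,'
    '<div>Act within one&#39;s work role.</div>\n'
)


def parse_rows(text):
    """Hypothetical helper: parse CSV text into a list of rows (lists of strings)."""
    # delimiter and quotechar are written out only to show the defaults.
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=",", quotechar='"')
    return list(reader)


rows = parse_rows(sample)

# A plain split also cuts at the commas inside the quoted fields,
# so the first row ends up with more than four pieces.
naive = sample.splitlines()[0].split(",")

for row in rows:
    print(len(row), row[1])
print(len(naive))
```

Both rows come back with exactly four fields and the surrounding quotes removed, whereas the naive split of the first row returns extra fragments. That is why the csv module is a better fit here than str.split or a hand-written regular expression.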