How to convert space-delimited data to CSV format - Python

Time: 2019-04-12 14:13:12

Tags: python regex

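I am trying to replace the first two whitespace gaps on each line of a string (read from a file) with a comma, and the third (the line break) with a semicolon. The problem I am running into: with the regex call result = re.sub("\s", ",", text), I get text="example,text,example,". Of course, that simply replaces every whitespace character with a comma. How can I use a regular expression to produce the result in the example below?

Sample file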

536924636   www.microsoft.com   http://www.microsoft.com/pkiops/crl/MicW
536924733   www.microsoft.com   http://www.microsoft.com/pkiops/certs/Mi
536925898   crl.microsoft.com   http://crl.microsoft.com/pki/crl/product
536924636   www.microsoft.com   http://www.microsoft.com/pkiops/crl/MicW
536924733   www.microsoft.com   http://www.microsoft.com/pkiops/certs/Mi
536925898   crl.microsoft.com   http://crl.microsoft.com/pki/crl/product
536924636   www.microsoft.com   http://www.microsoft.com/pkiops/crl/MicW
536924733   www.microsoft.com   http://www.microsoft.com/pkiops/certs/Mi

After editing:

536924636,www.microsoft.com,http://www.microsoft.com/pkiops/crl/MicW;536924733,www.microsoft.com,http://www.microsoft.com/pkiops/certs/Mi;536925898,crl.microsoft.com,http://crl.microsoft.com/pki/crl/product(etc..);

In short, I am trying to use regex and Python to read the text and convert it to CSV format.

How can I achieve this?

Thanks

2 answers:

Answer 0 (score: 1)

text = """536924636   www.microsoft.com   http://www.microsoft.com/pkiops/crl/MicW
536924733   www.microsoft.com   http://www.microsoft.com/pkiops/certs/Mi
536925898   crl.microsoft.com   http://crl.microsoft.com/pki/crl/product
536924636   www.microsoft.com   http://www.microsoft.com/pkiops/crl/MicW
536924733   www.microsoft.com   http://www.microsoft.com/pkiops/certs/Mi
536925898   crl.microsoft.com   http://crl.microsoft.com/pki/crl/product
536924636   www.microsoft.com   http://www.microsoft.com/pkiops/crl/MicW
536924733   www.microsoft.com   http://www.microsoft.com/pkiops/certs/Mi
"""

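# split() with no arguments splits on any run of whitespace (spaces or tabs),
# so this works whether the columns are separated by spaces or by tabs.
print("%s;" % ";".join(",".join(line.split()) for line in text.splitlines()))

Output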

536924636,www.microsoft.com,http://www.microsoft.com/pkiops/crl/MicW;536924733,www.microsoft.com,http://www.microsoft.com/pkiops/certs/Mi;536925898,crl.microsoft.com,http://crl.microsoft.com/pki/crl/product;536924636,www.microsoft.com,http://www.microsoft.com/pkiops/crl/MicW;536924733,www.microsoft.com,http://www.microsoft.com/pkiops/certs/Mi;536925898,crl.microsoft.com,http://crl.microsoft.com/pki/crl/product;536924636,www.microsoft.com,http://www.microsoft.com/pkiops/crl/MicW;536924733,www.microsoft.com,http://www.microsoft.com/pkiops/certs/Mi;

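Function: join()

'separator'.join(sequence) returns a single string: the strings in sequence concatenated in the order they were passed, with the separator placed between each pair.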
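As a small illustration of join(), the sketch below builds one comma-separated row and then joins rows with semicolons. The names `fields`, `row`, `rows`, and `joined` are hypothetical; the values are taken from the sample file.

```python
# Hypothetical names; the values come from the sample file in the question.
fields = ["536924636", "www.microsoft.com", "http://www.microsoft.com/pkiops/crl/MicW"]

# The separator goes between items, never after the last one.
row = ",".join(fields)

rows = [row, "536924733,www.microsoft.com,http://www.microsoft.com/pkiops/certs/Mi"]

# join() does not add a trailing separator, so the final ";" is appended by hand.
joined = ";".join(rows) + ";"
print(joined)
```

Because join() only places the separator between items, the trailing semicolon in the desired output has to be added separately, which is what the "%s;" format string in the answer does.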

Edit:

Reading from a file

with open('filename.txt', 'r') as file:
    # Iterate over the file line by line; split() handles spaces and tabs alike.
    print("%s;" % ";".join(",".join(line.split()) for line in file))
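If the goal is real CSV output, the standard library's csv module might be worth considering instead of string joins, since it also quotes fields that happen to contain a comma. The following is only a sketch under assumptions: the helper name `to_semicolon_csv` is hypothetical, and it assumes no field contains embedded whitespace.

```python
import csv
import io

def to_semicolon_csv(text):
    """Convert whitespace-delimited lines to comma-separated rows ending in ';'.

    Hypothetical helper; assumes no field contains embedded whitespace.
    """
    buf = io.StringIO()
    # lineterminator=";" makes the writer end each row with a semicolon
    # instead of the default "\r\n".
    writer = csv.writer(buf, lineterminator=";")
    # Skip blank lines so they do not produce empty rows.
    writer.writerows(line.split() for line in text.splitlines() if line.strip())
    return buf.getvalue()

sample = (
    "536924636   www.microsoft.com   http://www.microsoft.com/pkiops/crl/MicW\n"
    "536924733   www.microsoft.com   http://www.microsoft.com/pkiops/certs/Mi\n"
)
converted = to_semicolon_csv(sample)
print(converted)
```

For an ordinary CSV file with one record per line, the same writer can be used with its default line terminator and written straight to a file opened with newline="".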

Answer 1 (score: 0)

(?m)[^\S\r\n]+(?=(?:\S+[^\S\r\n]*)+$)
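The pattern above matches a run of spaces or tabs only when at least one more non-whitespace token follows on the same line, so it covers the comma part of the task but leaves the line breaks alone. A minimal sketch of how it might be applied from Python follows; the names `PATTERN`, `with_commas`, and `result` are hypothetical, and the sample is assumed to use "\n" line endings.

```python
import re

# The pattern from this answer, unchanged. (?m) lets $ match at each line end.
PATTERN = re.compile(r"(?m)[^\S\r\n]+(?=(?:\S+[^\S\r\n]*)+$)")

# Two lines from the sample file, assumed to end in "\n".
text = (
    "536924636   www.microsoft.com   http://www.microsoft.com/pkiops/crl/MicW\n"
    "536924733   www.microsoft.com   http://www.microsoft.com/pkiops/certs/Mi\n"
)

# Step 1: turn each in-line whitespace gap into a single comma.
with_commas = PATTERN.sub(",", text)

# Step 2: the regex does not touch newlines, so end each line with ";" separately.
result = "".join(line + ";" for line in with_commas.splitlines())
print(result)
```

Trailing spaces at the end of a line are not matched, because the lookahead requires another token after the gap.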

I have explained the code at this link.