Question

I want to add commas to the data below, but only after the first column.

The data I have:

6.85675852     5.7928113, -99.990, -99.990,   8.083,
6.81641565     5.5877682,  10.560,   8.960,   5.465,
6.84986385     5.8423371,   7.390,   7.920,   6.026,
6.86023411     5.7104751,  16.600,  13.800,   7.311,

The data I want:

6.85675852,     5.7928113, -99.990, -99.990,   8.083,
6.81641565,     5.5877682,  10.560,   8.960,   5.465,
6.84986385,     5.8423371,   7.390,   7.920,   6.026,
6.86023411,     5.7104751,  16.600,  13.800,   7.311,

I have tried using split() and adding a comma, but I don't know how to write out the rest without messing up the formatting.

Answer 1

Use re.sub. I don't know whether your file uses tabs or spaces, so I handle both cases to be safe.

import re

s =\
"""
6.85675852     5.7928113, -99.990, -99.990,   8.083,
6.81641565     5.5877682,  10.560,   8.960,   5.465,
6.84986385     5.8423371,   7.390,   7.920,   6.026,
6.86023411     5.7104751,  16.600,  13.800,   7.311,
"""

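# Insert a comma before a tab or before a run of exactly five whitespace characters.
# In this data only the gap after the first column is that wide.
# The raw string keeps \s from being treated as an invalid string escape.
s_out = re.sub(r'(\t|\s{5})', r',\1', s)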
print(s_out)

Output

6.85675852,     5.7928113, -99.990, -99.990,   8.083,
6.81641565,     5.5877682,  10.560,   8.960,   5.465,
6.84986385,     5.8423371,   7.390,   7.920,   6.026,
6.86023411,     5.7104751,  16.600,  13.800,   7.311,
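If the gap after the first column is not always a tab or exactly five spaces, a pattern anchored at the start of each line may be more forgiving. The following is only a sketch, not part of the answer above: the name FIRST_COLUMN is made up for illustration, and it assumes the first column never contains spaces or tabs.

```python
import re

s = """\
6.85675852     5.7928113, -99.990, -99.990,   8.083,
6.81641565     5.5877682,  10.560,   8.960,   5.465,
"""

# Hypothetical pattern name. With re.MULTILINE, ^ matches at the start of
# every line. \S+ grabs the first column, and the lookahead requires a space
# or tab right after it without consuming it, so the spacing is preserved.
FIRST_COLUMN = r'^(\S+)(?=[ \t])'

s_out = re.sub(FIRST_COLUMN, r'\1,', s, flags=re.MULTILINE)
print(s_out)
```

Because the lookahead does not consume the separator, the width of the gap does not matter, and lines that contain a single token are left unchanged.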

Answer 2

I suggest writing the output to a new file rather than trying to overwrite the existing one.

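with open('path/to/input') as infile, open('path/to/output', 'w') as outfile:
    for line in infile:
        # Assumes the first column is followed by a tab; a line without one
        # would fail to unpack into head and tail.
        head, tail = line.split('\t', 1)
        # tail still ends with the newline read from the input file.
        outfile.write("{},\t{}".format(head, tail))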
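The sample data in the question appears to be separated by spaces rather than tabs, in which case split('\t', 1) would not find anything to split on. A possible variant that keeps the same "write to a new file" idea but accepts either separator is sketched below. The function name add_comma_after_first_column and the paths are hypothetical, and it assumes the first column contains no whitespace.

```python
import re

# Hypothetical helper: matches the first run of non-whitespace characters on
# a line, provided it is followed by a space or tab (which is not consumed).
FIRST_FIELD = re.compile(r'^(\S+)(?=[ \t])')


def add_comma_after_first_column(in_path, out_path):
    """Copy in_path to out_path, adding a comma after the first column."""
    with open(in_path) as infile, open(out_path, 'w') as outfile:
        for line in infile:
            # count=1 is a safeguard; ^ can only match once per line anyway.
            outfile.write(FIRST_FIELD.sub(r'\1,', line, count=1))


# Example call (paths are placeholders):
# add_comma_after_first_column('path/to/input', 'path/to/output')
```

Blank lines and lines with only one column pass through unchanged, since the pattern does not match them.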

Answer 3

You can split the input into lines, then use a regular expression to search and replace on each line:

>>> print(text)
6.85675852     5.7928113, -99.990, -99.990,   8.083,
6.81641565     5.5877682,  10.560,   8.960,   5.465,
6.84986385     5.8423371,   7.390,   7.920,   6.026,
6.86023411     5.7104751,  16.600,  13.800,   7.311,
>>> import re
>>> lines = text.split('\n')
>>> modified_lines = [re.sub(r'(^\d+\.\d+)',r'\1,',line) for line in lines]
>>> print('\n'.join(modified_lines))
6.85675852,     5.7928113, -99.990, -99.990,   8.083,
6.81641565,     5.5877682,  10.560,   8.960,   5.465,
6.84986385,     5.8423371,   7.390,   7.920,   6.026,
6.86023411,     5.7104751,  16.600,  13.800,   7.311,
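Since the question mentions an attempt with split(), here is a split-based sketch for comparison. Plain str.split() discards the whitespace it splits on, which is what breaks the formatting. re.split with a capturing group returns the separator as well, so it can be put back. The function name add_comma_with_split is made up for this example.

```python
import re


def add_comma_with_split(line):
    """Return line with a comma after its first column, spacing unchanged."""
    # The capturing group makes re.split return the separator too:
    # [first column, whitespace run, rest of the line].
    parts = re.split(r'([ \t]+)', line, maxsplit=1)
    if len(parts) < 3 or not parts[0]:
        # No separator found, or the line starts with whitespace: leave it.
        return line
    head, sep, tail = parts
    return head + ',' + sep + tail


text = """6.85675852     5.7928113, -99.990, -99.990,   8.083,
6.81641565     5.5877682,  10.560,   8.960,   5.465,"""

print('\n'.join(add_comma_with_split(line) for line in text.split('\n')))
```

Unlike the \d+\.\d+ pattern, this does not assume the first column is a positive decimal number, so a value such as -99.990 in the first column would also get its comma.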

Add some string to a specific column of a text file
