Question

I want to concatenate two CSV files, file1.csv and file2.csv.

file1.csv (first row):

6.365055485717639923e+10,6.365055501027899170e+10

file2.csv (first row):

153.1,0,0,0,0,0,0,5,1,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

I would like to get the following result:

6.365055485717639923e+10,6.365055501027899170e+10,153.1,0,0,0,0,0,0,5,1,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

When I run the following code:

import pandas as pd
X = pd.read_csv('Baseline_X_reduced.csv', header=None, sep=',')
Y = pd.read_csv('Baseline_X_reduced2.csv', header=None, sep=',')
Z = pd.concat([Y, X], axis=1)
Z.to_csv('Baseline_X_revised.csv', header=None, sep=',', index=False)

I get the following result:

**63650554857.17639,63650555010.27899**,153.1,0,0,0,0,0,0,5,1,0,4,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

The two values shown in bold have lost information.

Is there a way to keep the values from file1.csv in their original format (6.365055485717639923e+10,6.365055501027899170e+10)?

Thanks for your help.

Answer 1

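If you don't want the values converted, don't convert them! The pandas module is well suited to processing floating-point values, but floating-point values are known not to have an exact decimal representation, so a value that is parsed into a float and written back can come out as different text.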
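To see where the digits go, here is a small sketch using only the standard library. It parses one of the values from file1.csv into a float and prints it back in three ways. The variable names are illustrative and not part of the original question.

```python
from decimal import Decimal

text = "6.365055485717639923e+10"   # the value as it is written in file1.csv
value = float(text)                  # what a float-based parser keeps in memory

print(repr(value))      # shortest decimal string that parses back to the same float
print(Decimal(value))   # exact value of the binary float that is stored
print(Decimal(text))    # exact decimal value written in the file
```

The string produced by `repr` is shorter than the original text, so once the value has been parsed as a float, the original spelling of the number cannot be recovered from the float alone.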

The csv module, however, can handle the values as text:

import csv

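# Open the two input files and the output file.
# newline='' is what the csv module documentation recommends for files used with csv.reader and csv.writer.
with open('Baseline_X_reduced.csv', newline='') as fd1, open('Baseline_X_reduced2.csv', newline='') as fd2:
    with open('Baseline_X_revised.csv', 'w', newline='') as fdout: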
        # setup csv accessors for all files
        rd1 = csv.reader(fd1)
        rd2 = csv.reader(fd2)
        wr = csv.writer(fdout)
        while True:
            try:
                # combine lines...
                row1 = next(rd1)
                row2 = next(rd2)
                wr.writerow(row1 + row2)
            except StopIteration:
                # and stop once the shorter input file is exhausted
                break

This code processes the files line by line, so it works even when the files are larger than the available memory.
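For readers who would rather stay in pandas, one possible alternative (not part of the answer above) is to ask `read_csv` to keep every column as text by passing `dtype=str`, so nothing is parsed into a float in the first place. The sketch below assumes header-less files with no missing fields. The helper name `concat_as_text` and the shortened in-memory sample rows are made up for illustration.

```python
import io

import pandas as pd


def concat_as_text(left, right):
    """Join two header-less CSV sources column-wise, keeping every field as a string."""
    x = pd.read_csv(left, header=None, dtype=str)
    y = pd.read_csv(right, header=None, dtype=str)
    return pd.concat([x, y], axis=1)


# In-memory stand-ins for file1.csv and file2.csv (file2 shortened for brevity).
file1 = io.StringIO("6.365055485717639923e+10,6.365055501027899170e+10\n")
file2 = io.StringIO("153.1,0,0,5,1\n")

combined = concat_as_text(file1, file2)
csv_text = combined.to_csv(header=False, index=False)  # returns a string when no path is given
print(csv_text)
```

Unlike the csv-module loop, this loads both files fully into memory, and the resulting columns are strings, so any later arithmetic would need an explicit conversion.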

Answer 2

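I don't know pandas, so I can't help with the pandas-related part of the question. However, if you are willing to treat this as a text-processing problem, here is a straightforward solution: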

with open('file1.csv') as input1, \
        open('file2.csv') as input2, \
        open('Baseline_X_revised.csv', 'w') as output:

    for line1, line2 in zip(input1, input2):
        line1 = line1.rstrip()
        line2 = line2.rstrip()
        output.write('{},{}\n'.format(line1, line2))

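Notes

One drawback of this solution is that it strips any trailing spaces or tabs from each line.

The solution assumes that both files have the same number of lines.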
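Both caveats can be addressed with a small variation, sketched here under the assumption that a length mismatch should be treated as an error. The function name `join_lines` and the sample data are hypothetical. It removes only the line terminator, so trailing spaces and tabs survive, and it uses `itertools.zip_longest` to notice when one input runs out before the other.

```python
import io
from itertools import zip_longest


def join_lines(input1, input2, output):
    """Write each pair of lines as one comma-joined line; raise if the inputs differ in length."""
    missing = object()  # sentinel marking an exhausted input
    pairs = zip_longest(input1, input2, fillvalue=missing)
    for number, (line1, line2) in enumerate(pairs, start=1):
        if line1 is missing or line2 is missing:
            raise ValueError(f"inputs differ in length at line {number}")
        # Strip only the line terminator, not other trailing whitespace.
        output.write(line1.rstrip("\r\n") + "," + line2.rstrip("\r\n") + "\n")


# In-memory example; real files opened with open() work the same way.
out = io.StringIO()
join_lines(io.StringIO("1e+10,2e+10 \n"), io.StringIO("153.1,0\t\n"), out)
result = out.getvalue()
```

Raising an error is only one option. Silently stopping at the shorter file, as `zip` does, may be preferable when trailing extra lines are expected.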

Python: improving the result of concatenating 2 CSV files
