How to sum the values of specific columns across multiple files

Time: 2019-07-07 10:21:52

Tags: python python-3.x jupyter-notebook

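I have multiple data files, each containing four columns. I want to add each value in the third column to the corresponding values in all the other files, and do the same for the values in the fourth column. The first and second columns must stay unchanged. Finally, the summed values should be saved to a separate output file. I have written the code below, but I do not know how to take it further to get what I want.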

```
import glob
import numpy as np
# Reading the inputs
path = r'C:\Users\hp\Desktop\test\vdfi-0**-01000000'
my_files = glob.glob(path)
#print(len(my_files))
# opening an Output file
f=open(r'C:\Users\hp\Desktop\test\vdfi.txt',"a+")
#
x = 0 
for files in my_files:
    FR=open(files,'r')
    arr=np.loadtxt(FR.name)
    Vpara=arr[:,0]; Vperp=arr[:,1];F=arr[:,2]; dF=arr[:,3]
#    F[:,i]#+=F[i+1]
    print(F)
    for i in F:
        print(i+(i+1))
#    print(F[:])
```

These are just two examples of what my input files look like.

vdfi-000-01000000
     -0.2900E+00      0.5000E-02      3.0000E+00      2.0000E+00
     -0.2900E+00      0.1000E-01      5.0000E+00      3.0000E+00
     -0.2900E+00      0.1500E-01      7.0000E+00      4.0000E+00
     -0.2900E+00      0.2000E-01      9.0000E+00      5.0000E+00
     -0.2900E+00      0.2500E-01      1.1000E+01      6.0000E+00
     -0.2900E+00      0.3000E-01      0.0000E+00      7.0000E+00
     -0.2900E+00      0.3500E-01      0.0000E+00      0.0000E+00
     -0.2900E+00      0.4000E-01      0.0000E+00      0.0000E+00
     -0.2900E+00      0.4500E-01      0.0000E+00      0.0000E+00
     -0.2900E+00      0.5000E-01      0.0000E+00      0.0000E+00
      ...             ...             ...             ...

vdfi-001-01000000
     -0.2900E+00      0.5000E-02      2.0000E+00      8.0000E+00
     -0.2900E+00      0.1000E-01      4.0000E+00      3.1000E+00
     -0.2900E+00      0.1500E-01      6.0000E+00      6.0000E+00
     -0.2900E+00      0.2000E-01      8.0000E+00      4.0000E+00
     -0.2900E+00      0.2500E-01      1.0000E+01      4.0000E+00
     -0.2900E+00      0.3000E-01      0.0000E+00      1.0000E+00
     -0.2900E+00      0.3500E-01      0.0000E+00      0.0000E+00
     -0.2900E+00      0.4000E-01      0.0000E+00      0.0000E+00
     -0.2900E+00      0.4500E-01      0.0000E+00      0.0000E+00
     -0.2900E+00      0.5000E-01      0.0000E+00      1.0000E+00
      ...             ...             ...             ...

The expected output for just these two files would be:
vdfi.txt
     -0.2900E+00      0.5000E-02      5.0000E+00      1.0000E+01
     -0.2900E+00      0.1000E-01      9.0000E+00      6.1000E+00
     -0.2900E+00      0.1500E-01      1.3000E+01      1.0000E+01
     -0.2900E+00      0.2000E-01      1.7000E+01      9.0000E+00
     -0.2900E+00      0.2500E-01      2.1000E+01      1.0000E+01
     -0.2900E+00      0.3000E-01      0.0000E+00      8.0000E+00
     -0.2900E+00      0.3500E-01      0.0000E+00      0.0000E+00
     -0.2900E+00      0.4000E-01      0.0000E+00      0.0000E+00
     -0.2900E+00      0.4500E-01      0.0000E+00      0.0000E+00
     -0.2900E+00      0.5000E-01      0.0000E+00      1.0000E+00
      ...             ...             ...             ...

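Now extend this from two files to more than 100. In the end I need a single file in which each value in the third and fourth columns is the sum of the corresponding values from all the input files. Any suggestions are appreciated.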

1 Answer:

Answer 0: (score: 0)

Thanks to a good friend of mine, I was able to find the answer. I am writing it down here so that it can help others as well.

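```
import os
import numpy as np
import pandas as pd

path = "C:\\Users\\hp\\Desktop\\test\\New folder\\"
final_df = None
# Sort the names so the files are processed in a reproducible order
for file in sorted(os.listdir(path)):
    # Columns are separated by a variable number of spaces, so split on any whitespace
    df = pd.read_csv(os.path.join(path, file), header=None, sep=r'\s+')
    # Convert to numeric (raises if a file contains a non-numeric entry)
    for col in df:
        df[col] = pd.to_numeric(df[col])
    if final_df is None:
        # The first file supplies columns 0 and 1, which stay unchanged
        final_df = df.copy()
    else:
        # Accumulate the third and fourth columns row by row
        final_df[2] = final_df[2] + df[2]
        final_df[3] = final_df[3] + df[3]
# Write the summed table to the output file
np.savetxt('C:\\Users\\hp\\Desktop\\test\\vdf1.txt', final_df, fmt='%16.4e', newline="\r\n")
```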
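
For anyone who would rather stay with NumPy, as in the question, the same accumulation can probably be done without pandas. The following is only a minimal sketch, not a tested replacement for the code above. The function name `sum_columns` and its `sum_cols` parameter are made up for illustration. The demo writes the first three rows of the two sample files into a temporary directory instead of reading the real folder. It assumes every file has the same rows in the same order.

```python
import glob
import os
import tempfile

import numpy as np


def sum_columns(file_paths, sum_cols=(2, 3)):
    """Sum the given columns element-wise across whitespace-delimited files.

    All other columns are taken from the first file. Every file is assumed
    to contain the same rows in the same order (hypothetical helper).
    """
    total = None
    cols = list(sum_cols)
    for file_path in file_paths:
        # ndmin=2 keeps the array two-dimensional even for a one-row file
        arr = np.loadtxt(file_path, ndmin=2)
        if total is None:
            total = arr.copy()
        else:
            if arr.shape != total.shape:
                raise ValueError(
                    f"{file_path} has shape {arr.shape}, expected {total.shape}"
                )
            # Add only the selected columns; the rest stay as in the first file
            total[:, cols] += arr[:, cols]
    if total is None:
        raise ValueError("no input files were given")
    return total


# Demo data: the first three rows of the two sample files from the question
sample_000 = """\
     -0.2900E+00      0.5000E-02      3.0000E+00      2.0000E+00
     -0.2900E+00      0.1000E-01      5.0000E+00      3.0000E+00
     -0.2900E+00      0.1500E-01      7.0000E+00      4.0000E+00
"""
sample_001 = """\
     -0.2900E+00      0.5000E-02      2.0000E+00      8.0000E+00
     -0.2900E+00      0.1000E-01      4.0000E+00      3.1000E+00
     -0.2900E+00      0.1500E-01      6.0000E+00      6.0000E+00
"""

with tempfile.TemporaryDirectory() as tmp_dir:
    samples = (("vdfi-000-01000000", sample_000), ("vdfi-001-01000000", sample_001))
    for name, text in samples:
        with open(os.path.join(tmp_dir, name), "w") as handle:
            handle.write(text)
    # Sorted so the files are always processed in the same order
    files = sorted(glob.glob(os.path.join(tmp_dir, "vdfi-*-01000000")))
    result = sum_columns(files)
    # Write the summed table next to the inputs (inside the temporary directory)
    np.savetxt(os.path.join(tmp_dir, "vdfi.txt"), result, fmt="%16.4E")

print(result)
```

To run it on the real data, replace the temporary directory with the actual folder and keep the output file outside the glob pattern so it is not read back in as an input. Note that `%16.4E` writes numbers such as `-2.9000E-01`, which has the same value as the `-0.2900E+00` in the input files but a different layout.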