Question

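I need your help solving this problem with Python. I have a .txt file named good_files.txt in which each line is a path to another file (good_files.txt and the directory containing those files are in the same directory). From each file I have to pull together three columns of data in order to do a curve fit. For example, a file is structured like this: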

1000.415915     225.484744      -2.012516
2.000945

215     0       0
219     0       0
222     4       0
224     70      70
226     696     696
229     999     1000
233     1001    1000
238     1001    1000

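So I have to drop the first 2 lines and keep the 3 columns, then drop the third column and keep only the first two. The first column is my x coordinate and the second is my y coordinate. Using my x and y and the error function erf, I have to perform a curve fit.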

At the moment, the only code I have written is for reading good_files.txt:

    def ReadFromFile (fileName):
        sourceFile = open (fileName, 'r')
        text = []
        for adress in sourceFile.readlines ():
            # strip the trailing newline, if there is one
            if '\n' in adress: text.append (adress [:-1])
            else: text.append (adress)
        sourceFile.close()  # close before returning, otherwise this line is never reached
        return text

    def WriteToFile (text):
        resultFile = open ('result.txt', 'w')
        for data in text:
            resultFile.write (data + '\n')
        resultFile.close()

    adresses = ReadFromFile ('good_files.txt')
    for adress in adresses:
        text = ReadFromFile (adress)
        WriteToFile (text)

Sorry, but I'm a coding noob at the moment. Thanks for your help, guys <3

Answer 1

You can use pandas to help read and combine the files. Assuming your ReadFromFile function gives you a clean list of file names, you can do the following:

import pandas as pd

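def ReadFromFile (fileName):
    sourceFile = open (fileName, 'r')
    text = []
    for adress in sourceFile.readlines ():
        # strip the trailing newline, if there is one
        if '\n' in adress: text.append (adress [:-1])
        else: text.append (adress)
    sourceFile.close()  # close before returning, otherwise this line is never reached
    return text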

adresses = ReadFromFile('good_files.txt')
df_list = [] # create empty list that will hold your dataframes from the separate files

for adress in adresses:
    # load each file, skipping the first two rows of data, splitting columns by whitespace, and setting the column names
    # sep=r'\s+' splits on any run of whitespace (delim_whitespace is deprecated in recent pandas versions)
    df_temp = pd.read_csv(adress, skiprows=2, sep=r'\s+', header=None, names=['x', 'y', 'remove'])

    # add the 'x' and 'y' columns to the list of data frames to combine (exclude 'remove' column)
    df_list.append(df_temp[['x', 'y']])

df = pd.concat(df_list) # Combine all the DataFrames into one big one

This should give you a single DataFrame with x and y columns.
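For the fitting step itself, here is a minimal sketch using scipy.optimize.curve_fit and scipy.special.erf. The model function erf_step, its four parameters (amplitude, center, width, offset), and the way the starting values are guessed are all my own assumptions, not something given in the question, so adjust them to whatever form of the error function you actually need. The sample data are the rows from the example file in the question.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import erf


def erf_step(x, amplitude, center, width, offset):
    """Smooth step built from erf (assumed model, not from the question).

    Rises from `offset` to `offset + amplitude` around `center`.
    """
    return offset + 0.5 * amplitude * (1.0 + erf((x - center) / (width * np.sqrt(2.0))))


def fit_erf(x, y):
    """Fit erf_step to (x, y); returns the best-fit parameters and their covariance."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Heuristic starting values taken from the data itself (an assumption):
    # full height of the step, the x whose y is closest to half height,
    # a tenth of the x range as the width, and the lowest y as the offset.
    half_height = 0.5 * (y.max() + y.min())
    p0 = [
        y.max() - y.min(),
        x[np.argmin(np.abs(y - half_height))],
        (x.max() - x.min()) / 10.0,
        y.min(),
    ]
    popt, pcov = curve_fit(erf_step, x, y, p0=p0)
    return popt, pcov


# Rows copied from the example file in the question (first two columns only).
x = [215, 219, 222, 224, 226, 229, 233, 238]
y = [0, 0, 4, 70, 696, 999, 1001, 1001]

popt, pcov = fit_erf(x, y)
print("amplitude, center, width, offset =", popt)
```

With the DataFrame from above you would call fit_erf(df['x'], df['y']). Keep in mind that df holds the rows of all files together; if each file should get its own curve, call fit_erf on each df_temp inside the loop instead.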
