Question

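I wrote some Python code to slice text files. The text files have now become very large, which makes the computation extremely slow. I would like to convert the code to C in the hope that it runs faster. Can you help me? This is my Python code: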

import numpy as np

T = 20  # duration of each output slice, in seconds

# 'path.txt' lists the files that need to be sliced, one path per line
with open('path.txt', 'r') as paths_list:
    for line in paths_list:
        file_path = line.strip()
        # np.loadtxt opens the file itself, so no separate open() is needed
        data = np.loadtxt(file_path)
        t = data[:, 0]  # the first column: time
        x = data[:, 1]  # the second column: photon count

        # The file name, minus its 4-character extension,
        # is the time at which the file needs to be sliced
        start = float(file_path[:-4])

        # index of the first sample after the start time
        indstart_data = np.where(t > start)[0][0]

        # index of the last sample before the end time,
        # which is 20 seconds after the start time
        indeind_data = np.where(t < start + T)[0][-1]

        newdata = data[indstart_data:indeind_data, :]

        # using the start time to name the output file
        np.savetxt(str(start), newdata)

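What it does: it takes a file containing the paths of the files I want to slice. Each of these files has one column with the time and one column with the photon count. Each file is named after the start time at which its output file must begin.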

Each file is sliced from the start time to 20 seconds later (T = 20). The slice is saved to an output file, which is also named after the start time.
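The slicing step described here could be sketched in C roughly as follows. This is only a sketch under assumptions: the function name `slice_stream` is hypothetical, and it assumes the rows are sorted by increasing time so that reading can stop as soon as the window has been passed. It also keeps every row with start < time < start + T, whereas the Python slice `data[indstart_data:indeind_data]` excludes its end index and therefore drops the last such row.

```c
#include <stdio.h>

/* Copy every "time count" row with start < time < start + T from in to out.
 * Assumption: rows are sorted by increasing time, so the loop can stop at
 * the first row at or past the end of the window.
 * Returns the number of rows written. */
long slice_stream(FILE *in, FILE *out, double start, double T)
{
    double t, x;
    long written = 0;

    while (fscanf(in, "%lf %lf", &t, &x) == 2) {
        if (t >= start + T)
            break;              /* past the window: skip the rest of the file */
        if (t > start) {
            /* "%.18e" is the default number format of np.savetxt */
            fprintf(out, "%.18e %.18e\n", t, x);
            written++;
        }
    }
    return written;
}
```

Because rows are handled one at a time, the whole file never has to fit in memory, and with sorted input everything after the end of the window is never read.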

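Any help is appreciated! I have more or less figured out how to slice the files in C. What I cannot work out how to convert is mainly the first part of the code: how to go from the file containing the paths to actually opening the input files, and how to use the file name as input.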
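One possible way to handle that first part in C is sketched below: read the path list line by line with `fgets`, strip the line ending, and parse the start time from the file name with `strtod`. The helper names (`strip_newline`, `start_time_from_path`, `process_path_list`) are hypothetical. The sketch assumes `/` as the directory separator and a 4-character extension such as `.txt`. Unlike the Python code, it ignores any directory part of the path before parsing the number.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Strip a trailing "\n" or "\r\n" left by fgets, in place. */
void strip_newline(char *s)
{
    s[strcspn(s, "\r\n")] = '\0';
}

/* Parse the start time from a path such as "data/1234.5.txt".
 * Directory components and the 4-character extension are ignored.
 * Returns 1 on success, 0 if the remaining name is not a number. */
int start_time_from_path(const char *path, double *start)
{
    const char *name = strrchr(path, '/');
    char buf[256];
    char *end;
    size_t len;

    name = name ? name + 1 : path;      /* skip past the last '/' if any */
    len = strlen(name);
    if (len <= 4 || len - 4 >= sizeof buf)
        return 0;
    memcpy(buf, name, len - 4);         /* copy the name without ".txt" */
    buf[len - 4] = '\0';

    *start = strtod(buf, &end);
    return end != buf && *end == '\0';  /* the whole name must be numeric */
}

/* Visit every path listed in list_name.
 * Returns how many paths had a valid start time, or -1 if the list
 * itself could not be opened. */
int process_path_list(const char *list_name)
{
    char path[4096];
    double start;
    int count = 0;
    FILE *list = fopen(list_name, "r");

    if (list == NULL)
        return -1;
    while (fgets(path, sizeof path, list) != NULL) {
        strip_newline(path);
        if (path[0] == '\0' || !start_time_from_path(path, &start))
            continue;                   /* skip blank or non-numeric names */
        /* here: fopen(path, "r"), slice the rows, write the output file */
        printf("%s -> start = %g\n", path, start);
        count++;
    }
    fclose(list);
    return count;
}
```

Calling `process_path_list("path.txt")` from `main` would then walk the list; the comment inside the loop marks where each input file would be opened and sliced.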

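If you have suggestions for making it run faster, I would really appreciate them! I have a few hundred input files, most of them several GB in size.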

Right now I have this very incomplete code in C:

#include <stdio.h>

int main(void)
{
    /* placeholder: a single input file, not yet taken from path.txt */
    const char *pathies = "input.txt";
    FILE *file = fopen(pathies, "r");   /* opened once, before the loop */
    FILE *fp = fopen("test.txt", "w");
    long double t, x;
    int i = 0;

    if (file == NULL || fp == NULL)
        return 1;

    /* scan the file row by row until fscanf no longer finds two numbers */
    while (fscanf(file, "%Lf %Lf", &t, &x) == 2)
    {
        /* print contents of file */
        printf("%d: %Lf %Lf\n", i++, t, x);

        /* put certain lines in a new text file */
        if (t > 20 && t < 40)
        {
            fprintf(fp, "%Lf %Lf\n", t, x);
        }
    }

    fclose(file);
    fclose(fp);
    return 0;
}

Converting short Python code to C to make it run faster

0 answers: