I am a Python beginner, and I am having a problem with a loop that combines glob.glob and np.arange.
I have one hundred CSV files, named as follows:
13oct_speed_1kmh.csv
13oct_speed_2kmh.csv
and so on.
All files have the following structure:
Distance ID
2.14 A
82.12 B
12.45 A
21.07 B
11.42 A
I want to filter out distances based on a set of buffer thresholds:
np.arange(10,100,30)
array([10, 40, 70])
I used the following code:
def buffer(value, threshold):
    return value < threshold

files = glob.glob("13oct_speed_*.csv")
for f in files:
    df = pd.read_csv(f)
    for i in np.arange(10, 100, 30):
        threshold = i
        result_df = df[buffer(df["Distance"], threshold)]
        csvFileName = f + 'Buffer_' + str(threshold) + ".csv"
        result_df.to_csv(csvFileName, sep=",")
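But the result is very strange: the loop never seems to stop (it keeps saving new files).
What I want is to filter the Distance column of each file by each buffer threshold.
My expected output is: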
13oct_speed_1kmh_buffer10.csv
13oct_speed_1kmh_buffer40.csv
13oct_speed_1kmh_buffer70.csv
13oct_speed_2kmh_buffer10.csv
13oct_speed_2kmh_buffer40.csv
13oct_speed_2kmh_buffer70.csv
How can I fix this? Thanks.
Answer 0 (score: 2)
You can omit the helper function and build csvFileName with format to get the expected output. os.path.splitext splits the path into the file name and its extension:
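import glob
import os

import numpy as np
import pandas as pd

files = glob.glob("csv/13oct_speed_*.csv")
for f in files:
    df = pd.read_csv(f)
    for threshold in np.arange(10, 100, 30):
        # Keep only the rows whose Distance is below the threshold
        result_df = df[df["Distance"] < threshold]
        # Split "csv/13oct_speed_1kmh.csv" into the name and ".csv"
        name, extension = os.path.splitext(f)
        csvFileName = "{}_Buffer{}{}".format(name, threshold, extension)
        print(csvFileName)
        result_df.to_csv(csvFileName)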
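One thing to watch for: the generated files are written next to the originals and still match the pattern 13oct_speed_*.csv, so each rerun of the script would pick them up as inputs and produce even more files. That may be what makes the loop look endless. The following is only a minimal sketch of one way to guard against it, not part of the original answer. The helper names buffered_name, is_generated and select_inputs, and the "_buffer" marker, are assumptions introduced here for illustration.

```python
import os

# Hypothetical marker, chosen to match the expected names such as
# 13oct_speed_1kmh_buffer10.csv
MARKER = "_buffer"


def buffered_name(path, threshold):
    """Insert the marker and threshold between the file name and its extension."""
    name, extension = os.path.splitext(path)
    return "{}{}{}{}".format(name, MARKER, threshold, extension)


def is_generated(path):
    """Return True if the file name already contains the marker (case-insensitive)."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return MARKER in stem.lower()


def select_inputs(paths):
    """Keep only files that do not look like earlier outputs, in sorted order."""
    return sorted(p for p in paths if not is_generated(p))
```

In the loop above, this would mean iterating over select_inputs(glob.glob("csv/13oct_speed_*.csv")) and saving each result under buffered_name(f, threshold). Writing the outputs to a separate directory would be another way to keep them out of the glob pattern.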