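Run a Python script on multiple files in a folder

Time: 2018-07-20 19:57:42

Tags: python pandas csv

I have multiple CSV files in a folder (file1, file2, file3, file4, file5, ...).

I only know how to import one file, run the commands, and output the converted file, as shown in the code below. I want to run the commands on multiple CSV files at once. Can anyone help?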

convert.py:

import pandas as pd
import numpy as np

#read file
df = pd.read_csv("file1.csv")

#make conversion
df['Time taken'] = pd.to_datetime(df['Time taken'])
df['Time taken'] = df['Time taken'].dt.hour + df['Time taken'].dt.minute / 60

#output file
df.to_csv('file1_converted.csv', index = False)

I started with the code shown below, but it only produced a single output (*.csv) from one arbitrary CSV file. I want a separate output for each file.

import glob
import pandas as pd
import numpy as np

files = glob.glob('folder/*.csv')
for file in files:
    df = pd.read_csv(file)

#make conversion
df['Time taken'] = pd.to_datetime(df['Time taken'])
df['Time taken'] = df['Time taken'].dt.hour + df['Time taken'].dt.minute / 60

#output file
df.to_csv('*.csv', index = False)

4 answers:

Answer 0 (score: 2)

Indent the code that performs the DataFrame conversion so that it is inside the for loop, as shown below:

import glob
import pandas as pd

files = glob.glob('folder/*.csv')
for file in files:
    df = pd.read_csv(file)

    #make conversion (now indented, so it runs once per file)
    df['Time taken'] = pd.to_datetime(df['Time taken'])
    df['Time taken'] = df['Time taken'].dt.hour + df['Time taken'].dt.minute / 60

    #output file (also inside the loop, one output per input)
    df.to_csv(file[:-4] + '_converted.csv', index = False)

Answer 1 (score: 1)

You just need to indent the file-writing code so that it runs inside the loop; otherwise only the last file will be written:

import glob
import pandas as pd
import numpy as np

files = glob.glob('folder/*.csv')
for file in files:
    df = pd.read_csv(file)

    #make conversion
    df['Time taken'] = pd.to_datetime(df['Time taken'])
    df['Time taken'] = df['Time taken'].dt.hour + df['Time taken'].dt.minute / 60

    #output file: one per input, e.g. folder/file1.csv -> folder/file1_converted.csv
    df.to_csv(file[:-4] + '_converted.csv', index = False)

Answer 2 (score: 1)

import pandas as pd

total_no_file=10

for i in range(total_no_file):
    #input name: file1.csv, file2.csv, ...
    file_name="file"+str(i+1)+".csv"
    df = pd.read_csv(file_name)

    #make conversion
    df['Time taken'] = pd.to_datetime(df['Time taken'])
    df['Time taken'] = df['Time taken'].dt.hour + df['Time taken'].dt.minute / 60

    #output name: file1_converted.csv, file2_converted.csv, ...
    file_name="file"+str(i+1)+"_converted.csv"
    df.to_csv(file_name, index = False)

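Answer 3 (score: 1)

So there are two problems with your code. First, the indentation is all messed up, so the for loop only reads the different CSV files into the same variable. Second, you should give each converted CSV file that you write to disk a distinct name. So the following should work for you: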

{{1}}
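
A minimal sketch of the two fixes described in this answer (everything inside the loop, and a distinct output name per input), not necessarily the code the answerer posted. The function name `convert_folder` and the `_converted` suffix are assumptions chosen for illustration; the column name `Time taken` and the conversion itself come from the question.

```python
from pathlib import Path

import pandas as pd


def convert_folder(folder):
    """Convert every CSV in `folder`, writing one `<name>_converted.csv` per input.

    `convert_folder` is a hypothetical helper name, not part of pandas.
    Returns the list of output paths that were written.
    """
    written = []
    for path in sorted(Path(folder).glob("*.csv")):
        # Skip outputs from an earlier run so they are not converted twice.
        if path.stem.endswith("_converted"):
            continue

        df = pd.read_csv(path)

        # Same conversion as in the question: time of day as fractional hours.
        times = pd.to_datetime(df["Time taken"])
        df["Time taken"] = times.dt.hour + times.dt.minute / 60

        # Distinct output name per input: file1.csv -> file1_converted.csv
        out_path = path.with_name(path.stem + "_converted.csv")
        df.to_csv(out_path, index=False)
        written.append(out_path)
    return written
```

Calling `convert_folder("folder")` would process every CSV in `folder`. Skipping names that already end in `_converted` is a design choice so the script can be rerun on the same folder without converting its own outputs.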