Appending CSV files and assigning a column

Date: 2018-04-12 00:25:49

Tags: python pandas

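I am using Python to loop through a list of CSV files (the file locations are stored in df) and append them into a single DataFrame. The script is nearly finished, but I am having trouble adding a column to each DataFrame that holds the matching name reference from df.

I have tried several variations of the script below. The current version loops through each CSV correctly but returns only one Class reference instead of all of them. Any help would be greatly appreciated.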

import pandas as pd

df = pd.read_csv('MLBPitchesvsLHH.csv') #File contains 4 columns of data - Column1=Pitch; Column2=FileName; Column3=FileLoc; Column4=Class
df.to_dict('series')

combo_df = pd.DataFrame()

for file in df.loc[ : ,"FileLoc"]: #This loop opens each file located in df
    df1 = pd.read_csv(file)  
    for pitch in df.loc[ : ,"Class"]: #This loop is supposed to add a column to df1 that includes the "Class" reference from df
        df1 = df1.assign(pitch=pitch)

    combo_df = combo_df.append(df1, ignore_index=True)

combo_df.to_csv("Pitches.csv")

1 answer:

Answer 0 (score: 1)

Based on your description, using assign together with a dict should achieve what you want.

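frames = []

for file in df.loc[:, "FileLoc"]:  # open each file listed in df
    df1 = pd.read_csv(file)
    # Assumes each CSV has its own "Class" column; adds one constant column
    # per distinct Class value, named after and filled with that value
    df1 = df1.assign(**dict(zip(df1["Class"].astype(str), df1["Class"].astype(str))))
    frames.append(df1)

# DataFrame.append was removed in pandas 2.0; pd.concat replaces it
combo_df = pd.concat(frames, ignore_index=True)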
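
To make the effect of the assign(**dict(zip(...))) call above concrete, here is a small sketch on a made-up frame. The speed column and the FF/SL labels are invented for illustration and do not come from the question's files. Each distinct value becomes its own constant column, which is not the same as a single column holding one label per file.

```python
import pandas as pd

# Hypothetical data, not taken from the question's CSV files
df1 = pd.DataFrame({"speed": [91.0, 84.5], "Class": ["FF", "SL"]})

# Builds {"FF": "FF", "SL": "SL"}; keyword names must be strings, hence astype(str)
labels = dict(zip(df1["Class"].astype(str), df1["Class"].astype(str)))

# assign(**labels) adds one constant column per dictionary key
wide = df1.assign(**labels)

print(wide.columns.tolist())  # ['speed', 'Class', 'FF', 'SL']
```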

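frames = []

# Class comes from df here: zip pairs each file location with its own Class value
for file, pitch in zip(df.loc[:, "FileLoc"], df.loc[:, "Class"]):
    df1 = pd.read_csv(file)
    df1 = df1.assign(pitch=pitch)  # same Class label on every row of this file
    frames.append(df1)

# DataFrame.append was removed in pandas 2.0; pd.concat replaces it
combo_df = pd.concat(frames, ignore_index=True)

combo_df.to_csv("Pitches.csv")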
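
As for why the nested loop in the question returned only one Class: the inner loop runs over every Class value for each file, and each assign(pitch=pitch) overwrites the previous one, so every file ends up labelled with the last Class in df. The sketch below contrasts the two loops. It uses in-memory CSV text in place of real files, and the contents and the FF/SL labels are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the files listed in df["FileLoc"]
csv_texts = ["speed\n91.0\n92.3\n", "speed\n84.5\n"]
classes = ["FF", "SL"]  # stand-in for df["Class"]

# Nested loop: each assign overwrites the previous one, so only "SL" survives
nested = []
for text in csv_texts:
    df1 = pd.read_csv(io.StringIO(text))
    for pitch in classes:
        df1 = df1.assign(pitch=pitch)
    nested.append(df1)
nested_df = pd.concat(nested, ignore_index=True)

# zip: each file is paired with its own Class value
paired = [
    pd.read_csv(io.StringIO(text)).assign(pitch=pitch)
    for text, pitch in zip(csv_texts, classes)
]
paired_df = pd.concat(paired, ignore_index=True)

print(nested_df["pitch"].tolist())  # ['SL', 'SL', 'SL']
print(paired_df["pitch"].tolist())  # ['FF', 'FF', 'SL']
```

Note that to_csv writes the row index as an unnamed first column by default. Passing index=False omits it if that column is not wanted in Pitches.csv.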