Create a new dataframe from existing dataframe columns in pandas

Time: 2018-03-04 19:15:26

Tags: pandas dataframe data-science

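I have the following data structure in my dataframes. I want to create a new dataframe laid out as follows:

newDF would contain the columns Year, DF1, DF2, DF3

Each of the DF columns should contain the data under W from the corresponding dataframe.

I did: pd.concat([DF1['W'], DF2['W'], DF3['W']], axis=1, keys=['DF1', 'DF2', 'DF3'])

I got a result, but I don't know how to get the year data into it.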

DF1

    Year    Conf    W   L   T   Pct     SRS     SOS     AP Pre  AP High     AP Post     Coach(es)   Bowl
0   2017    SEC     13  2   0   0.867   22.47   7.01    15.0    2.0     2.0     Kirby Smart (13-2)  Rose Bowl-W, College Football Championship-L
1   2016    SEC     8   5   0   0.615   3.64    2.57    18.0    9.0     NaN     Kirby Smart (8-5)   Liberty Bowl-W
2   2015    SEC     10  3   0   0.769   8.98    1.83    9.0     7.0     NaN     Bryan McClendon (1-0), Mark Richt (9-3)     TaxSlayer Bowl-W
3   2014    SEC     10  3   0   0.769   18.84   5.07    12.0    6.0     9.0     Mark Richt (10-3)   Belk Bowl-W
4   2013    SEC     8   5   0   0.615   12.82   7.59    5.0     5.0     NaN     Mark Richt (8-5)    Gator Bowl-L

DF2

    Year    Conf    W   L   T   Pct     SRS     SOS     AP Pre  AP High     AP Post     Coach(es)   Bowl
0   2017    Big Ten     8   5   0   0.615   13.44   6.98    11.0    7.0     NaN     Jim Harbaugh (8-5)  Outback Bowl-L
1   2016    Big Ten     10  3   0   0.769   17.56   4.79    7.0     2.0     10.0    Jim Harbaugh (10-3)     Orange Bowl-L
2   2015    Big Ten     10  3   0   0.769   16.34   4.57    NaN     12.0    12.0    Jim Harbaugh (10-3)     Citrus Bowl-W
3   2014    Big Ten     5   7   0   0.417   1.82    3.65    NaN     NaN     NaN     Brady Hoke (5-7)    NaN
4   2013    Big Ten     7   6   0   0.538   5.53    3.30    17.0    11.0    NaN     Brady Hoke (7-6)    Buffalo Wild Wings Bowl-L

DF3

    Year    Conf    W   L   T   Pct     SRS     SOS     AP Pre  AP High     AP Post     Coach(es)   Bowl    Unnamed: 13
0   2017    ACC     7   6   0   0.538   8.07    5.76    3.0     3.0     NaN     Jimbo Fisher (5-6), Odell Haggins (2-0)     Independence Bowl-W     NaN
1   2016    ACC     10  3   0   0.769   15.01   6.16    4.0     2.0     8.0     Jimbo Fisher (10-3)     Orange Bowl-W   NaN
2   2015    ACC     10  3   0   0.769   13.59   1.97    10.0    9.0     14.0    Jimbo Fisher (10-3)     Peach Bowl-L    NaN
3   2014    ACC     13  1   0   0.929   14.48   5.13    1.0     1.0     5.0     Jimbo Fisher (13-1)     Rose Bowl-L     NaN
4   2013    ACC     14  0   0   1.000   23.36   1.29    11.0    1.0     1.0     Jimbo Fisher (14-0)     BCS Championship-W  NaN

When I run pd.concat([DF1['W'], DF2['W']], axis=1, keys=['DF1', 'DF2']), I get the W columns side by side but no year.

How can I get the year into the dataframe?

Thanks for your help.

1 answer:

Answer 0: (score: 1)

I think you need set_index, then concat, and finally reset_index to turn the Year index back into a column:

df = pd.concat([DF1.set_index('Year')['W'], 
                DF2.set_index('Year')['W'], 
                DF3.set_index('Year')['W']], axis=1, keys=['DF1', 'DF2','DF3']).reset_index()
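
To see why moving Year into the index matters, here is a minimal, self-contained sketch. The three small frames are stand-ins that hold only Year and W, with values taken from the W columns of the question's tables; which table corresponds to which DF name is an assumption made for illustration. Because concat with axis=1 aligns rows on the index, the wins end up next to the right year even when one frame is sorted differently.

```python
import pandas as pd

# Stand-in frames (assumption): only Year and W, other columns omitted.
years = [2017, 2016, 2015, 2014, 2013]
DF1 = pd.DataFrame({"Year": years, "W": [13, 8, 10, 10, 8]})
DF2 = pd.DataFrame({"Year": years, "W": [8, 10, 10, 5, 7]})
DF3 = pd.DataFrame({"Year": years, "W": [7, 10, 10, 13, 14]})

# Reorder one frame's rows to show that alignment happens on Year,
# not on row position.
DF2 = DF2.sort_values("Year").reset_index(drop=True)

# Each W Series is indexed by Year, so concat lines the values up by year.
wins = pd.concat(
    [DF1.set_index("Year")["W"],
     DF2.set_index("Year")["W"],
     DF3.set_index("Year")["W"]],
    axis=1,
    keys=["DF1", "DF2", "DF3"],
)

# reset_index turns the Year index back into an ordinary column.
newDF = wins.reset_index()
print(newDF)
```

Without set_index, each W Series carries only the default 0..4 row index, so concat has nothing but row position to align on and the year never reaches the result.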

Another, more dynamic solution using a list comprehension:

dfs = [x.set_index('Year')['W'] for x in [DF1,DF2,DF3]]
df = pd.concat(dfs, axis=1, keys=['DF1', 'DF2','DF3']).reset_index()
print (df)
   Year  DF1  DF2  DF3
0  2017   13  Ten    7
1  2016    8  Ten   10
2  2015   10  Ten   10
3  2014   10  Ten   13
4  2013    8  Ten   14
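
A possible variation on the same idea, sketched under assumptions: if the frames are kept in a dict, the labels and the data stay together and the keys argument can be dropped, since pd.concat uses the dict keys as the column labels. The name frames and the stand-in data (only Year and W, values taken from the question's W columns) are illustrative and not part of the original answer.

```python
import pandas as pd

# Stand-in frames (assumption): only Year and W, other columns omitted.
years = [2017, 2016, 2015, 2014, 2013]
frames = {
    "DF1": pd.DataFrame({"Year": years, "W": [13, 8, 10, 10, 8]}),
    "DF2": pd.DataFrame({"Year": years, "W": [8, 10, 10, 5, 7]}),
    "DF3": pd.DataFrame({"Year": years, "W": [7, 10, 10, 13, 14]}),
}

# Passing a dict to concat: the dict keys become the column labels.
newDF = pd.concat(
    {name: frame.set_index("Year")["W"] for name, frame in frames.items()},
    axis=1,
).reset_index()
print(newDF)
```

When W is parsed as numbers, every DF column in the result is numeric. The Ten entries in the printed output above look like "Big Ten" having been split across whitespace-delimited columns when the sample data was loaded, though that is only a guess.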