Select fixed-size ranges of a DataFrame by index and append them to a new DataFrame

Date: 2016-01-09 10:06:11

Tags: python pandas dataframe

I have a DataFrame:

resultsDf

which returns the following:

       0
0     100
1   -2800
2   -2800
3   -2800
0   -2800
1   -2800
2   -2900
3   -3000
0   -3000
1   -3000
2   -3000
3   -3000
0   -3000
1   -3000
2   -3000
3   -3000
.  
.  
.   
0  -3100
1  25500

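I want to extract fixed-size subsets of the DataFrame based on the index, i.e. each run of 0, 1, 2, 3.

Then I want to add each subset as a column of a new DataFrame, so the final DataFrame should look like this: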

  C1   C2   C3.....Cn
0
1
2
3

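1 Answer:

Answer 0: (score: 2)

You can convert the df to a numpy array with values and change its shape with reshape. Then you can set the new column names with a list comprehension over range: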

print(resultsDf)

      0
0   100
1 -2800
2 -2800
3 -2800
0 -2800
1 -2800
2 -2900
3 -3000
0 -3000
1 -3000
2 -3000
3 -3000
0 -3000
1 -3000
2 -3000
3 -3000

# Reshape the values into 4 rows; // keeps the column count an integer
df = pd.DataFrame(resultsDf.values.reshape((4, resultsDf.values.shape[0] // 4)))
# Name the columns C1..Cn
df.columns = ['C' + str(i) for i in range(1, len(df.columns) + 1)]

print(df)

     C1    C2    C3    C4
0   100 -2800 -2800 -2800
1 -2800 -2800 -2900 -3000
2 -3000 -3000 -3000 -3000
3 -3000 -3000 -3000 -3000
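
Note that reshape fills the array row by row, so in the output above each consecutive block of four values ends up as a row rather than a column. If each block should instead become its own column, as the layout in the question suggests, one option is to reshape block-wise and then transpose. This is only a minimal sketch, assuming the number of rows is an exact multiple of the block size; the sample data and the name block_size are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the 0..3 index repeated four times
values = [100, -2800, -2800, -2800,
          -2800, -2800, -2900, -3000,
          -3000, -3000, -3000, -3000,
          -3000, -3000, -3000, -3000]
block_size = 4  # assumed length of each repeating index block
resultsDf = pd.DataFrame(values, index=np.tile(np.arange(block_size), 4))

# One row per block (-1 lets numpy infer the number of blocks),
# then transpose so that each block becomes a column
blocks = resultsDf[0].values.reshape(-1, block_size).T
df = pd.DataFrame(blocks, columns=['C' + str(i) for i in range(1, blocks.shape[1] + 1)])
print(df)
```

With this layout C1 holds the first block, C2 the second, and so on. Passing order='F' to reshape with shape (block_size, -1) should give the same result without the transpose.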

If the last block is incomplete (its index does not repeat in full like the other blocks):

print(resultsDf)

        0
0     100
1   -2800
2   -2800
3   -2800
4   -2800
5   -2800
6   -2900
7   -3000
8   -3000
9   -3000
10  -3000
11  -3000
12  -3000
13  -3000
14  -3000
15  -3000
0     100
1   -2800
2   -2800
3   -2800
4   -2800
5   -2800
6   -2900
7   -3000
8   -3000
9   -3000
10  -3000
11  -3000
12  -3000
13  -3000
14  -3000
15  -3000
0   -3100
1   25500

# Use every row except the last two (the incomplete block): resultsDf[:-2]
df = pd.DataFrame(resultsDf[:-2].values.reshape(16, resultsDf[:-2].values.shape[0] // 16))
# Append the last two rows as an extra column; its remaining rows become NaN
df = pd.concat([df, resultsDf[-2:]], axis=1)
df.columns = ['C' + str(i) for i in range(1, len(df.columns) + 1)]

print(df)

      C1    C2     C3
0    100 -2800  -3100
1  -2800 -2800  25500
2  -2800 -2800    NaN
3  -2900 -3000    NaN
4  -3000 -3000    NaN
5  -3000 -3000    NaN
6  -3000 -3000    NaN
7  -3000 -3000    NaN
8    100 -2800    NaN
9  -2800 -2800    NaN
10 -2800 -2800    NaN
11 -2900 -3000    NaN
12 -3000 -3000    NaN
13 -3000 -3000    NaN
14 -3000 -3000    NaN
15 -3000 -3000    NaN
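
The concat step above relies on knowing in advance that exactly two rows are left over. As an alternative sketch (not part of the original answer), the blocks can be derived from the index itself: a new block starts wherever the index value is 0, and pivot then places each block in its own column, padding a shorter last block with NaN. The sample data is made up, the approach assumes every block's index starts at 0, and the column names value, block and pos are helpers introduced here:

```python
import pandas as pd

# Hypothetical sample data: two full 0..3 blocks and an incomplete third block
resultsDf = pd.DataFrame(
    [100, -2800, -2800, -2800, -2800, -2800, -2900, -3000, -3100, 25500],
    index=[0, 1, 2, 3, 0, 1, 2, 3, 0, 1],
)

tmp = resultsDf.rename(columns={0: 'value'})
# A new block starts at every index value 0; the running count numbers the blocks 1, 2, 3, ...
tmp['block'] = (tmp.index == 0).cumsum()
# Keep the position within the block as a regular column for pivot
tmp['pos'] = tmp.index

# Rows = position within a block, columns = block number; missing cells become NaN
df = tmp.pivot(index='pos', columns='block', values='value')
df.columns = ['C' + str(i) for i in df.columns]
df.index.name = None
print(df)
```

Because the block boundaries come from the index, this does not need the block length or the number of leftover rows to be hard-coded. The trade-off is that it breaks if a block's index does not begin at 0.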