I have a DataFrame like this in Pandas:
NaN
1
NaN
452
1175
12
NaN
NaN
NaN
145
125
NaN
1259
2178
2514
1
On the other hand, I have another DataFrame:
1
2
3
4
5
6
I want to split the first one into separate sub-DataFrames, using the NaN rows as separators, like this:
DataFrame 1:
1
DataFrame 2:
452
1175
12
DataFrame 3: (empty)
DataFrame 4: (empty)
DataFrame 5:
145
125
DataFrame 6:
1259
2178
2514
1
How can I do this without a loop?
Answer 0 (score: 2)
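UPDATE: thanks to @piRSquared for pointing out that the old solution (shown further down) does not work for DFs / Series with non-numeric indexes. Here is a more general solution: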
dfs = [x.dropna()
       for x in np.split(df, np.arange(len(df))[df['column'].isnull().values])]
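To illustrate why row positions rather than index labels matter here, below is a minimal sketch on a small made-up frame with a string index. The sample data and the names `nan_pos` and `chunks` are my own, not from the answer. It splits an array of row positions and then selects with `iloc`, so it is a variation on the snippet above rather than the exact same call:

```python
import numpy as np
import pandas as pd

# Hypothetical sample: same NaN-separator layout, but with string labels.
df = pd.DataFrame({"column": [np.nan, 1, np.nan, 452, 1175]},
                  index=list("abcde"))

# Integer positions of the NaN rows, independent of the index labels.
nan_pos = np.flatnonzero(df["column"].isnull().to_numpy())

# Split the row positions at each NaN; every chunk starts with a NaN row.
chunks = np.split(np.arange(len(df)), nan_pos)

# Select each chunk by position and drop the leading NaN separator.
dfs = [df.iloc[c].dropna() for c in chunks]

for piece in dfs:
    print(piece)
```

With labels like 'a' to 'e', adding 1 to the index values (as the old answer does) is not possible, whereas the positional version does not care what the labels are.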
OLD answer:
IIUC you can do it this way:
Source DF:
In [40]: df
Out[40]:
    column
0      NaN
1      1.0
2      NaN
3    452.0
4   1175.0
5     12.0
6      NaN
7      NaN
8      NaN
9    145.0
10   125.0
11     NaN
12  1259.0
13  2178.0
14  2514.0
15     1.0
Solution:
In [31]: dfs = [x.dropna()
                for x in np.split(df, df.index[df['column'].isnull()].values+1)]
In [32]: dfs[0]
Out[32]:
Empty DataFrame
Columns: [column]
Index: []
In [33]: dfs[1]
Out[33]:
   column
1     1.0
In [34]: dfs[2]
Out[34]:
   column
3   452.0
4  1175.0
5    12.0
In [35]: dfs[3]
Out[35]:
Empty DataFrame
Columns: [column]
Index: []
In [36]: dfs[4]
Out[36]:
Empty DataFrame
Columns: [column]
Index: []
In [38]: dfs[5]
Out[38]:
    column
9    145.0
10   125.0
In [39]: dfs[6]
Out[39]:
    column
12  1259.0
13  2178.0
14  2514.0
15     1.0
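If the leading empty piece and the `np.split` call are unwanted, one alternative is to label each row with a running count of the NaN separators seen so far and group on that label. This is not part of the answer above, just a sketch of a different technique (`cumsum` plus `groupby`). The names `is_nan`, `group_id`, `grouped` and `empty` are illustrative, and it assumes NaN rows are pure separators and that any rows before the first NaN can be discarded:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"column": [np.nan, 1, np.nan, 452, 1175, 12,
                              np.nan, np.nan, np.nan, 145, 125,
                              np.nan, 1259, 2178, 2514, 1]})

is_nan = df["column"].isnull()
# Running count of NaN rows: every row after the k-th NaN gets label k.
group_id = is_nan.cumsum()

# Group only the non-NaN rows. The labels are passed as a plain array so
# the grouping is positional and does not depend on the index.
values = df[~is_nan]
grouped = {int(k): g
           for k, g in values.groupby(group_id[~is_nan].to_numpy())}

# Labels with no rows (consecutive NaNs) become empty frames.
empty = df.iloc[0:0]
dfs = [grouped.get(k, empty) for k in range(1, int(is_nan.sum()) + 1)]

for number, piece in enumerate(dfs, start=1):
    print("DataFrame {}:".format(number))
    print(piece)
```

Here `dfs[0]` through `dfs[5]` line up with DataFrame 1 through DataFrame 6 in the question, with the two empty pieces kept in place.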
Answer 1 (score: 1)
w = np.append(np.where(np.isnan(df.iloc[:, 0].values))[0], len(df))
splits = {'DataFrame{}'.format(c): df.iloc[i+1:j]
          for c, (i, j) in enumerate(zip(w, w[1:]))}
Print splits to demonstrate:
for k, v in splits.items():
    print(k)
    print(v)
    print()
DataFrame0
     0
1  1.0

DataFrame1
        0
3   452.0
4  1175.0
5    12.0

DataFrame2
Empty DataFrame
Columns: [0]
Index: []

DataFrame3
Empty DataFrame
Columns: [0]
Index: []

DataFrame4
        0
9   145.0
10  125.0

DataFrame5
         0
12  1259.0
13  2178.0
14  2514.0
15     1.0
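The answer comes without an explanation, so here is a small self-contained sketch of what `w` and the `zip(w, w[1:])` pairing produce on the sample data. The frame construction and the names `bounds` and `pieces` are my additions; the `w` line is copied from the answer:

```python
import numpy as np
import pandas as pd

# Sample data from the question, as a single unnamed column (labelled 0).
df = pd.DataFrame([np.nan, 1, np.nan, 452, 1175, 12, np.nan, np.nan, np.nan,
                   145, 125, np.nan, 1259, 2178, 2514, 1])

# Positions of the NaN rows, plus len(df) as a final end marker.
w = np.append(np.where(np.isnan(df.iloc[:, 0].values))[0], len(df))

# Consecutive markers (i, j) delimit one piece: rows i+1 up to j-1.
bounds = [(int(i) + 1, int(j)) for i, j in zip(w, w[1:])]
print(bounds)
# [(1, 2), (3, 6), (7, 7), (8, 8), (9, 11), (12, 16)]

# Positional slices; a pair like (7, 7) yields an empty DataFrame.
pieces = [df.iloc[start:stop] for start, stop in bounds]
```

Pairs such as (7, 7) are empty slices, which is where the empty DataFrames in the output come from. Rows before the first NaN would fall outside every pair, so this relies on the data starting with a NaN, as it does here.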