Python: splitting a DataFrame based on row index

Time: 2018-05-28 22:15:30

Tags: python pandas dataframe

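I am trying to split a DataFrame into train, val and test DataFrames based on row index, e.g. observation 1 goes into train, 2 into val and 3 into test, but I have hit a roadblock. Here is my code so far: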

climbingTngDataset = pd.DataFrame([])
climbingValDataset = pd.DataFrame([])
climbingTestDataset = pd.DataFrame([])

for i in range(len(dfClimbing)):
    if i % 2 == 0:
        climbingValDataset.append(i)
    if i % 3 == 0:
        climbingTestDataset.append(i)
    else:
        climbingTngDataset.append(i)

1 answer:

Answer 0 (score: 1)

Use groupby to split your DataFrame on the index modulo 3:

train, test, val = [
    g for _, g in dfClimbing.groupby(dfClimbing.index % 3)
]
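
The one-liner above relies on the DataFrame having a default integer index (0, 1, 2, ...) and on every remainder actually occurring, i.e. at least three rows. If the index is something else, such as labels or dates, a minimal sketch along the same lines is to group on row position instead. The sample data and the `names` mapping below are made up for illustration. The split order follows the question (first row to train, second to val, third to test).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfClimbing, with a non-numeric index.
dfClimbing = pd.DataFrame(
    {"grade": [5, 6, 7, 8, 9, 10, 11]},
    index=list("abcdefg"),
)

# Row position modulo 3: 0, 1, 2, 0, 1, 2, ...
position = np.arange(len(dfClimbing)) % 3

# Map each remainder to a split name (illustrative naming, not from the answer).
names = {0: "train", 1: "val", 2: "test"}
splits = {
    names[int(key)]: group
    for key, group in dfClimbing.groupby(position)
}

train, val, test = splits["train"], splits["val"], splits["test"]
```

Looking the splits up by name avoids depending on the order in which the groups are unpacked.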

Demo
(with two splits instead of three)

print(df)
   Record ID Para Tag
0          1    A   x
1          1    A   y
2          2    B   x
3          2    B   y
4          1    A   z

i, j = [g for _, g in df.groupby(df.index % 2)]

print(i)
   Record ID Para Tag
0          1    A   x
2          2    B   x
4          1    A   z

print(j)
   Record ID Para Tag
1          1    A   y
3          2    B   y
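
As background on why the loop in the question had no effect: `DataFrame.append` returned a new DataFrame rather than modifying the existing one (and was removed in pandas 2.0), and the loop passed the integer `i` rather than the row itself. As an alternative to groupby, the same interleaved split can be sketched with positional step slicing. The DataFrame below is rebuilt from the demo output, so the exact column names are an assumption.

```python
import pandas as pd

# Rebuilt from the demo output above; column names are assumed.
df = pd.DataFrame({
    "Record ID": [1, 1, 2, 2, 1],
    "Para": list("AABBA"),
    "Tag": list("xyxyz"),
})

# iloc slices by position, so this works for any index.
train = df.iloc[0::3]  # rows at positions 0, 3, ...
val = df.iloc[1::3]    # rows at positions 1, 4, ...
test = df.iloc[2::3]   # rows at positions 2, 5, ...
```

Unlike unpacking a groupby result, slicing returns an empty DataFrame for a split that receives no rows, instead of failing.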