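I'm trying to split a DataFrame into train, val, and test DataFrames based on row index, e.g. observation 1 goes to train, 2 goes to val, and 3 goes to test, but I've hit a roadblock. Here is my code so far: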
climbingTngDataset = pd.DataFrame([])
climbingValDataset = pd.DataFrame([])
climbingTestDataset = pd.DataFrame([])
for i in range(len(dfClimbing)):
    if i % 2 == 0:
        climbingValDataset.append(i)
    if i % 3 == 0:
        climbingTestDataset.append(i)
    else:
        climbingTngDataset.append(i)
Answer 0 (score: 1)
Use groupby to split your DataFrame:
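# Assumes the default 0..n-1 integer index.
# Remainder 0 -> train, 1 -> val, 2 -> test, matching the order asked for.
train, val, test = [
    g for _, g in dfClimbing.groupby(dfClimbing.index % 3)
]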
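Two things stop the loop in the question from working: `append(i)` passes the integer `i` rather than the row, and `DataFrame.append` returned a new frame instead of modifying the existing one (the method was removed in pandas 2.0). Below is a minimal sketch of the groupby approach on a made-up frame. The `grade` column and the letter index are hypothetical stand-ins for dfClimbing, and grouping on row position via `np.arange` is an assumption added here so the split also works when the index is not 0, 1, 2, ...

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfClimbing; the index labels are deliberately not 0..n-1.
dfClimbing = pd.DataFrame(
    {"grade": [5, 6, 7, 8, 9, 10, 11]},
    index=list("abcdefg"),
)

# Row position modulo 3: remainder 0 -> train, 1 -> val, 2 -> test.
position = np.arange(len(dfClimbing)) % 3

# groupby sorts the group keys, so the pieces come back in the order 0, 1, 2.
train, val, test = [g for _, g in dfClimbing.groupby(position)]

print(train.index.tolist())  # ['a', 'd', 'g']
print(val.index.tolist())    # ['b', 'e']
print(test.index.tolist())   # ['c', 'f']
```

Every row lands in exactly one of the three frames, so the lengths add up to `len(dfClimbing)`.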
Demo
(with two splits instead of three)
print(df)
   Record ID  Para  Tag
0          1     A    x
1          1     A    y
2          2     B    x
3          2     B    y
4          1     A    z

i, j = [g for _, g in df.groupby(df.index % 2)]

print(i)
   Record ID  Para  Tag
0          1     A    x
2          2     B    x
4          1     A    z

print(j)
   Record ID  Para  Tag
1          1     A    y
3          2     B    y
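As a hedged alternative to groupby (a different technique, not the one used in the answer), positional step slicing with `iloc` should give the same interleaved split without depending on the index labels. The frame below re-creates the demo data shown above.

```python
import pandas as pd

# Re-creation of the demo frame printed above.
df = pd.DataFrame({
    "Record ID": [1, 1, 2, 2, 1],
    "Para": ["A", "A", "B", "B", "A"],
    "Tag": ["x", "y", "x", "y", "z"],
})

# Every second row starting at position 0, then every second row starting at position 1.
i = df.iloc[0::2]
j = df.iloc[1::2]

print(i.index.tolist())  # [0, 2, 4]
print(j.index.tolist())  # [1, 3]
```

For a three-way split the same pattern would use a step of 3 with start offsets 0, 1, and 2.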