I have a pandas DataFrame that looks like this:
c1 c2 c3 c4 result
a b c d 1
b c d a 1
a e d b 1
g a f c 1
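But I want to randomly select 50% of the rows, swap the column order in those rows, and flip the result column from 1 to 0 (as shown below):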
c1 c2 c3 c4 result
a b c d 1
d a b c 0 (we swapped c3 and c4 with c1 and c2)
a e d b 1
f c g a 0 (we swapped c3 and c4 with c1 and c2)
What is the idiomatic way to do this?
Answer 0 (score: 1)
You have the general idea. Shuffle the DataFrame and split it in half. Then modify one half and join the two halves back together.
import numpy as np
import pandas as pd
np.random.seed(410112)
dfs = np.array_split(df.sample(frac=1), 2) # Shuffle then split in 1/2
# In one half, set result to 0 and swap c1/c2 with c3/c4
dfs[1] = dfs[1].assign(result=0)
dfs[1] = dfs[1].rename(columns={'c1': 'c3', 'c2': 'c4', 'c3': 'c1', 'c4': 'c2'})
# Join back; concat aligns on column names, sort_index restores the row order
df = pd.concat(dfs).sort_index()
c1 c2 c3 c4 result
0 a b c d 1
1 d a b c 0
2 d b a e 0
3 g a f c 1
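
If you would rather not shuffle and re-sort, the same effect can probably be had by picking half of the row labels at random and exchanging the values in place. This is only a sketch, not part of the answer above: the function name `swap_half` and its `seed` parameter are made up for illustration, and it assumes the index labels are unique and the columns are named as in the question.

```python
import numpy as np
import pandas as pd


def swap_half(df, seed=None):
    """Hypothetical helper: return a copy of df in which a random half of the
    rows has c1/c2 exchanged with c3/c4 and result set to 0."""
    rng = np.random.default_rng(seed)
    out = df.copy()
    # Pick floor(n / 2) distinct row labels (assumes unique index labels)
    picked = rng.choice(out.index.to_numpy(), size=len(out) // 2, replace=False)
    # .to_numpy() drops the column labels, so the values are assigned by
    # position instead of being re-aligned to their original columns
    out.loc[picked, ["c1", "c2", "c3", "c4"]] = (
        out.loc[picked, ["c3", "c4", "c1", "c2"]].to_numpy()
    )
    out.loc[picked, "result"] = 0
    return out


# Sample data from the question
df = pd.DataFrame(
    [list("abcd"), list("bcda"), list("aedb"), list("gafc")],
    columns=["c1", "c2", "c3", "c4"],
).assign(result=1)

swapped = swap_half(df, seed=0)
print(swapped)
```

Which rows get picked depends on the seed, so no particular output is shown here. The original `df` is left untouched because the function works on a copy.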