I have a DataFrame like this:
df
a b c d e f
0 Banana Orange Lychee Custardapples Jackfruit Pineapple
1 Apple Pear Strawberry Muskmelon Apricot Peach
2 Raspberry Cherry Plum Kiwi Mango Blackberry
I want to delete one randomly chosen value from each column.
For example:
           a       b           c              d          e           f
0     Banana  Orange              Custardapples  Jackfruit
1               Pear  Strawberry                   Apricot       Peach
2  Raspberry                Plum           Kiwi             Blackberry
Answer 0 (score: 3)
You can pick a random row position for each column and set that cell to "":
for i in range(df.shape[1]):
    df.iloc[np.random.randint(df.shape[0]), i] = ""
Full code:
import pandas as pd
import numpy as np
df = pd.read_clipboard()
print(df)
a b c d e f
0 Banana Orange Lychee Custardapples Jackfruit Pineapple
1 Apple Pear Strawberry Muskmelon Apricot Peach
2 Raspberry Cherry Plum Kiwi Mango Blackberry
A for loop over all columns:
for i in range(df.shape[1]):
    df.iloc[np.random.randint(df.shape[0]), i] = ""
           a       b       c              d          e           f
0             Orange  Lychee  Custardapples  Jackfruit   Pineapple
1      Apple                      Muskmelon    Apricot
2  Raspberry  Cherry    Plum                            Blackberry
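For anyone who wants to reproduce this without the clipboard, here is a minimal self-contained sketch of the same loop. The helper name blank_one_per_column and its fill and seed parameters are made up for illustration and are not part of pandas. It also works on a copy and uses a seeded NumPy generator, which is a choice of this sketch rather than something the answer requires.

```python
import numpy as np
import pandas as pd

# Rebuild the question's DataFrame explicitly instead of using read_clipboard().
df = pd.DataFrame({
    "a": ["Banana", "Apple", "Raspberry"],
    "b": ["Orange", "Pear", "Cherry"],
    "c": ["Lychee", "Strawberry", "Plum"],
    "d": ["Custardapples", "Muskmelon", "Kiwi"],
    "e": ["Jackfruit", "Apricot", "Mango"],
    "f": ["Pineapple", "Peach", "Blackberry"],
})


def blank_one_per_column(frame, fill="", seed=None):
    """Hypothetical helper: replace one random cell per column with `fill`."""
    rng = np.random.default_rng(seed)  # seeded generator for repeatable runs
    out = frame.copy()                 # leave the caller's DataFrame untouched
    for col_pos in range(out.shape[1]):
        row_pos = rng.integers(out.shape[0])  # random row position in [0, n_rows)
        out.iloc[row_pos, col_pos] = fill
    return out


result = blank_one_per_column(df, seed=0)
print(result)
```

Passing a different seed (or None) changes which cells are blanked; the original df is not modified.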
Answer 1 (score: 2)
Use the pandas built-in method Series.sample with n=1. I replaced the values with NaN, which is cleaner because pandas treats it as a missing value:
for col in df.columns:
    # np.nan rather than np.NaN: the np.NaN alias was removed in NumPy 2.0
    df.loc[df[col].sample(n=1).index, col] = np.nan
a b c d e f
0 NaN NaN Lychee Custardapples Jackfruit Pineapple
1 Apple Pear NaN Muskmelon Apricot Peach
2 Raspberry Cherry Plum NaN NaN NaN
If you actually want empty strings instead, replace the NaN value with ''.
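If the explicit loop bothers you, the same idea can be written without one. This is only a sketch of an alternative, not what the answer above does: it draws one random row position per column, builds a boolean array, and lets DataFrame.mask turn the marked cells into NaN. The function name nan_one_per_column and the seed parameter are hypothetical.

```python
import numpy as np
import pandas as pd


def nan_one_per_column(frame, seed=None):
    """Hypothetical helper: set one random cell per column to NaN, no Python loop."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = frame.shape
    rows = rng.integers(0, n_rows, size=n_cols)  # one row position per column
    hit = np.zeros(frame.shape, dtype=bool)
    hit[rows, np.arange(n_cols)] = True          # mark exactly one cell in each column
    return frame.mask(hit)                       # marked cells become NaN in a new frame


df = pd.DataFrame({
    "a": ["Banana", "Apple", "Raspberry"],
    "b": ["Orange", "Pear", "Cherry"],
    "c": ["Lychee", "Strawberry", "Plum"],
    "d": ["Custardapples", "Muskmelon", "Kiwi"],
    "e": ["Jackfruit", "Apricot", "Mango"],
    "f": ["Pineapple", "Peach", "Blackberry"],
})

result = nan_one_per_column(df, seed=42)
print(result)
```

DataFrame.mask returns a new DataFrame, so df itself keeps all of its values.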