Question

I have a df like this:

df

    a          b       c            d            e         f
0  Banana    Orange   Lychee     Custardapples Jackfruit  Pineapple
1   Apple    Pear   Strawberry   Muskmelon    Apricot    Peach
2  Raspberry Cherry  Plum           Kiwi        Mango   Blackberry

I want to randomly remove one value from each column.

For example:

        a          b       c            d            e         f
 0    Banana    Orange             Custardapples Jackfruit  
 1               Pear     Strawberry               Apricot    Peach
 2  Raspberry            Plum           Kiwi                Blackberry

Answer 1

You can pick a random row index for each column and set that cell to "":

for i in range(df.shape[1]):
    df.iloc[np.random.randint(df.shape[0]), i] = ""

Full code:

import pandas as pd
import numpy as np

df = pd.read_clipboard()
print(df)

           a       b           c              d          e           f
0     Banana  Orange      Lychee  Custardapples  Jackfruit   Pineapple
1      Apple    Pear  Strawberry      Muskmelon    Apricot       Peach
2  Raspberry  Cherry        Plum           Kiwi      Mango  Blackberry

for loop over all columns:

for i in range(df.shape[1]):
    df.iloc[np.random.randint(df.shape[0]), i] = ""

           a       b       c              d          e           f
0             Orange  Lychee  Custardapples  Jackfruit   Pineapple
1      Apple                      Muskmelon    Apricot            
2  Raspberry  Cherry    Plum                            Blackberry
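
For larger frames, the same idea can be sketched without a Python-level loop by drawing all row positions at once and using NumPy integer-array indexing. This is only an illustration under a few assumptions: the small df, the seed value, and the names rng, rows, cols, values and result are made up for the example, and the result is built as a new DataFrame rather than modifying df in place.

```python
import numpy as np
import pandas as pd

# Small stand-in for the df in the question (assumed sample data).
df = pd.DataFrame({
    "a": ["Banana", "Apple", "Raspberry"],
    "b": ["Orange", "Pear", "Cherry"],
    "c": ["Lychee", "Strawberry", "Plum"],
})

rng = np.random.default_rng(0)  # the seed is an assumption, used only for repeatability

# One random row position per column, plus the matching column positions.
rows = rng.integers(0, df.shape[0], size=df.shape[1])
cols = np.arange(df.shape[1])

# Work on an explicit object-dtype copy so the original df is left untouched.
values = df.to_numpy(dtype=object, copy=True)
values[rows, cols] = ""  # blanks exactly one cell in each column

result = pd.DataFrame(values, index=df.index, columns=df.columns)
print(result)
```

Each (row, column) pair is distinct because every column position appears once, so exactly one cell per column is blanked.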

Answer 2

Use the built-in pandas method Series.sample with n=1. I replaced the values with NaN, since that is cleaner:

for col in df.columns:
    # np.nan rather than np.NaN: the np.NaN alias was removed in NumPy 2.0
    df.loc[df[col].sample(n=1).index, col] = np.nan

           a       b       c              d          e          f
0        NaN     NaN  Lychee  Custardapples  Jackfruit  Pineapple
1      Apple    Pear     NaN      Muskmelon    Apricot      Peach
2  Raspberry  Cherry    Plum            NaN        NaN        NaN

If you actually want empty strings, replace np.nan with ''
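
If repeatable results are needed, the Series.sample approach can be wrapped in a small helper that takes a seed. The sketch below is an assumption-laden illustration, not part of the original answer: the function name drop_one_per_column and its seed parameter are hypothetical, it assumes the index labels are unique, and it returns a copy instead of modifying df in place.

```python
import numpy as np
import pandas as pd

def drop_one_per_column(df, seed=None):
    """Return a copy of df with one randomly chosen value per column set to NaN.

    Hypothetical helper for illustration; assumes df has unique index labels.
    """
    # A single RandomState is shared across columns so that its state advances
    # between calls and each column gets its own draw.
    rng = np.random.RandomState(seed)
    out = df.copy()
    for col in out.columns:
        idx = out[col].sample(n=1, random_state=rng).index
        out.loc[idx, col] = np.nan
    return out

# Small stand-in for the df in the question (assumed sample data).
df = pd.DataFrame({
    "a": ["Banana", "Apple", "Raspberry"],
    "b": ["Orange", "Pear", "Cherry"],
    "c": ["Lychee", "Strawberry", "Plum"],
})

out = drop_one_per_column(df, seed=0)
print(out)
```

Sharing one generator across the loop is a deliberate choice here: passing the same integer seed to every sample call would re-seed each time, which would likely pick the same row position in every column.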

Randomly remove one value from each column of a pandas DataFrame?
