Randomly remove one value from each column of a pandas DataFrame?

Time: 2019-12-21 17:40:14

Tags: python pandas dataframe

I have a df like this:

df

    a          b       c            d            e         f
0  Banana    Orange   Lychee     Custardapples Jackfruit  Pineapple
1   Apple    Pear   Strawberry   Muskmelon    Apricot    Peach
2  Raspberry Cherry  Plum           Kiwi        Mango   Blackberry

I want to randomly remove one value from each column.

For example:

        a          b       c            d            e         f
 0    Banana    Orange             Custardapples Jackfruit  
 1               Pear     Strawberry               Apricot    Peach
 2  Raspberry            Plum           Kiwi                Blackberry

2 answers:

Answer 0 (score: 3)

You can pick a random row position for each column and set that cell to "":

# For each column position i, overwrite one randomly chosen row with "".
for i in range(df.shape[1]):
    df.iloc[np.random.randint(df.shape[0]), i] = ""

Full code:

import pandas as pd
import numpy as np

df = pd.read_clipboard()
print(df)
           a       b           c              d          e           f
0     Banana  Orange      Lychee  Custardapples  Jackfruit   Pineapple
1      Apple    Pear  Strawberry      Muskmelon    Apricot       Peach
2  Raspberry  Cherry        Plum           Kiwi      Mango  Blackberry

A for loop over all columns:

for i in range(df.shape[1]):
    df.iloc[np.random.randint(df.shape[0]), i] = ""
           a       b       c              d          e           f
0             Orange  Lychee  Custardapples  Jackfruit   Pineapple
1      Apple                      Muskmelon    Apricot            
2  Raspberry  Cherry    Plum                            Blackberry
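
The same idea can be sketched without the clipboard and with a seed, so that a run can be repeated. This is a minimal sketch, not part of the original answer: the helper name `blank_one_per_column` and its `fill` and `seed` parameters are hypothetical, and it returns a copy instead of modifying df in place.

```python
import numpy as np
import pandas as pd

# Rebuild the question's table explicitly so the example does not depend on the clipboard.
df = pd.DataFrame({
    "a": ["Banana", "Apple", "Raspberry"],
    "b": ["Orange", "Pear", "Cherry"],
    "c": ["Lychee", "Strawberry", "Plum"],
    "d": ["Custardapples", "Muskmelon", "Kiwi"],
    "e": ["Jackfruit", "Apricot", "Mango"],
    "f": ["Pineapple", "Peach", "Blackberry"],
})


def blank_one_per_column(frame, fill="", seed=None):
    """Hypothetical helper: return a copy with one random cell per column set to `fill`."""
    rng = np.random.default_rng(seed)
    out = frame.copy()
    # Draw one row position per column in a single call.
    row_positions = rng.integers(0, out.shape[0], size=out.shape[1])
    for col_pos, row_pos in enumerate(row_positions):
        out.iloc[row_pos, col_pos] = fill
    return out


result = blank_one_per_column(df, seed=0)
print(result)
```

Passing the same seed gives the same blanked cells again; leaving `seed=None` gives a different pattern on each call.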

Answer 1 (score: 2)

Use the built-in pandas method Series.sample with n=1. I replaced the values with NaN because that is more elegant:

for col in df.columns:
    # Sample one row label from this column and set that cell to NaN.
    df.loc[df[col].sample(n=1).index, col] = np.nan

           a       b       c              d          e          f
0        NaN     NaN  Lychee  Custardapples  Jackfruit  Pineapple
1      Apple    Pear     NaN      Muskmelon    Apricot      Peach
2  Raspberry  Cherry    Plum            NaN        NaN        NaN

If you actually want blanks, assign '' instead of NaN.
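
A loop-free variant is also possible: build a boolean array with exactly one True per column and pass it to DataFrame.mask, which replaces the marked cells with NaN. This is only a sketch under assumptions, not the answerer's method. It swaps Series.sample for a seeded NumPy generator, and the names `rng`, `rows`, `hide` and `masked` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": ["Banana", "Apple", "Raspberry"],
    "b": ["Orange", "Pear", "Cherry"],
    "c": ["Lychee", "Strawberry", "Plum"],
    "d": ["Custardapples", "Muskmelon", "Kiwi"],
    "e": ["Jackfruit", "Apricot", "Mango"],
    "f": ["Pineapple", "Peach", "Blackberry"],
})

rng = np.random.default_rng(42)  # the seed is an arbitrary choice for repeatability
n_rows, n_cols = df.shape

# One random row position per column.
rows = rng.integers(0, n_rows, size=n_cols)

# Boolean array of the same shape as df, True at the cells to remove.
hide = np.zeros(df.shape, dtype=bool)
hide[rows, np.arange(n_cols)] = True

# mask() replaces cells where the condition is True (NaN by default); df is unchanged.
masked = df.mask(hide)
print(masked)
```

To get blanks instead of NaN, `df.mask(hide, "")` supplies the replacement value explicitly.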