Question

I am trying to delete some rows from a dataframe, but the "delete command" (.drop) does not work as expected. Does anybody know why?

My code:

    import pandas as pd

    def join():

        open_momox_xlsx = "momox_ergebnisse.xlsx"
        open_rebuy_xlsx = "rebuy_ergebnisse.xlsx"

        rebuy_xlsx = pd.read_excel(open_rebuy_xlsx)
        momox_xlsx = pd.read_excel(open_momox_xlsx)

        rebuy_data = rebuy_xlsx[['ReBuy']]
        isbn_data = rebuy_xlsx[['ISBN']]
        momox_data = momox_xlsx[['Momox']]

        dataframe = pd.DataFrame =({'ISBN': isbn_data, 'Rebuy': rebuy_data, 'Momox': momox_data})
        data = pd.concat(dataframe,axis=1, ignore_index=True)

        c=0
        #print(data[0])
        while c < len(data):

            if data[1][c] and data[2][c] == '///':
                data.drop(index=c)
            elif data[1][c] and data[2][c] < '1':
                data.drop(index=c)
            elif data[1][c] or data[2][c] < '1' and data[1][c] or data[2][c] == '///' :
                data.drop(index=c)
            c=c+1
        print(data)

Output:

                0      1      2
0   9783630876672  12,35   2.62
1   9783423282789  11,67   6.07
2   9783833879500  17,25  12.40
3   9783898798822   6,91   1.16
4   9783453281417  12,93   2.84
5   9783630876672  12,35   4.08
6   9783423282789  11,67   6.07
7   9783833879500  17,25   9.94
8   9783898798822   6,91   2.96
9   9783453281417  12,93   2.68
10     3927905909    ///    ///
11     3872948210    ///   0.15
12  9783293003781    ///   0.15
13  9783423246842    ///    ///
14  9783423247146    ///    ///
15  9783423246934    ///    ///
16     387294116x    ///    ///
17  9783935597456   0,16   0.15
18  9783423204545    ///    ///

Desired output:

                0      1      2
0   9783630876672  12,35   2.62
1   9783423282789  11,67   6.07
2   9783833879500  17,25  12.40
3   9783898798822   6,91   1.16
4   9783453281417  12,93   2.84
5   9783630876672  12,35   4.08
6   9783423282789  11,67   6.07
7   9783833879500  17,25   9.94
8   9783898798822   6,91   2.96
9   9783453281417  12,93   2.68

The if statements seem to work fine, but data.drop does not do what it should.

Answer 1

You should add inplace=True to the drop call. By default, drop returns a new DataFrame and leaves data unchanged.

Answer 2

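To explain the data.drop(index=c, inplace=True) suggestion a bit further: you can think of it as assigning the result back to the same variable, data = data.drop(index=c). Both should work, although I generally prefer the inplace=True form.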
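The difference between the two forms can be seen in a small sketch. The three-row frame below is hypothetical sample data made up for illustration, not the asker's spreadsheet, and the variable names are placeholders.

```python
import pandas as pd

# Hypothetical three-row frame standing in for the question's data.
data = pd.DataFrame({
    0: ["9783630876672", "3927905909", "9783935597456"],
    1: ["12,35", "///", "0,16"],
    2: ["2.62", "///", "0.15"],
})

# drop() returns a new DataFrame by default; `data` itself is untouched.
result = data.drop(index=1)
print(len(data), len(result))

# Option 1: keep the returned frame (or assign it back to the same name).
reassigned = data.drop(index=1)

# Option 2: modify a frame in place; the call itself then returns None.
in_place = data.copy()
returned = in_place.drop(index=1, inplace=True)
print(returned)
```

Separately, note that Python parses `a and b == '///'` as `a and (b == '///')`, so the conditions in the question may not test what they appear to test, even once the drop itself takes effect.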

Answer 3

Build a clean dataframe and keep only the values you want:

    import numpy as np

    data[['1', '2']] = (data[['1', '2']]
                        .replace({"///": np.nan, ",": "."}, regex=True)
                        .astype(float))
    data = data.loc[data[["1", "2"]].ge(1.).all(axis="columns")]

    >>> data
                   0      1      2
    0  9783630876672  12.35   2.62
    1  9783423282789  11.67   6.07
    2  9783833879500  17.25  12.40
    3  9783898798822   6.91   1.16
    4  9783453281417  12.93   2.84
    5  9783630876672  12.35   4.08
    6  9783423282789  11.67   6.07
    7  9783833879500  17.25   9.94
    8  9783898798822   6.91   2.96
    9  9783453281417  12.93   2.68

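Explanation:

First line:

data[['1', '2']] selects the columns named "1" and "2"
replace changes the existing values ('///' and ',') to new values (NaN and '.')
astype(float) converts your string columns to real numbers (floats), now that your dataframe has been cleaned.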
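As a variation on the cleaning step described above, here is a minimal sketch that uses pd.to_numeric with errors="coerce" instead of replace plus astype(float). This is a swapped-in technique, not the code from this answer. It assumes the columns carry the integer labels 1 and 2 (as they would after ignore_index=True in the question), and the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample; column 2 mixes floats and strings, as an Excel import might.
data = pd.DataFrame({
    0: ["9783630876672", "3927905909", "9783935597456"],
    1: ["12,35", "///", "0,16"],
    2: [2.62, "///", 0.15],
})

for col in [1, 2]:
    # Turn everything into text first so the decimal comma can be swapped for a dot.
    as_text = data[col].astype(str).str.replace(",", ".", regex=False)
    # Anything that is not a number (here '///') becomes NaN.
    data[col] = pd.to_numeric(as_text, errors="coerce")

print(data.dtypes)
```

With errors="coerce", the '///' placeholder does not have to be listed explicitly, at the cost of silently turning any other unparseable value into NaN as well.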

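Second line:

data.loc[...] selects rows in your dataframe
data[["1", "2"]].ge(1.).all(axis="columns"): in columns "1" and "2", look for values "greater than or equal to 1", and this must be true for "all columns" of the row.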
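The filtering step can be broken into its intermediate pieces, roughly like this. The frame below is hypothetical, already-cleaned sample data, and it assumes integer column labels 1 and 2; the names at_least_one, keep and filtered are placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical frame after cleaning: prices are floats, missing ones are NaN.
data = pd.DataFrame({
    0: ["9783630876672", "3872948210", "9783935597456", "3927905909"],
    1: [12.35, np.nan, 0.16, np.nan],
    2: [2.62, 0.15, 0.15, np.nan],
})

# Element-wise "greater than or equal to 1"; NaN compares as False.
at_least_one = data[[1, 2]].ge(1.0)

# A row is kept only if the test is True in every selected column.
keep = at_least_one.all(axis="columns")

# Boolean indexing with .loc returns just the rows flagged True.
filtered = data.loc[keep]
print(filtered)
```

Because NaN fails the comparison, rows that originally held '///' drop out in the same step as rows with prices below 1.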

Python - Delete rows from a dataframe (pandas)
