Question

df = mdb.read_table(mdbfile, "table")
invoices = pd.read_csv(file, delimiter=';')

lst = df[(df['El4'] == el4)] #contains specific rows of df

for i, row in lst.iterrows():
    prop = row['propertyid']
    mouvement = (row['Mouvements']*-1)

    a = invoices[(invoices['propertyReference'] == prop) & (invoices.invoiceGrossAmount == mouvement)]
    invoiceid = a['invoiceId'].values

    mouvement = (mouvement*-1)

    if df[(df.propertyid == prop) & (df.Mouvements == mouvement)]:
        df['id'] = invoiceid

I get the following error:

The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

I want to fill in a specific value (invoiceId) in the rows of the dataframe where propertyid equals prop and Mouvements equals mouvement.

Answer 1

Following up on my comment: it looks like you just want to do a join (or, in pandas terms, a merge).

Let's start with your source data:

df = mdb.read_table(mdbfile, "table")
invoices = pd.read_csv(file, delimiter=';')

From here, we want to join the data:

df = df.merge(invoices, how='left', left_on=['propertyid', 'Mouvements'], right_on=['propertyReference', 'invoiceGrossAmount'])

In this join, I'm assuming that 'propertyid' in df matches 'propertyReference' in invoices, and that 'Mouvements' in df matches 'invoiceGrossAmount' in invoices. You can adjust as needed.

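We're using a left join because, when no match is found in invoices, we want the original rows of df to be kept with null values in the invoices columns, rather than having those rows dropped (which is what would happen if we used how='inner' instead).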
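As a rough illustration of the left join described above, here is a minimal sketch on made-up data. The toy values and the helper column `amount_key` are assumptions for the example; the sign flip mirrors the `*-1` in the question's loop and can be dropped if the amounts already carry the same sign.

```python
import pandas as pd

# Toy data; the values are invented for illustration only.
df = pd.DataFrame({
    "propertyid": [1, 2, 3],
    "Mouvements": [-100.0, -250.0, -75.0],
})
invoices = pd.DataFrame({
    "propertyReference": [1, 2],
    "invoiceGrossAmount": [100.0, 250.0],
    "invoiceId": ["INV-1", "INV-2"],
})

# Assumption: the amounts have opposite signs, as in the question's loop,
# so join on a sign-flipped helper column (hypothetical name).
df["amount_key"] = df["Mouvements"] * -1

merged = df.merge(
    invoices,
    how="left",
    left_on=["propertyid", "amount_key"],
    right_on=["propertyReference", "invoiceGrossAmount"],
)

# Every row of df is kept; rows without a matching invoice get NaN in invoiceId.
print(merged[["propertyid", "Mouvements", "invoiceId"]])
```

With how='inner' instead, the third row (no matching invoice) would be dropped from the result.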

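There's no need to use a for loop this way. I remember reading somewhere that if you find yourself writing a loop with pandas, there's a good chance the built-in pandas methods offer a better way.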
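For reference, the error in the question comes from putting a filtered DataFrame directly in an `if`. The sketch below, on made-up data, reproduces that and shows one loop-free way to write a value into only the matching rows with a boolean mask and `.loc`. The lookup values and `invoice_id` are hypothetical placeholders.

```python
import pandas as pd

df = pd.DataFrame({
    "propertyid": [1, 2, 3],
    "Mouvements": [-100.0, -250.0, -75.0],
})

prop, mouvement = 2, -250.0  # hypothetical lookup values
invoice_id = "INV-2"         # hypothetical invoice id

# Boolean mask selecting the rows that match both conditions.
mask = (df["propertyid"] == prop) & (df["Mouvements"] == mouvement)

# A filtered DataFrame has no single truth value, so `if df[mask]:` raises.
try:
    if df[mask]:
        pass
except ValueError as exc:
    error_message = str(exc)

# Asking an explicit question (.empty) works, and .loc writes only to the
# rows selected by the mask; the other rows get NaN in the new column.
if not df[mask].empty:
    df.loc[mask, "id"] = invoice_id
```

Note that the question's `df['id'] = invoiceid` would assign to the whole column, not just the matching rows, which is why the mask is passed to `.loc` here.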

Answer 2

Another idea is to use the combine_first method.

To use this method, you need to make sure both dataframes are indexed the same way. Something like:

# It's not clear whether the sign on Mouvements needs to be flipped to match invoices. If not, comment out the line below.
df.loc[:, 'Mouvements'] = df.loc[:, 'Mouvements'] * -1
# set_index needs a list to build a multi-column index
df = df.set_index(['propertyid', 'Mouvements'])
invoices = invoices.set_index(['propertyReference', 'invoiceGrossAmount'])
df = df.combine_first(invoices)

This is very similar to the merge approach suggested by @RagingRoosevelt.
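To make the combine_first idea concrete, here is a small sketch on made-up data. The toy values, the `El4` contents, and the shared key name `amount` are assumptions for the example; both frames are given the same index level names before combining.

```python
import pandas as pd

# Toy data; the values are invented for illustration only.
df = pd.DataFrame({
    "propertyid": [1, 2, 3],
    "Mouvements": [-100.0, -250.0, -75.0],
    "El4": ["a", "b", "c"],
})
invoices = pd.DataFrame({
    "propertyReference": [1, 2],
    "invoiceGrossAmount": [100.0, 250.0],
    "invoiceId": ["INV-1", "INV-2"],
})

keys = ["propertyid", "amount"]  # "amount" is a hypothetical shared key name

# Index both frames on the same pair of keys (sign-flipped amount on the df side).
left = df.assign(amount=df["Mouvements"] * -1).set_index(keys)
right = invoices.rename(
    columns={"propertyReference": "propertyid", "invoiceGrossAmount": "amount"}
).set_index(keys)

# combine_first aligns on the index and fills what is missing in `left` from `right`.
combined = left.combine_first(right)
print(combined)
```

One difference from the left merge: combine_first keeps the union of both indexes, so an invoice with no matching row in df would also show up in the result.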

How to filter a dataframe by multiple columns and add a value

2 answers: