Question

I searched for my exact problem to no avail. These two threads, "Creating a new column based on if-elif-else condition" and "create new pandas dataframe column based on if-else condition with a lookup", guided my code, although my code fails to execute.

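Problem: I have a dataframe, which I have reproduced in the example below. The region attribute has only two values, a or b (there might be more), and the same goes for year, although region a can appear with both years, and so on. What I want to do is create a new column, "dollars", by looking up the region value: if the region is "a" AND the year is, e.g., 2006, take the sales in that row, multiply them by that year's rate, and store the result in the new column. I am a beginner, and below is what I have so far, written as a function. Executing it with .apply returns ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index 0'). I am particularly interested in a more efficient implementation, because the dataframe is rather large and I would like to optimize computational efficiency.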

import pandas as pd

rate_2006, rate_2007 = 100, 200


c = {
'region': ["a", "a", "a", "a", "a", "b", "b", "b", "b", "a", "b"],
'year': [2006, 2007, 2007, 2006, 2006, 2006, 2007, 2007, 2007, 2006, 2007],
'sales': [500, 100, 2990, 15, 5000, 2000, 150, 300, 250, 1005, 600]
}

df1 = pd.DataFrame(c)
df1

def new_col(row): 
    if df1["region"] == "a" and df1["year"] == 2006:
        nc = row["sales"] * rate_2006
    elif df1["region"] == "a" and df1["year"] == 2007:
        nc = row["sales"] * rate_2007
    elif df1["region"] == "b" and df1["year"] == 2006:
        nc = row["sales"] * rate_2006
    else:
        nc = row["sales"] * rate_2007
    return nc

df1["Dollars"] = df1.apply(new_col, axis=1)
df1
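
Note: the ValueError quoted above is consistent with the conditions testing whole columns (df1["region"]) rather than the current row. A minimal sketch of a row-wise version follows, assuming the intent is to test each row's own region and year; it reuses the sample data and rates from the question and is not necessarily the most efficient form.

```python
import pandas as pd

rate_2006, rate_2007 = 100, 200

df1 = pd.DataFrame({
    "region": ["a", "a", "a", "a", "a", "b", "b", "b", "b", "a", "b"],
    "year": [2006, 2007, 2007, 2006, 2006, 2006, 2007, 2007, 2007, 2006, 2007],
    "sales": [500, 100, 2990, 15, 5000, 2000, 150, 300, 250, 1005, 600],
})


def new_col(row):
    # With axis=1, `row` is a single row, so row["region"] and row["year"]
    # are scalars and can be combined with a plain `and`.
    if row["region"] == "a" and row["year"] == 2006:
        return row["sales"] * rate_2006
    elif row["region"] == "a" and row["year"] == 2007:
        return row["sales"] * rate_2007
    elif row["region"] == "b" and row["year"] == 2006:
        return row["sales"] * rate_2006
    else:
        return row["sales"] * rate_2007


df1["Dollars"] = df1.apply(new_col, axis=1)
print(df1)
```

By contrast, df1["region"] == "a" produces one boolean per row, which is why `if` cannot reduce it to a single True or False.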

Answer 1

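The problem is probably caused by the way the columns are used inside the function: df1["region"] == "a" compares the whole column and yields a boolean Series, which `and` cannot reduce to a single True or False. I am not sure whether this will help you, but I rewrote the code as far as my knowledge goes, so that each row is looked up by its index label.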

import pandas as pd

rate_2006, rate_2007 = 100, 200


c = {
'region': ["a", "a", "a", "a", "a", "b", "b", "b", "b", "a", "b"],
'year': [2006, 2007, 2007, 2006, 2006, 2006, 2007, 2007, 2007, 2006, 2007],
'sales': [500, 100, 2990, 15, 5000, 2000, 150, 300, 250, 1005, 600]
}

df1 = pd.DataFrame(c)
print(df1)

def new_col(value): 
    if df1.loc[value,"region"] == "a" and df1.loc[value,"year"] == 2006:
        df1.loc[value,"Dollars"] = df1.loc[value,"sales"] * rate_2006
    elif df1.loc[value,"region"] == "a" and df1.loc[value,"year"] == 2007:
        df1.loc[value,"Dollars"] = df1.loc[value,"sales"] * rate_2007
    elif df1.loc[value,"region"] == "b" and df1.loc[value,"year"] == 2006:
        df1.loc[value,"Dollars"] = df1.loc[value,"sales"] * rate_2006
    else:
        df1.loc[value,"Dollars"] = df1.loc[value,"sales"] * rate_2007

for value in range(len(df1)):
    new_col(value)
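
Since the question asks about efficiency, here is a hedged sketch of two column-wise alternatives to the row-by-row loop. The first relies on an observation about the branches above: in this example the rate depends only on the year, so a year-to-rate dictionary (the name rates_by_year is my own) is enough. The second uses numpy.select for the case where the rate really depends on both region and year; the extra column name Dollars_select is only for illustration.

```python
import numpy as np
import pandas as pd

rate_2006, rate_2007 = 100, 200

df1 = pd.DataFrame({
    "region": ["a", "a", "a", "a", "a", "b", "b", "b", "b", "a", "b"],
    "year": [2006, 2007, 2007, 2006, 2006, 2006, 2007, 2007, 2007, 2006, 2007],
    "sales": [500, 100, 2990, 15, 5000, 2000, 150, 300, 250, 1005, 600],
})

# Option 1: map each year to its rate, then multiply whole columns.
# A year missing from the dictionary would become NaN here, unlike the
# `else` branch above, which falls back to rate_2007.
rates_by_year = {2006: rate_2006, 2007: rate_2007}
df1["Dollars"] = df1["sales"] * df1["year"].map(rates_by_year)

# Option 2: one boolean mask per branch, combined with & instead of `and`.
conditions = [
    (df1["region"] == "a") & (df1["year"] == 2006),
    (df1["region"] == "a") & (df1["year"] == 2007),
    (df1["region"] == "b") & (df1["year"] == 2006),
]
choices = [rate_2006, rate_2007, rate_2006]
# Rows matching none of the conditions get the default, like the `else` branch.
rate = np.select(conditions, choices, default=rate_2007)
df1["Dollars_select"] = df1["sales"] * rate

print(df1)
```

Both options avoid calling a Python function once per row, which is usually the main cost on a large dataframe. I have not benchmarked them on your data.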

Creating a new column in a pandas dataframe based on if / elif / and with a function
