I have a list of Rule objects, each of which has an evaluate function that takes a string and returns a boolean.
I want to do the following:
r = Rule()
df["Rules"] = df["Words"].apply(lambda x: r.name if r.evaluate(x) else None)
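But I have a list of several hundred rules. I want to apply this kind of function for each of those rules, join the resulting r.name values with commas, and use that as my df["Rules"]. Does anyone know a concise way to do this?
This is all I have come up with: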
for rule in rules:
    df["Rules"] += df["Words"].apply(lambda x: rule.name if rule.evaluate(x) else "")
Desired output:
df["Rules"]
Rule 1, Rule 5, Rule 8
Rule 3, Rule 5, Rule 6
Rule 2
nan
Rule 4, Rule 7
Name: Rules, Length: 5, dtype: object
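For reference, here is a minimal sketch of how output in this shape could be produced in a single pass over the column. The Rule class below is a hypothetical stand-in (it matches on a keyword) and matching_rule_names is a helper name made up for illustration; only name and evaluate come from the description above. It also assumes rows with no matching rule should end up as NaN.

```python
import numpy as np
import pandas as pd


class Rule:
    """Hypothetical stand-in: matches when `keyword` occurs in the text."""

    def __init__(self, name, keyword):
        self.name = name
        self.keyword = keyword

    def evaluate(self, text):
        return self.keyword in text


def matching_rule_names(text, rules):
    """Join the names of all rules that match `text`; NaN if none match."""
    names = [rule.name for rule in rules if rule.evaluate(text)]
    return ", ".join(names) if names else np.nan


rules = [Rule("Rule 1", "apple"), Rule("Rule 2", "pie"), Rule("Rule 3", "kiwi")]
df = pd.DataFrame({"Words": ["apple pie", "kiwi", "plum"]})

# Extra positional arguments for the function are passed through `args`.
df["Rules"] = df["Words"].apply(matching_rule_names, args=(rules,))
```

Filtering inside the list comprehension, rather than emitting "" for non-matches, avoids stray commas in the joined string.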
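I'm leaving my final code here so that if someone later comes up with a better solution (I'm sure there is one, because this takes forever), they can see it: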
for i, account in enumerate(accounts):
    cat_rules = CatRuleController.account_cat_rule(account).to_records
    if len(cat_rules) <= 1:
        continue
    rules = [Rule(c[1], ' '.join(list(c)[2:]).rstrip()) for c in cat_rules]
    df["Rule_{}".format(i)] = df["Words"].apply(lambda x: ','.join([cr.name if cr.evaluate(x) else ''
                                                                    for cr in rules]))
rules_cols = [col for col in df if col.startswith('Rule_')]
df["Rules"] = df[rules_cols].apply(lambda row: ",".join(row.to_list()), axis=1)
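Since the complaint is that this takes forever, one possible speed-up is to evaluate the rules once per distinct value instead of once per row. The sketch below assumes that df["Words"] contains many repeated values and that evaluate always returns the same result for the same input; neither is confirmed above. The Rule class is a hypothetical stand-in that also counts calls, just to make the saving visible.

```python
import pandas as pd


class Rule:
    """Hypothetical stand-in for the real Rule class; counts evaluate calls."""

    calls = 0

    def __init__(self, name, keyword):
        self.name = name
        self.keyword = keyword

    def evaluate(self, text):
        Rule.calls += 1
        return self.keyword in text


rules = [Rule("Rule 1", "apple"), Rule("Rule 2", "pie")]
df = pd.DataFrame({"Words": ["apple pie", "plum", "apple pie", "plum", "apple pie"]})

# Evaluate every rule once per distinct value, not once per row.
lookup = {
    word: ",".join(rule.name for rule in rules if rule.evaluate(word))
    for word in df["Words"].unique()
}

# Map each row's value to its precomputed, comma-joined rule names.
df["Rules"] = df["Words"].map(lookup)
```

Here the rules run for 2 distinct values rather than 5 rows. If nearly every value is unique, this gives no benefit.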
Answer 0: (score: 0)
I know it is hard to give us a sample and the desired output, but I hope the following gives you an idea:
import numpy as np
import pandas as pd

# First create a DataFrame of the appropriate size with all values np.nan, so that if something goes wrong we know
df_2 = pd.DataFrame(np.nan, index=df.index, columns=['Rule_' + str(idx) for idx in range(len(rules))])
# Go over the rules and populate each column Rule_<idx>; note I used "" instead of None to be able to use str.join later
for idx, rule in enumerate(rules):
    df_2['Rule_' + str(idx)] = df["Words"].apply(lambda x: rule.name if rule.evaluate(x) else "")
# Concatenate over each row with a comma as delimiter, skipping the empty strings
df["Rules"] = df_2.apply(lambda row: ",".join(name for name in row if name), axis=1)
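A variation on the same one-column-per-rule idea, as a rough sketch: store booleans instead of names, label each column with the rule's name, and then pick the labels of the True cells per row. The Rule class here is again a hypothetical stand-in, and this assumes rule names are unique so they can serve as column labels.

```python
import pandas as pd


class Rule:
    """Hypothetical stand-in for the real Rule class."""

    def __init__(self, name, keyword):
        self.name = name
        self.keyword = keyword

    def evaluate(self, text):
        return self.keyword in text


rules = [Rule("Rule 1", "a"), Rule("Rule 2", "b"), Rule("Rule 3", "c")]
df = pd.DataFrame({"Words": ["ab", "c", "xyz"]})

# One boolean column per rule: True where that rule matches the row's word.
matches = pd.DataFrame(
    {rule.name: df["Words"].map(rule.evaluate) for rule in rules},
    index=df.index,
)

# For each row, keep the column labels (rule names) whose value is True.
df["Rules"] = matches.apply(
    lambda row: ",".join(matches.columns[row.to_numpy(dtype=bool)]), axis=1
)
```

Keeping the boolean frame around can be handy for debugging, since it shows exactly which rule fired for which row.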