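My dataset has some interesting columns that I want to aggregate, so I am creating a metric that I can use for further analysis.
The algorithm I wrote takes about 3 seconds to finish, so I am wondering whether there is a more efficient way to do this.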
def financial_score_calculation(df, dictionary_of_parameters):
    for parameter in dictionary_of_parameters:
        for target in dictionary_of_parameters[parameter]['target']:
            # labels of the rows whose answer equals this target value
            index = df.loc[df[parameter] == target].index
            for i in index:
                old_score = df.at[i, 'financialliteracyscore']
                new_score = old_score + dictionary_of_parameters[parameter]['score']
                df.at[i, 'financialliteracyscore'] = new_score
    for i in df.index:
        old_score = df.at[i, 'financialliteracyscore']
        new_score = (old_score / 27.0) * 100  # convert score to a percentage
        df.at[i, 'financialliteracyscore'] = new_score
    return df

Here is a truncated version of dictionary_of_parameters:

dictionary_of_parameters = {
    # money management parameters
    "SatisfactionLevelCurrentFinances": {'target': [8, 9, 10], 'score': 1},
    "WillingnessFinancialRisk": {'target': [8, 9, 10], 'score': 1},
    "ConfidenceLevelToEarn2000WithinMonth": {'target': [1], 'score': 1},
    "DegreeOfWorryAboutRetirement": {'target': [1], 'score': 1},
    "GoodWithYourMoney?": {'target': [7], 'score': 1}
}

Edit: generating toy data for df
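For anyone who wants to reproduce the timing, toy data along the following lines should be enough. This is only a sketch: the helper name make_toy_df, the row count, the seed, and the assumption that every answer is an integer from 1 to 10 are all made up for illustration and are not the real dataset. The score column starts as float zeros so that writing percentages into it later does not change its dtype.

```python
import numpy as np
import pandas as pd


def make_toy_df(columns, n_rows=1000, seed=0):
    """Build random integer answers in 1..10 plus a zeroed score column.

    Hypothetical helper: the value range and size are assumptions.
    """
    rng = np.random.default_rng(seed)
    # integers(1, 11) draws from 1..10 inclusive (upper bound is exclusive)
    data = {col: rng.integers(1, 11, size=n_rows) for col in columns}
    df = pd.DataFrame(data)
    df["financialliteracyscore"] = 0.0
    return df


columns = [
    "SatisfactionLevelCurrentFinances",
    "WillingnessFinancialRisk",
    "ConfidenceLevelToEarn2000WithinMonth",
    "DegreeOfWorryAboutRetirement",
    "GoodWithYourMoney?",
]
df = make_toy_df(columns)
```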
Answer 0 (score: 0)
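Note that in pandas you are not limited to element-by-element indexing with at. In the four lines of code below, index holds the labels of all matching rows and can be used to index with loc, so each parameter is handled in a single vectorized update.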
for parameter in dictionary_of_parameters:
    index = df[df[parameter].isin(dictionary_of_parameters[parameter]['target'])].index
    df.loc[index, 'financialliteracyscore'] += dictionary_of_parameters[parameter]['score']
df['financialliteracyscore'] = df['financialliteracyscore'] / 27.0 * 100
Here is the reference, although personally I never found it useful early on in my programming: https://pandas.pydata.org/pandas-docs/stable/indexing.html
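The same idea can be wrapped in a function and checked on a tiny frame. The following is a minimal sketch, not a drop-in replacement: the name financial_score_vectorized and the max_score argument are hypothetical, and unlike the code above it works on a copy and uses the boolean mask from isin directly instead of going through index and loc.

```python
import pandas as pd


def financial_score_vectorized(df, dictionary_of_parameters, max_score=27.0):
    """Return a copy of df with the score computed without row loops."""
    out = df.copy()
    score = out["financialliteracyscore"].astype(float)
    for parameter, spec in dictionary_of_parameters.items():
        # True where the answer is one of the target values
        mask = out[parameter].isin(spec["target"])
        # True/False act as 1/0, so matching rows gain spec["score"]
        score = score + mask * spec["score"]
    # convert the raw score to a percentage of max_score
    out["financialliteracyscore"] = score / max_score * 100
    return out


params = {
    "SatisfactionLevelCurrentFinances": {"target": [8, 9, 10], "score": 1},
    "DegreeOfWorryAboutRetirement": {"target": [1], "score": 1},
}
small_df = pd.DataFrame({
    "SatisfactionLevelCurrentFinances": [9, 3, 10],
    "DegreeOfWorryAboutRetirement": [1, 1, 5],
    "financialliteracyscore": [0.0, 0.0, 0.0],
})
result = financial_score_vectorized(small_df, params)
```

In this small example the first row matches both parameters and the other two rows match one each, so the raw scores before the percentage conversion are 2, 1 and 1. The loop now runs once per parameter rather than once per matching row, which is where most of the time should be saved.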