Create a new column based on a dictionary's keys?

Time: 2020-04-22 21:25:02

Tags: python pandas string-literals data-wrangling

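I'm trying to create a new column in a DataFrame inside a for loop over a dictionary's items, using an f-string that includes the key, but it raises the error "ValueError: cannot set a frame with no defined index and a scalar".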

Dictionary definition for the expense categories:

  d = {'Travel & Entertainment': [1,2,3,4,5,6,7,8,9,10,11], 'Office supplies & Expenses': [13,14,15,16,17],
    'Professional Fees':[19,20,21,22,23], 'Fees & Assessments':[25,26,27], 'IT Expenses':[29],
    'Bad Debt Expense':[31],'Miscellaneous expenses': [33,34,35,36,37],'Marketing Expenses':[40,41,42],
    'Payroll & Related Expenses': [45,46,47,48,49,50,51,52,53,54,55,56], 'Total Utilities':[59,60],
    'Total Equipment Maint, & Rental Expense': [63,64,65,66,67,68],'Total Mill Expense':[70,71,72,73,74,75,76,77],
    'Total Taxes':[80,81],'Total Insurance Expense':[83,84,85],'Incentive Compensation':[88],
    'Strategic Initiative':[89]}

Creating a new DataFrame based on the main DataFrame:

mcon = VA.loc[:,['Expense', 'Mgrl', 'Exp Category', 'Parent Category']]
mcon.loc[:,'Variance Type'] = ['Unfavorable' if x < 0 else 'favorable' for x in mcon['Mgrl']]
mcon.loc[:,'Business Unit'] = 'Managerial Consolidation'
mcon = mcon[['Business Unit', 'Exp Category','Parent Category', 'Expense', 'Mgrl', 'Variance Type']]
mcon.rename(columns={'Mgrl':'Variance'}, inplace=True)

Creating a new DataFrame that will eventually be written to Excel:

a1 = pd.DataFrame() 
for key, value in d.items():
    umconm = mcon.iloc[value].query('Variance < 0').nsmallest(5, 'Variance')
    fmconm = mcon.iloc[value].query('Variance > 0').nlargest(5, 'Variance')
    if umconm.empty == False or fmconm.empty == False:
        a1 = pd.concat([a1,umconm,fmconm], ignore_index = True)
    else:
        continue
a1.to_csv('example.csv', index = False)

The output looks like this (the original post showed a screenshot here).

I'm trying to add a new column that says "Lower/Higher than budgeted {key}", where key is the expense type, using the following code:

for key, value in d.items():
    umconm = mcon.iloc[value].query('Variance < 0').nsmallest(5, 'Variance')
    umconm.loc[:,'Explanation'] = f'Lower than budgeted {key}'
    fmconm = mcon.iloc[value].query('Variance > 0').nlargest(5, 'Variance')
    fmconm.loc[:,'Explanation'] = f'Higher than budgeted {key}'
    if umconm.empty == False or fmconm.empty == False:
        a1 = pd.concat([a1,umconm,fmconm], ignore_index = True)
    else:
        continue

But using the f-string as above gives me the error message "ValueError: cannot set a frame with no defined index and a scalar".

I'd really appreciate any help fixing this, or another way to add this field to my DataFrame. Thanks in advance!

2 answers:

Answer 0 (score: 3)

This error occurs because this line

umconm = mcon.iloc[value].query('Variance < 0').nsmallest(5, 'Variance')

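sometimes produces an empty DataFrame with no index entries, and .loc cannot set a scalar on such a frame. To set the column, use plain column assignment instead of .loc (here a stands for the DataFrame, e.g. umconm):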

a['Explanation'] = f'Lower than budgeted {key}'
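A minimal sketch of the difference, assuming a toy frame whose filter matches no rows (the values and the 'Travel' label are made up for illustration). The .loc call is wrapped in try/except because whether it raises may depend on the pandas version; the plain assignment should work either way:

```python
import pandas as pd

# Toy data (hypothetical): no negative variances, so the filter returns zero rows.
toy = pd.DataFrame({'Expense': ['Airfare', 'Hotels'], 'Variance': [1.0, 2.0]})
key = 'Travel'

# Attempt 1: .loc with a scalar on a frame that has columns but no rows.
empty_loc = toy.query('Variance < 0').copy()
try:
    empty_loc.loc[:, 'Explanation'] = f'Lower than budgeted {key}'
    loc_error = None
except ValueError as err:
    loc_error = str(err)  # the message reported in the question, where raised

# Attempt 2: plain column assignment just adds an empty column.
empty_plain = toy.query('Variance < 0').copy()
empty_plain['Explanation'] = f'Lower than budgeted {key}'

print(loc_error)
print(list(empty_plain.columns))
```

The .copy() calls are only there to keep the toy frames independent of toy; they are not part of the fix.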

Answer 1 (score: 0)

Silly me, the solution is as follows:

for key, value in d.items():
    umconm = mcon.iloc[value].query('Variance < 0').nsmallest(5, 'Variance')
    umconm['Explanation'] = f'Higher than Budget for {key}'
    fmconm = mcon.iloc[value].query('Variance > 0').nlargest(5, 'Variance')
    fmconm['Explanation'] = f'Lower than Budget for {key}'
    if umconm.empty == False or fmconm.empty == False:
        a1 = pd.concat([a1,umconm,fmconm], ignore_index = True)
    else:
        continue

I didn't need to use .loc when creating a new column in this DataFrame!
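For what it's worth, here is one possible variation on the same loop, not a definitive version. It uses DataFrame.assign to add the column (which also works on an empty frame) and collects the pieces in a list so pd.concat runs once. The function name top_variances, the sample data, and the label wording are assumptions for illustration only:

```python
import pandas as pd

def top_variances(mcon, groups, n=5):
    """Collect the n most negative and n most positive variances per group.

    `groups` maps a category name to a list of row positions, like d above.
    """
    pieces = []
    for key, rows in groups.items():
        subset = mcon.iloc[rows]
        under = subset[subset['Variance'] < 0].nsmallest(n, 'Variance')
        over = subset[subset['Variance'] > 0].nlargest(n, 'Variance')
        # assign returns a new frame, so an empty selection is not a problem.
        pieces.append(under.assign(Explanation=f'Lower than budgeted {key}'))
        pieces.append(over.assign(Explanation=f'Higher than budgeted {key}'))

    # Drop empty pieces, then concatenate once instead of on every iteration.
    pieces = [p for p in pieces if not p.empty]
    if not pieces:
        return pd.DataFrame(columns=[*mcon.columns, 'Explanation'])
    return pd.concat(pieces, ignore_index=True)

# Hypothetical sample data standing in for mcon and d.
sample = pd.DataFrame({
    'Expense': ['Airfare', 'Hotels', 'Paper', 'Ink'],
    'Variance': [-10.0, 5.0, 3.0, 0.0],
})
sample_groups = {'Travel': [0, 1], 'Office': [2, 3]}

result = top_variances(sample, sample_groups)
print(result)
```

The 'Office' group has no negative variances, so its "Lower than budgeted" piece is empty and is simply skipped.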