Background: I have a pandas DataFrame, scaledData, which is just a standard df of information, as shown below:
COL NAME0 COL NAME1 ... COL NAME3 COL NAME4
0 Alabama 4.099099 ... 2.042345 1.392755
1 Alaska 1.396396 ... 1.000000 1.000000
2 Arizona 4.189189 ... 2.003257 1.537777
3 Arkansas 2.927928 ... 2.208723 1.007370
4 California 3.378378 ... 1.754930 2.012395
5 Colorado 3.378378 ... 3.282196 2.843435
6 Connecticut 5.000000 ... 1.452587 4.277286
7 Delaware 4.409692 ... 2.134501 1.970434
8 District of Columbia 5.000000 ... 1.000000 1.000000
9 Florida 4.628118 ... 1.806412 2.213038
10 Georgia 4.628118 ... 1.513896 2.748559
11 Hawaii 3.902494 ... 2.891694 3.872309
12 Idaho 1.090703 ... 2.978469 4.127419
13 Illinois 4.537415 ... 1.242970 1.888353
14 Indiana 4.537415 ... 2.368881 2.307914
15 Iowa 2.088435 ... 3.298368 3.421122
16 Kansas 2.723356 ... 2.791375 2.160330
17 Kentucky 3.902494 ... 1.692890 4.133744
18 Louisiana 2.451247 ... 1.000000 1.000000
19 Maine 3.448980 ... 2.535328 5.000000
20 Maryland 5.000000 ... 1.632194 1.046567
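I want to create another column in this df, Total, which is the result of adding up all the column values for each state (COL NAME0) and dividing by the sum of the weights in the weights dictionary. In addition, a column E should perform the same totaling operation, but only over the columns tagged 'E'. The keys of the weights dictionary are the df's column names, and each value is a tuple containing the column's weight (used earlier, but not relevant to this question) and the category the column belongs to. Here is my current implementation: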
weights = {'COL NAME1': (2.14, 'E'), 'COL NAME2': (5.14, 'E'), 'COL NAME3': (10, 'G'), 'COL NAME4' : (5, 'E')}
eWeights = { key: value for key, value in weights.items() if value[1] == 'E'}
gWeights = { key: value for key, value in weights.items() if value[1] == 'G'}
#Total should be the result of adding each of the columns per COL NAME0 row
#and dividing by the sum of the weight values.
scaledData['Total'] = scaledData.sum(axis = 1, skipna = True)/ sum(list(weights.values())[0])
#Same calculation on only columns marked 'E'
for key in eWeights:
    scaledData['E'] = scaledData['E'] + scaledData[key]
scaledData['E'] = scaledData['E'] / sum(list(eWeights.values())[0])
Unfortunately, the code above raises the following error (caused by the line that creates the Total column in scaledData):
TypeError: unsupported operand type(s) for +: 'float' and 'str'
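I have simplified scaledData and weights, but any solution or suggestion would be helpful, since my actual df has many more rows and columns. Thanks for your help, and let me know if more information is needed.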
Answer 0 (score: 0)
It looks like some of the values in your df may not be stored as floats. Try:
for key in eWeights:
    scaledData['E'] = scaledData['E'].astype(float) + scaledData[key].astype(float)
scaledData['E'] / sum(list(eWeights.values())[0])
# should this be a print? Are you trying to set any values?
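Another thing worth checking is the divisor. list(weights.values())[0] is the first tuple, (2.14, 'E'), so calling sum() on it adds a float to a string, which on its own raises the TypeError quoted in the question. The loop also reads scaledData['E'] before that column exists. Below is a minimal sketch of one way to compute both columns, assuming the intent is to divide each row sum by the sum of the numeric weights. The sample rows and the names eCols, totalWeight and eWeight are made up for illustration and are not from the question.

```python
import pandas as pd

# Small stand-in for scaledData; these values are illustrative only.
scaledData = pd.DataFrame({
    'COL NAME0': ['Alabama', 'Alaska'],
    'COL NAME1': [4.0, 1.0],
    'COL NAME2': [2.0, 3.0],
    'COL NAME3': [2.0, 1.0],
    'COL NAME4': [1.0, 1.0],
})

weights = {'COL NAME1': (2.14, 'E'), 'COL NAME2': (5.14, 'E'),
           'COL NAME3': (10, 'G'), 'COL NAME4': (5, 'E')}

# Names of the columns tagged 'E' (hypothetical helper name).
eCols = [col for col, (_, tag) in weights.items() if tag == 'E']

# Sum only the numeric weight (first tuple element) of every entry.
totalWeight = sum(w for w, _ in weights.values())
eWeight = sum(weights[col][0] for col in eCols)

# Select the weighted columns explicitly so the string column
# 'COL NAME0' never takes part in the row-wise addition.
scaledData['Total'] = scaledData[list(weights)].sum(axis=1, skipna=True) / totalWeight
scaledData['E'] = scaledData[eCols].sum(axis=1, skipna=True) / eWeight

print(scaledData[['COL NAME0', 'Total', 'E']])
```

Selecting columns by name also avoids the explicit loop, and it keeps the newly created Total column out of the later E calculation. If the real df stores numbers as strings, converting those columns first (for example with astype(float), as above) would still be needed.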