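I have an empty matrix, and I want to set an element to 1 if the country (index) belongs to the region (column).
I tried writing a double loop, but I get stuck when I need to apply the condition. The matrix is [152 rows x 6 columns]. Thanks a lot.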
             west europe  east europe  latin america
Norway       0            0            0
Denmark      0            0            0
Iceland      0            0            0
Switzerland  0            0            0
Finland      0            0            0
Netherlands  0            0            0
Sweden       0            0            0
Austria      0            0            0
Ireland      0            0            0
Germany      0            0            0
Belgium      0            0            0
I was thinking of something like this:
import numpy as np
import pandas as pd

# data is the DataFrame with 'Country' and 'Region' columns shown below
regions = ['west europe', 'east europe', 'latin america', 'north america', 'africa', 'asia']
matrix = pd.DataFrame(np.zeros((len(data), len(regions)), dtype=int),
                      index=data['Country'], columns=regions)
print(matrix)
for i in range(len(matrix)):              # rows: countries
    for j in range(len(matrix.columns)):  # columns: regions
        # assumes the values in data['Region'] are spelled like the column labels
        if data['Region'].iloc[i].lower() == matrix.columns[j]:
            matrix.iloc[i, j] = 1
        else:
            matrix.iloc[i, j] = 0
print(matrix)
Sample data frame with countries and region:
   Country      Happiness Rank  Happiness Score  Economy   Family    Health    Freedom   Generosity  Corruption  Dystopia  Job Satisfaction  Region
0  Norway       1               7.537            1.616463  1.533524  0.796667  0.635423  0.362012    0.315964    2.277027  94.6              Western Europe
1  Denmark      2               7.522            1.482383  1.551122  0.792566  0.626007  0.355280    0.400770    2.313707  93.5              Western Europe
2  Iceland      3               7.504            1.480633  1.610574  0.833552  0.627163  0.475540    0.153527    2.322715  94.5              Western Europe
3  Switzerland  4               7.494            1.564980  1.516912  0.858131  0.620071  0.290549    0.367007    2.276716  93.7              Western Europe
4  Finland      5               7.469            1.443572  1.540247  0.809158  0.617951  0.245483    0.382612    2.430182  91.2              Western Europe
5  Netherlands  6               7.377            1.503945  1.428939  0.810696  0.585384  0.470490    0.282662    2.294804  93.8              Western Europe
Answer 0 (score: 2)
If the input variable data is a DataFrame, then, as @Alollz mentioned, you can use the pandas pd.get_dummies function.
Something like this: pd.get_dummies(data, columns=['Region'])
The output looks like this:
   Country      HappinessRank  HappinessScore  Economy   Family    Health    Freedom   Generosity  Corruption  Dystopia  JobSatisfaction  Region_WesternEurope
0  Norway       1              7.537           1.616463  1.533524  0.796667  0.635423  0.362012    0.315964    2.277027  94.6             1
1  Denmark      2              7.522           1.482383  1.551122  0.792566  0.626007  0.355280    0.400770    2.313707  93.5             1
2  Iceland      3              7.504           1.480633  1.610574  0.833552  0.627163  0.475540    0.153527    2.322715  94.5             1
3  Switzerland  4              7.494           1.564980  1.516912  0.858131  0.620071  0.290549    0.367007    2.276716  93.7             1
4  Finland      5              7.469           1.443572  1.540247  0.809158  0.617951  0.245483    0.382612    2.430182  91.2             1
5  Netherlands  6              7.377           1.503945  1.428939  0.810696  0.585384  0.470490    0.282662    2.294804  93.8             1
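It takes the categorical Region column and turns it into indicator columns, one per distinct region. In this case it uses the column name as the prefix, but you can change that with the prefix and prefix_sep arguments.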
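If what you want is exactly the country-by-region matrix from the question, one option is to call pd.get_dummies on just the Region column, with Country as the index. Below is a minimal sketch of that idea. The four rows are a made-up sample for illustration (the Poland and Brazil rows and their region labels are assumptions, not taken from the real data), and dtype=int is there so the result holds 0/1 rather than booleans.

```python
import pandas as pd

# Hypothetical sample: the real `data` has 152 rows and many more columns.
data = pd.DataFrame({
    "Country": ["Norway", "Denmark", "Poland", "Brazil"],
    "Region": ["Western Europe", "Western Europe",
               "Eastern Europe", "Latin America"],
})

# Use Country as the index and one-hot encode the Region Series.
# For a Series, get_dummies names the columns after the category values.
matrix = pd.get_dummies(data.set_index("Country")["Region"], dtype=int)

print(matrix)
```

Each row then has a single 1 in the column of the region the country belongs to, so no explicit double loop is needed.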