How can I do the reverse of "one-hot" encoding for this example?

Time: 2019-07-17 13:25:11

Tags: python encoding multiple-columns feature-extraction

I have a pandas dataset with many columns, including:

  1. Country name
  2. Country code
  3. Indicator name
  4. 1960
  5. 1961
  6. 1962 ... and so on through 2015

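What I want to do is collapse all of the years into one column (put 1960 through 2015 into a single column called year).

Similarly, there are values under the 1960, 1961, etc. columns, and I want to put those values into a single column called values.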

At the end of the task, I want a dataset like this:

  1. Country name
  2. Country code
  3. Indicator name
  4. Year

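I know how to do "one-hot" encoding with get_dummies(), but I'm not sure how to make the individual year values appear in a single column, that is, how to collapse the 56 year columns (1960 to 2015) into one.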

df = {"country_name": ["Afganistan", "Afganistan", "Afganistan", "Argentina", "Argentina", "Argentina", "Chile","Chile", "Chile", "England", "England", "England", "France", "France", "France", "Germany", "Germany", "Germany", "United States", "United States", "United States", "Bolivia", "Bolivia", "Bolivia"], 
"country_code": ["AFG","AFG","AFG", "ARG","ARG","ARG","CHI","CHI", "CHI","ENG","ENG","ENG","FRA","FRA","FRA","GER","GER","GER","USA", "USA","USA","BOL","BOL","BOL"],
"indicator_name":["adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export"],
1960: [8,5,6, 5,3,9, 11,4,2, 20,17,6, 0,23,17, 8,6,22, 18,14,13, 10,2,8], 
1961: [9,3,9, 6,4,10, 12,5,3, 21,18,7, 1,24,18, 9,7,23, 19,15,14, 11,3,9], 
1962: [10,4,10, 7,5,11, 13,6,4, 22,19,8, 2,25,19, 10,8,24, 20,16,15, 12,4,10]}

And so on, continuing through 2015.

The result for the relevant columns should look like this:

df = {"country_name": ["Afganistan", "Afganistan", "Afganistan", "Afganistan", "Afganistan", "Afganistan", "Afganistan", "Afganistan", "Afganistan", "Argentina", "Argentina", "Argentina", "Argentina", "Argentina", "Argentina", "Argentina", "Argentina", "Argentina", "Chile","Chile", "Chile", "Chile","Chile", "Chile", "Chile","Chile", "Chile", "England", "England", "England", "England", "England", "England", "England", "England", "England", "France", "France", "France", "France", "France", "France", "France", "France", "France", "Germany", "Germany", "Germany", "Germany", "Germany", "Germany", "Germany", "Germany", "Germany", "United States", "United States", "United States", "United States", "United States", "United States", "United States", "United States", "United States", "Bolivia", "Bolivia", "Bolivia", "Bolivia", "Bolivia", "Bolivia", "Bolivia", "Bolivia", "Bolivia"], 
"country_code": ["AFG","AFG","AFG", "AFG","AFG","AFG", "AFG","AFG","AFG", "ARG","ARG","ARG", "ARG","ARG","ARG", "ARG","ARG","ARG", "CHI","CHI", "CHI", "CHI","CHI", "CHI", "CHI","CHI", "CHI","ENG","ENG","ENG", "ENG","ENG","ENG", "ENG","ENG","ENG","FRA","FRA","FRA", "FRA","FRA","FRA","FRA","FRA","FRA", "GER","GER","GER", "GER","GER","GER", "GER","GER","GER", "USA", "USA","USA", "USA", "USA","USA", "USA", "USA","USA", "BOL","BOL","BOL", "BOL","BOL","BOL", "BOL","BOL","BOL"],
"indicator_name":["adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export",  "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export",  "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export",  "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export",  "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export",  "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export",  "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export",  "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export", "adolescent fertility rate", "age dependency ratio", "arms export"],
"year": [1960, 1960, 1960, 1961, 1961, 1961, 1962, 1962, 1962, 1960, 1960, 1960, 1961, 1961, 1961, 1962, 1962, 1962, 1960, 1960, 1960, 1961, 1961, 1961, 1962, 1962, 1962, 1960, 1960, 1960, 1961, 1961, 1961, 1962, 1962, 1962, 1960, 1960, 1960, 1961, 1961, 1961, 1962, 1962, 1962, 1960, 1960, 1960, 1961, 1961, 1961, 1962, 1962, 1962, 1960, 1960, 1960, 1961, 1961, 1961, 1962, 1962, 1962, 1960, 1960, 1960, 1961, 1961, 1961, 1962, 1962, 1962],
"value": [8,5,6,9,3,9,10,4,10,  5,3,9,6,4,10,7,5,11,  11,4,2,12,5,3,13,6,4,  20,17,6,21,18,7,22,19,8,  0,23,17,1,24,18,2,25,19,  8,6,22,9,7,23,10,8,24,  18,14,13,19,15,14,20,16,15, 10,2,8,11,3,9,12,4,10] 
}
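
The reshaping described above is usually called a wide-to-long reshape rather than a reverse of one-hot encoding. A minimal sketch of one way to do it is pandas' `DataFrame.melt`, shown here on the three sample year columns only. It assumes the first year column is meant to be labeled 1960 (as in the expected result); the names `wide`, `tidy`, `id_cols`, and `year_cols` are illustrative and not part of the question.

```python
import pandas as pd

# Sample data from the question, limited to the three year columns shown.
# Assumption: the first year column is 1960, matching the expected result.
countries = ["Afganistan", "Argentina", "Chile", "England",
             "France", "Germany", "United States", "Bolivia"]
codes = ["AFG", "ARG", "CHI", "ENG", "FRA", "GER", "USA", "BOL"]
indicators = ["adolescent fertility rate", "age dependency ratio", "arms export"]

wide = pd.DataFrame({
    # Each country/code is repeated once per indicator.
    "country_name": [c for c in countries for _ in indicators],
    "country_code": [c for c in codes for _ in indicators],
    "indicator_name": indicators * len(countries),
    1960: [8, 5, 6, 5, 3, 9, 11, 4, 2, 20, 17, 6,
           0, 23, 17, 8, 6, 22, 18, 14, 13, 10, 2, 8],
    1961: [9, 3, 9, 6, 4, 10, 12, 5, 3, 21, 18, 7,
           1, 24, 18, 9, 7, 23, 19, 15, 14, 11, 3, 9],
    1962: [10, 4, 10, 7, 5, 11, 13, 6, 4, 22, 19, 8,
           2, 25, 19, 10, 8, 24, 20, 16, 15, 12, 4, 10],
})

# Identifier columns stay as they are; every other column is treated as a year.
id_cols = ["country_name", "country_code", "indicator_name"]
year_cols = [c for c in wide.columns if c not in id_cols]

# Turn the year columns into (year, value) pairs, one row per pair.
tidy = wide.melt(id_vars=id_cols, value_vars=year_cols,
                 var_name="year", value_name="value")
tidy["year"] = tidy["year"].astype(int)

# melt groups rows by year; sort so that each country's rows sit together.
# Note this orders countries by code, not by their original order.
tidy = (tidy.sort_values(["country_code", "year", "indicator_name"])
            .reset_index(drop=True))

print(tidy.head(9))
```

Because `year_cols` is derived from whatever columns are not identifiers, the same sketch should carry over to the full 1960 to 2015 range without listing each year by hand, provided the real data has no other non-year columns.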

0 answers:

No answers