Wide to long on multiple columns

Time: 2019-11-27 10:03:29

Tags: python pandas

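I want to convert wide data to long format.

I learned how to do this with pd.melt, but it doesn't seem to work when you have multiple columns. So I tried using pd.concat together with pd.melt, but that did not produce the result I want.

Here is a sample of my data: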

{'secondBoxer1': {0: '"Cody Sons"',
  1: '"Billy Martin"',
  2: '"Jennifer Michelle Woods"',
  3: '"Francisco Velasquez"',
  4: '"Mark Anderson"'},
 'secondBoxerLast61': {0: '[] ',
  1: '["unknown" unknown unknown unknown loss loss] ',
  2: '[] ',
  3: '[] ',
  4: '["loss" loss loss loss loss loss] '},
 'secondBoxerRating1': {0: '[null null] ',
  1: '[null null] ',
  2: '[null null] ',
  3: '[null null] ',
  4: '[null null] '},
 'secondBoxerRecord1': {0: '{"draw"0 loss0 win0} ',
  1: '{"draw"0 loss2 win0} ',
  2: '{"draw"0 loss0 win0} ',
  3: '{"draw"0 loss0 win0} ',
  4: '{"draw"1 loss14 win3} '},
 'secondBoxerWeight1': {0: 214.0, 1: 198.0, 2: nan, 3: 108.75, 4: 163.5},
 'secondBoxer2': {0: '"Tamis Long"',
  1: '"Danyelle Williams"',
  2: '"Leesa Daniels"',
  3: '"Hector Herrera"',
  4: '"Coy Lanbert"'},
 'secondBoxerLast62': {0: '["unknown" unknown unknown win loss loss] ',
  1: '["unknown" unknown unknown unknown win win] ',
  2: '["unknown" unknown loss win loss loss] ',
  3: '["unknown" unknown unknown unknown unknown loss] ',
  4: '["win" loss loss win win win] '},
 'secondBoxerRating2': {0: '[null null] ',
  1: '[null null] ',
  2: '[null null] ',
  3: '[null null] ',
  4: '[null null] '},
 'secondBoxerRecord2': {0: '{"draw"0 loss2 win1} ',
  1: '{"draw"0 loss0 win2} ',
  2: '{"draw"0 loss3 win1} ',
  3: '{"draw"0 loss1 win0} ',
  4: '{"draw"0 loss19 win6} '},
 'secondBoxerWeight2': {0: 207.5, 1: 238.25, 2: 122.0, 3: 106.75, 4: 161.0},
 'secondBoxer3': {0: '"Davin Clark"',
  1: '"Delbert Peters"',
  2: '"Kanisca Feliciano Ruiz"',
  3: '"Luis Gonzalez"',
  4: nan},
 'secondBoxerLast63': {0: '["unknown" unknown unknown unknown unknown win] ',
  1: '[] ',
  2: '[] ',
  3: '["loss" loss win loss loss loss] ',
  4: '[] '},
 'secondBoxerRating3': {0: '[null null] ',
  1: '[null null] ',
  2: '[null null] ',
  3: '[null null] ',
  4: '[null null] '},
 'secondBoxerRecord3': {0: '{"draw"0 loss0 win1} ',
  1: '{"draw"0 loss0 win0} ',
  2: '{"draw"0 loss0 win0} ',
  3: '{"draw"0 loss6 win1} ',
  4: '{"draw"null lossnull winnull} '},
 'secondBoxerWeight3': {0: 198.0, 1: 300.0, 2: 114.0, 3: nan, 4: nan},
 'secondBoxer4': {0: '"Fernando Caro"',
  1: '"John Carmona"',
  2: nan,
  3: '"Jose Meza"',
  4: nan},
 'secondBoxerLast64': {0: '["unknown" unknown win win win loss] ',
  1: '[] ',
  2: nan,
  3: '[] ',
  4: nan},
 'secondBoxerRating4': {0: '[null null] ',
  1: '[null null] ',
  2: nan,
  3: '[null null] ',
  4: nan},
 'secondBoxerRecord4': {0: '{"draw"0 loss1 win3} ',
  1: '{"draw"0 loss0 win0} ',
  2: nan,
  3: '{"draw"0 loss0 win0} ',
  4: nan},
 'secondBoxerWeight4': {0: 212.5, 1: 314.75, 2: nan, 3: nan, 4: nan},
 'secondBoxer5': {0: '"Fernando Caro"',
  1: '"Ryan Watson"',
  2: nan,
  3: '"Jose Gutierrez"',
  4: nan},
 'secondBoxerLast65': {0: '["unknown" win win win loss loss] ',
  1: '["unknown" win win draw loss win] ',
  2: nan,
  3: '["unknown" unknown unknown unknown loss loss] ',
  4: nan},
 'secondBoxerRating5': {0: '[null null] ',
  1: '[null null] ',
  2: nan,
  3: '[null null] ',
  4: nan},
 'secondBoxerRecord5': {0: '{"draw"0 loss2 win3} ',
  1: '{"draw"1 loss1 win3} ',
  2: nan,
  3: '{"draw"0 loss2 win0} ',
  4: nan},
 'secondBoxerWeight5': {0: 202.0, 1: 281.0, 2: nan, 3: 114.75, 4: nan},
 'name': {0: 'Roberto Salas',
  1: 'James Jackson',
  2: 'Alex Love',
  3: 'Juan Centeno',
  4: 'Jordan Weeks'}}
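
To reproduce this, the dict above can be loaded straight into a DataFrame. A minimal sketch, assuming the bare `nan` in the dump is NumPy's `nan`; it uses only a trimmed excerpt of the sample (three rows, slots 1 and 2, two fields) to stay short, and `tist` is the frame name used in my code:

```python
import pandas as pd
from numpy import nan  # the dict dump uses a bare `nan`

# Trimmed excerpt of the sample: opponent name and weight for slots 1 and 2.
sample = {
    'secondBoxer1': {0: '"Cody Sons"', 1: '"Billy Martin"', 2: '"Jennifer Michelle Woods"'},
    'secondBoxerWeight1': {0: 214.0, 1: 198.0, 2: nan},
    'secondBoxer2': {0: '"Tamis Long"', 1: '"Danyelle Williams"', 2: '"Leesa Daniels"'},
    'secondBoxerWeight2': {0: 207.5, 1: 238.25, 2: 122.0},
    'name': {0: 'Roberto Salas', 1: 'James Jackson', 2: 'Alex Love'},
}

# A dict of {column: {row_index: value}} maps directly onto columns and rows.
tist = pd.DataFrame(sample)
print(tist)
```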

Here is what I wrote:

from itertools import chain
import pandas as pd

testing = list(chain.from_iterable(('secondBoxer'+str(i), 'secondBoxerLast6'+str(i),'secondBoxerRating'+str(i),'secondBoxerRecord'+str(i),'secondBoxerWeight'+str(i)) for i in range(1, 6)))
# tist is my DataFrame (sample shown above); melt one column at a time and stack the results
pd.concat([pd.melt(tist,id_vars='name',value_vars=i,var_name='vars',value_name='value') for i in testing])

This produces the following DataFrame:

         name             vars              value
0   Roberto Salas   secondBoxer1         "Cody Sons"
1   James Jackson   secondBoxer1         "Billy Martin"
2   Alex Love       secondBoxer1      "Jennifer Michelle Woods"
3   Juan Centeno    secondBoxer1       "Francisco Velasquez"
4   Jordan Weeks    secondBoxer1          "Mark Anderson"
...     ...              ...                 ...
0   Roberto Salas   secondBoxerWeight5       202
1   James Jackson   secondBoxerWeight5       281
2   Alex Love       secondBoxerWeight5       NaN
3   Juan Centeno    secondBoxerWeight5      114.75
4   Jordan Weeks    secondBoxerWeight5       NaN
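
As far as I can tell, the concat is not what causes this layout: melt folds every selected column into a single `vars`/`value` pair, so one `pd.melt` call over the whole list should give the same stacked rows. A small sketch on a trimmed excerpt of the sample (two rows, two slots, two fields), where `cols` is just an illustrative name:

```python
import pandas as pd

tist = pd.DataFrame({
    'name': ['Roberto Salas', 'James Jackson'],
    'secondBoxer1': ['"Cody Sons"', '"Billy Martin"'],
    'secondBoxerWeight1': [214.0, 198.0],
    'secondBoxer2': ['"Tamis Long"', '"Danyelle Williams"'],
    'secondBoxerWeight2': [207.5, 238.25],
})
cols = [c for c in tist.columns if c != 'name']

# One melt per column, then stacked (the approach I used).
stacked = pd.concat(
    [pd.melt(tist, id_vars='name', value_vars=[c], var_name='vars', value_name='value')
     for c in cols],
    ignore_index=True,
)

# One melt over all columns at once.
single = pd.melt(tist, id_vars='name', value_vars=cols, var_name='vars', value_name='value')

# Both have one row per (name, original column) and the same three columns.
print(stacked.shape, single.shape)
```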

I would like this as my output:

         name             vars              value                  vars2                values2      
0   Roberto Salas   secondBoxer1         "Cody Sons"            secondBoxerWeight1        xx
1   James Jackson   secondBoxer1         "Billy Martin"         secondBoxerWeight1        xx
2   Alex Love       secondBoxer1      "Jennifer Michelle Woods" secondBoxerWeight1        xx
3   Juan Centeno    secondBoxer1       "Francisco Velasquez"    secondBoxerWeight1        xx
4   Jordan Weeks    secondBoxer1          "Mark Anderson"       secondBoxerWeight1        xx

vars3-6 would be the columns for secondBoxerLast6, secondBoxerRating, and secondBoxerRecord for each fighter in my dataset.
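
One direction that might work here is `pd.wide_to_long`, which reads each column name as a stub plus a numeric suffix. This is a minimal sketch on a trimmed excerpt of the sample, not a confirmed answer. It assumes `name` is unique per row, the `slot` column name is made up for illustration, and the result has one column per field with a slot number rather than the `vars`/`value` pairs drawn above:

```python
import pandas as pd
from numpy import nan

# Trimmed excerpt of the sample: three rows, slots 1 and 2, three of the five fields.
tist = pd.DataFrame({
    'name': ['Roberto Salas', 'James Jackson', 'Alex Love'],
    'secondBoxer1': ['"Cody Sons"', '"Billy Martin"', '"Jennifer Michelle Woods"'],
    'secondBoxerLast61': ['[] ', '["unknown" unknown unknown unknown loss loss] ', '[] '],
    'secondBoxerWeight1': [214.0, 198.0, nan],
    'secondBoxer2': ['"Tamis Long"', '"Danyelle Williams"', '"Leesa Daniels"'],
    'secondBoxerLast62': ['["unknown" unknown unknown win loss loss] ',
                          '["unknown" unknown unknown unknown win win] ',
                          '["unknown" unknown loss win loss loss] '],
    'secondBoxerWeight2': [207.5, 238.25, 122.0],
})

# Column prefixes; the trailing digit(s) after each prefix become the `slot` value.
stubs = ['secondBoxer', 'secondBoxerLast6', 'secondBoxerWeight']

long_df = (
    pd.wide_to_long(tist, stubnames=stubs, i='name', j='slot', suffix=r'\d+')
    .reset_index()
    .sort_values(['slot', 'name'], ignore_index=True)
)
print(long_df)
```

On the full data the stub list would also need `secondBoxerRating` and `secondBoxerRecord`.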

0 answers:

No answers