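Python - Take rows of data and put them into a single column

Time: 2017-08-20 11:36:24

Tags: python pandas

I have the following DataFrame, which has a large number of rows. I want to take the multiple award columns and collapse them into a single column.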

Player         |    0             |     1     |     2     | 3         |  4
Edgerrin James | 1st Tm All-Conf. | AP 1st Tm | FW 1st Tm | SN 1st Tm | Pro Bowl
Tony Gonzalez  | 1st Tm All-Conf. | AP 1st Tm | None      | None      | None
...            |  ...             | ...       | ...       | ...       | ...

I am trying to figure out how to restructure it so that all the awards end up in one column. The resulting DataFrame would look like this:

Player         | awardID
Edgerrin James | 1st Tm All-Conf.
Edgerrin James | AP 1st Tm
Edgerrin James | FW 1st Tm 
Edgerrin James | SN 1st Tm
Edgerrin James | Pro Bowl
Tony Gonzalez  | 1st Tm All-Conf.
Tony Gonzalez  | AP 1st Tm

It is fine if the None cells are included as well, since I know how to filter those out; it is the first part I can't figure out.

2 answers:

Answer 0 (score: 2)

Use set_index on Player, then stack:

In [750]: df.set_index('Player').stack().reset_index(name='awardID').drop('level_1', axis=1)
Out[750]:
           Player           awardID
0  Edgerrin James  1st Tm All-Conf.
1  Edgerrin James         AP 1st Tm
2  Edgerrin James         FW 1st Tm
3  Edgerrin James         SN 1st Tm
4  Edgerrin James          Pro Bowl
5   Tony Gonzalez  1st Tm All-Conf.
6   Tony Gonzalez         AP 1st Tm
7   Tony Gonzalez              None
8   Tony Gonzalez              None
9   Tony Gonzalez              None

Optionally, remove the None rows with query:

In [757]: (df.set_index('Player')
             .stack()
             .reset_index(name='awardID')
             .drop('level_1', axis=1)
             .query('awardID != "None"'))
Out[757]:
           Player           awardID
0  Edgerrin James  1st Tm All-Conf.
1  Edgerrin James         AP 1st Tm
2  Edgerrin James         FW 1st Tm
3  Edgerrin James         SN 1st Tm
4  Edgerrin James          Pro Bowl
5   Tony Gonzalez  1st Tm All-Conf.
6   Tony Gonzalez         AP 1st Tm
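
For anyone who wants to run the above outside an IPython session, here is a minimal self-contained sketch. The sample frame is rebuilt by hand from the question, and it assumes the empty cells hold the string "None" (which is what the query filter compares against); the name long_df is only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the question. Assumption: empty cells are the
# string "None", matching the query() filter used in the answer.
df = pd.DataFrame(
    [
        ["Edgerrin James", "1st Tm All-Conf.", "AP 1st Tm", "FW 1st Tm", "SN 1st Tm", "Pro Bowl"],
        ["Tony Gonzalez", "1st Tm All-Conf.", "AP 1st Tm", "None", "None", "None"],
    ],
    columns=["Player", 0, 1, 2, 3, 4],
)

long_df = (
    df.set_index("Player")            # keep the player as the row label
      .stack()                        # move columns 0..4 into the row index
      .reset_index(name="awardID")    # back to columns: Player, level_1, awardID
      .drop(columns="level_1")        # the old column label is no longer needed
      .query('awardID != "None"')     # drop the placeholder cells
)
print(long_df)
```

If the empty cells are real missing values (None/NaN) rather than the string "None", .dropna(subset=["awardID"]) would be the analogous filter in place of query.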

Answer 1 (score: 1)

A solution without pandas. First save each row as a string, such as s1 and s2 below:

  def mylist(string):
    # Split the row on '|': the first field is the player, the rest are awards
    string = string.split('|')
    length = len(string) - 1
    for i in range(length):
      # Print the player next to each award (a one-element list slice)
      print(string[0], string[i+1:i+2], '\n')

  s1 = 'Edgerrin James | 1st Tm All-Conf. | AP 1st Tm | FW 1st Tm | SN 1st Tm | Pro Bowl'

  s2 = 'Tony Gonzalez  | 1st Tm All-Conf. | AP 1st Tm | None      | None      | None'

  mylist(s1)
  mylist(s2)

Output:

Edgerrin James  [' 1st Tm All-Conf. '] 

Edgerrin James  [' AP 1st Tm '] 

Edgerrin James  [' FW 1st Tm '] 

Edgerrin James  [' SN 1st Tm '] 

Edgerrin James  [' Pro Bowl'] 

Tony Gonzalez   [' 1st Tm All-Conf. '] 

Tony Gonzalez   [' AP 1st Tm '] 

Tony Gonzalez   [' None      '] 

Tony Gonzalez   [' None      '] 

Tony Gonzalez   [' None'] 

Do this for every player and every row.
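
One way to apply the same idea to every row at once is sketched below. It is only an illustration: split_awards and its skip parameter are hypothetical names, not part of the answer, and it assumes each row is a '|'-separated string with the player in the first field.

```python
def split_awards(lines, skip="None"):
    """Turn 'Player | award | award ...' lines into (player, award) pairs."""
    pairs = []
    for line in lines:
        # Split on the column separator and trim the padding around each cell.
        player, *awards = [cell.strip() for cell in line.split("|")]
        for award in awards:
            if award != skip:  # leave out placeholder cells
                pairs.append((player, award))
    return pairs


rows = [
    "Edgerrin James | 1st Tm All-Conf. | AP 1st Tm | FW 1st Tm | SN 1st Tm | Pro Bowl",
    "Tony Gonzalez  | 1st Tm All-Conf. | AP 1st Tm | None      | None      | None",
]

for player, award in split_awards(rows):
    print(player, "|", award)
```

Returning pairs instead of printing inside the function keeps the result reusable, for example for writing to a file; stripping each cell also avoids the stray spaces seen in the output above.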