Slicing and filling rows/columns in pandas

Time: 2017-05-09 20:48:23

Tags: python pandas

I have a small sample of data:

import pandas as pd
d = {
 'title': ['string1', 'string2', 'string3', 'string4', 'string5', 'string6'],
 'Num/Den': ['Numerator', 'Denominator', 'Numerator', 'Denominator', 'Numerator','Denominator', 
             'Numerator','Denominator','Numerator', 'Denominator', 'Numerator', 'Denominator'],
 'two': ['tstring1', 'tstring2', 'tstring3', 'tstring4', 'tstring5', 'tstring6', 
         'tstring7', 'tstring8','tstring9','tstring10','tstring11','tstring12']

}
df = pd.DataFrame(d)

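This code does not work, because the columns do not all have the same number of rows! I don't know another way to show the raw data on Stack Overflow.

The data looks like this: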

title         Num/Den             two
string1       Numerator          tstring1
              Denominator        tstring2
string2       Numerator          tstring3
              Denominator        tstring4
string3       Numerator          tstring5
              Denominator        tstring6    
string4       Numerator          tstring7
              Denominator        tstring8    
string5       Numerator          tstring9
              Denominator        tstring10
string6       Numerator          tstring11
              Denominator        tstring12 

I want my data to look like this, basically just filling each empty cell with the value from the cell above it:

title         Num/Den             two
string1       Numerator          tstring1
string1       Denominator        tstring2
string2       Numerator          tstring3
string2       Denominator        tstring4
string3       Numerator          tstring5
string3       Denominator        tstring6    
string4       Numerator          tstring7
string4       Denominator        tstring8    
string5       Numerator          tstring9
string5       Denominator        tstring10
string6       Numerator          tstring11
string6       Denominator        tstring12 

2 Answers:

Answer 0: (score: 2)

You can replace the empty strings with NaN/None and then forward-fill with ffill:

import numpy as np

# pd.np was removed in pandas 2.0, so use numpy's nan directly
df['title'] = df.title.replace("", np.nan).ffill()
df

#       Num/Den   title     two
#0  Numerator   string1 tstring1
#1  Denominator string1 tstring2
#2  Numerator   string2 tstring3
#3  Denominator string2 tstring4
# ...
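
Since the construction code in the question does not run, here is a minimal self-contained sketch of the same idea. It assumes the blank cells in `title` are empty strings (the question does not say whether they are empty strings or already NaN) and uses a shortened three-title sample for illustration.

```python
import numpy as np
import pandas as pd

# Assumption: blank 'title' cells are empty strings, as the printout suggests.
df = pd.DataFrame({
    "title": ["string1", "", "string2", "", "string3", ""],
    "Num/Den": ["Numerator", "Denominator"] * 3,
    "two": ["tstring1", "tstring2", "tstring3",
            "tstring4", "tstring5", "tstring6"],
})

# ffill only fills missing values, so empty strings must become NaN first.
df["title"] = df["title"].replace("", np.nan).ffill()

filled_titles = df["title"].tolist()
print(df)
```

If the blanks were already NaN (for example after reading a spreadsheet), the `replace` step would likely be unnecessary and `ffill()` alone should do.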

Answer 1: (score: 2)

You can use numpy's repeat function:

import numpy as np

# Repeat each title twice so 'title' has as many rows as the other columns
d['title'] = np.repeat(d['title'], 2)
df = pd.DataFrame(d)

The resulting DataFrame:

      Num/Den    title        two
0     Numerator  string1   tstring1
1   Denominator  string1   tstring2
2     Numerator  string2   tstring3
3   Denominator  string2   tstring4
4     Numerator  string3   tstring5
5   Denominator  string3   tstring6
6     Numerator  string4   tstring7
7   Denominator  string4   tstring8
8     Numerator  string5   tstring9
9   Denominator  string5  tstring10
10    Numerator  string6  tstring11
11  Denominator  string6  tstring12
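
As a runnable sketch of this approach on a shortened sample: it assumes every title covers exactly two rows (one Numerator, one Denominator), which holds for the data shown but may not hold in general. The `titles` list and the generated `two` values are illustrative only.

```python
import numpy as np
import pandas as pd

# Assumption: each title spans exactly one Numerator/Denominator pair.
titles = ["string1", "string2", "string3"]

d = {
    # np.repeat repeats each element in place: a, a, b, b, c, c
    "title": np.repeat(titles, 2),
    "Num/Den": ["Numerator", "Denominator"] * len(titles),
    "two": [f"tstring{i}" for i in range(1, 2 * len(titles) + 1)],
}
df = pd.DataFrame(d)

repeated_titles = df["title"].tolist()
print(df)
```

Note that this builds the column before the DataFrame exists, so it fits the case where the titles are available as a separate list. For blanks inside an existing DataFrame, forward-filling is probably the more direct route.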