I have a small sample dataset:
import pandas as pd
d = {
'title': ['string1', 'string2', 'string3', 'string4', 'string5', 'string6'],
'Num/Den': ['Numerator', 'Denominator', 'Numerator', 'Denominator', 'Numerator','Denominator',
'Numerator','Denominator','Numerator', 'Denominator', 'Numerator', 'Denominator'],
'two': ['tstring1', 'tstring2', 'tstring3', 'tstring4', 'tstring5', 'tstring6',
'tstring7', 'tstring8','tstring9','tstring10','tstring11','tstring12']
}
df = pd.DataFrame(d)
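This code does not work, because the columns do not all have the same number of rows! I don't know how else to show the raw data on Stack Overflow.
The data looks like this (the blank cells in the title column are empty):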
title    Num/Den      two
string1  Numerator    tstring1
         Denominator  tstring2
string2  Numerator    tstring3
         Denominator  tstring4
string3  Numerator    tstring5
         Denominator  tstring6
string4  Numerator    tstring7
         Denominator  tstring8
string5  Numerator    tstring9
         Denominator  tstring10
string6  Numerator    tstring11
         Denominator  tstring12
I want my data to look like this, basically just filling each empty cell with the value from the cell above it:
title Num/Den two
string1 Numerator tstring1
string1 Denominator tstring2
string2 Numerator tstring3
string2 Denominator tstring4
string3 Numerator tstring5
string3 Denominator tstring6
string4 Numerator tstring7
string4 Denominator tstring8
string5 Numerator tstring9
string5 Denominator tstring10
string6 Numerator tstring11
string6 Denominator tstring12
Answer 0 (score: 2)
You can replace the empty strings with NaN/None and then call ffill:
import numpy as np
# pd.np was removed in pandas 2.0, so take nan from numpy directly
df['title'] = df['title'].replace("", np.nan).ffill()
df
# Num/Den title two
#0 Numerator string1 tstring1
#1 Denominator string1 tstring2
#2 Numerator string2 tstring3
#3 Denominator string2 tstring4
# ...
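For anyone who wants to run this end to end, here is a minimal sketch on a shortened version of the data. It assumes the blank title cells are empty strings, which the question does not actually confirm; if the data came from read_csv, empty fields are usually NaN already and the replace step would be a no-op.

```python
import numpy as np
import pandas as pd

# Assumption: blank title cells are represented as empty strings.
df = pd.DataFrame({
    'title': ['string1', '', 'string2', '', 'string3', ''],
    'Num/Den': ['Numerator', 'Denominator'] * 3,
    'two': ['tstring1', 'tstring2', 'tstring3',
            'tstring4', 'tstring5', 'tstring6'],
})

# Turn empty strings into NaN, then forward-fill from the row above.
df['title'] = df['title'].replace('', np.nan).ffill()

print(df)
```

ffill only copies values downward, so a blank in the very first row would stay NaN.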
Answer 1 (score: 2)
You can use numpy's repeat function:
import numpy as np
# repeat each title twice so it lines up with the 12-row columns
d['title'] = np.repeat(d['title'], 2)
df = pd.DataFrame(d)
The resulting DataFrame:
Num/Den title two
0 Numerator string1 tstring1
1 Denominator string1 tstring2
2 Numerator string2 tstring3
3 Denominator string2 tstring4
4 Numerator string3 tstring5
5 Denominator string3 tstring6
6 Numerator string4 tstring7
7 Denominator string4 tstring8
8 Numerator string5 tstring9
9 Denominator string5 tstring10
10 Numerator string6 tstring11
11 Denominator string6 tstring12
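A self-contained sketch of the same idea on a shortened dictionary, in case it helps. It assumes every title spans exactly two rows (one Numerator, one Denominator); if the group sizes vary, a fixed repeat count will not line up and the ffill approach is probably the safer choice.

```python
import numpy as np
import pandas as pd

# Shortened version of the question's dict: 3 titles, 6 rows elsewhere.
d = {
    'title': ['string1', 'string2', 'string3'],
    'Num/Den': ['Numerator', 'Denominator'] * 3,
    'two': ['tstring1', 'tstring2', 'tstring3',
            'tstring4', 'tstring5', 'tstring6'],
}

# np.repeat repeats each element in place: a, a, b, b, c, c
d['title'] = np.repeat(d['title'], 2)

# All columns now have the same length, so the constructor succeeds.
df = pd.DataFrame(d)
print(df)
```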