My DataFrame:
Name Percent Subject1 Subject2
ramesh 85 Maths Science
ram 42 Maths
Raj 85 NaN Science
Desired output DataFrame:
Name Percent Subject
ramesh 85 Maths
ramesh 85 Science
ram 42 Maths
Raj 85 Science
Answer 0 (score: 2)
pd.lreshape can combine the values from several columns into a single column:
import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'Name': ['ramesh', 'ram', 'Raj'],
                   'Percent': [85, 42, 85],
                   'Subject1': ['Maths', 'Maths', nan],
                   'Subject2': ['Science', nan, 'Science']})
print(pd.lreshape(df, {'Subject':['Subject1', 'Subject2']}))
yields
Name Percent Subject
0 ramesh 85 Maths
1 ram 42 Maths
2 ramesh 85 Science
3 Raj 85 Science
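The output has four rows rather than six because lreshape drops rows with a missing value by default (its dropna parameter defaults to True). As a minimal sketch, assuming the same df as above, passing dropna=False should keep those rows; the variable name full is only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['ramesh', 'ram', 'Raj'],
                   'Percent': [85, 42, 85],
                   'Subject1': ['Maths', 'Maths', np.nan],
                   'Subject2': ['Science', np.nan, 'Science']})

# dropna=False keeps the rows whose Subject value is missing
full = pd.lreshape(df, {'Subject': ['Subject1', 'Subject2']}, dropna=False)
print(full)
```

The rows with a missing Subject can still be removed afterwards with full.dropna(subset=['Subject']).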
pd.lreshape does not appear to be documented in the online docs (yet?). Here is its docstring:
In [40]: help(pd.lreshape)
Help on function lreshape in module pandas.core.reshape:
lreshape(data, groups, dropna=True, label=None)
Reshape long-format data to wide. Generalized inverse of DataFrame.pivot
Parameters
----------
data : DataFrame
groups : dict
{new_name : list_of_columns}
dropna : boolean, default True
Examples
--------
>>> import pandas as pd
>>> data = pd.DataFrame({'hr1': [514, 573], 'hr2': [545, 526],
... 'team': ['Red Sox', 'Yankees'],
... 'year1': [2007, 2008], 'year2': [2008, 2008]})
>>> data
hr1 hr2 team year1 year2
0 514 545 Red Sox 2007 2008
1 573 526 Yankees 2007 2008
>>> pd.lreshape(data, {'year': ['year1', 'year2'], 'hr': ['hr1', 'hr2']})
team hr year
0 Red Sox 514 2007
1 Yankees 573 2007
2 Red Sox 545 2008
3 Yankees 526 2008
Returns
-------
reshaped : DataFrame
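If relying on a sparsely documented function is a concern, roughly the same reshaping can be done with DataFrame.melt. This is a sketch of an alternative, not part of the lreshape approach itself; the name long_df is illustrative, and 'variable' is the default name melt gives to the column that records the source column:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['ramesh', 'ram', 'Raj'],
                   'Percent': [85, 42, 85],
                   'Subject1': ['Maths', 'Maths', np.nan],
                   'Subject2': ['Science', np.nan, 'Science']})

long_df = (
    df.melt(id_vars=['Name', 'Percent'],
            value_vars=['Subject1', 'Subject2'],
            value_name='Subject')        # stack Subject1/Subject2 into one column
      .dropna(subset=['Subject'])        # drop rows that had no subject
      .drop(columns='variable')          # discard the source-column label
      .reset_index(drop=True)
)
print(long_df)
```

Like lreshape, this lists all Subject1 rows first and then the Subject2 rows, so rows for the same Name are not adjacent.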
Answer 1 (score: 0)
You can use stack and reset_index, then rename column 0:
print(df.set_index(['Name', 'Percent']).stack().reset_index(level=[0, 1])
        .reset_index(drop=True).rename(columns={0: 'Subject'}))
Name Percent Subject
0 ramesh 85 Maths
1 ramesh 85 Science
2 ram 42 Maths
3 Raj 85 Science
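To make the chain easier to follow, here is a sketch that unpacks it into steps, using the df from the question. The intermediate names (stacked, flat, result) are illustrative. The explicit dropna() is an assumption on my part: newer pandas versions may keep missing values when stacking, so dropping them explicitly should give the same result either way.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Name': ['ramesh', 'ram', 'Raj'],
                   'Percent': [85, 42, 85],
                   'Subject1': ['Maths', 'Maths', np.nan],
                   'Subject2': ['Science', np.nan, 'Science']})

# 1. Move the identifying columns into the index, then stack the
#    remaining Subject1/Subject2 columns into one Series.
stacked = df.set_index(['Name', 'Percent']).stack().dropna()

# 2. Turn the Name and Percent index levels back into columns;
#    the unnamed Series values land in a column called 0.
flat = stacked.reset_index(level=[0, 1])

# 3. Discard the leftover Subject1/Subject2 index and name the value column.
result = flat.reset_index(drop=True).rename(columns={0: 'Subject'})
print(result)
```

Because stacking works row by row, both of ramesh's subjects end up next to each other, which matches the row order asked for in the question.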