I have this DataFrame:
df:
id . city . state . person1 . P1phone1 . P1phone2 . person2 . P2phone1 . P2phone2
1 . Ghut . TY . Jeff . 32131 . 4324 . Bill . 213123 . 31231
2 . Ffhs . TY . Ron . 32131 . 4324 . Bill . 213123 . 31231
3 . Tyuf . TY . Jeff . 32131 . 4324 . Tyler . 213123 . 31231
I want it to look like this:
df:
id . city . state . person . phone1 . phone2
1 . Ghut . TY . Jeff . 32131 . 4324
2 . Ghut . TY . Bill . 213123 . 31231
3 . Ffhs . TY . Ron . 32131 . 4324
4 . Ffhs . TY . Bill . 213123 . 31231
5 . Tyuf . TY . Jeff . 32131 . 4324
6 . Tyuf . TY . Tyler . 213123 . 31231
I am having trouble getting there. Can anyone help?
Answer 0 (score: 3)
df
id city state person1 P1phone1 P1phone2 person2 P2phone1 P2phone2
1 Ghut TY Jeff 32131 4324 Bill 213123 31231
2 Ffhs TY Ron 32131 4324 Bill 213123 31231
3 Tyuf TY Jeff 32131 4324 Tyler 213123 31231
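import numpy as np
# drop 'id' first (assuming it is a regular column) so exactly six per-person columns remain
df = df.drop(columns='id').set_index(['city', 'state'])
# reuse the first person's three column names for both halves
df.columns = np.tile(df.columns[:3].values, 2)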
Using df.append (removed in pandas 2.0, so this only runs on older versions):
df = df.iloc[:, :3].append(df.iloc[:, 3:]).reset_index()
df
city state person1 P1phone1 P1phone2
0 Ghut TY Jeff 32131 4324
1 Ffhs TY Ron 32131 4324
2 Tyuf TY Jeff 32131 4324
3 Ghut TY Bill 213123 31231
4 Ffhs TY Bill 213123 31231
5 Tyuf TY Tyler 213123 31231
Using pd.concat (an alternative to the df.append line above; run one or the other):
df = pd.concat([df.iloc[:, :3], df.iloc[:, 3:]]).reset_index()
df
city state person1 P1phone1 P1phone2
0 Ghut TY Jeff 32131 4324
1 Ffhs TY Ron 32131 4324
2 Tyuf TY Jeff 32131 4324
3 Ghut TY Bill 213123 31231
4 Ffhs TY Bill 213123 31231
5 Tyuf TY Tyler 213123 31231
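For reference, here is a self-contained sketch of the pd.concat variant that should run on current pandas. The sample frame is rebuilt from the question, and the names `wide` and `stacked` are only illustrative. It assumes `id` is an ordinary column that can be dropped, as in the output shown above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "city": ["Ghut", "Ffhs", "Tyuf"],
    "state": ["TY", "TY", "TY"],
    "person1": ["Jeff", "Ron", "Jeff"],
    "P1phone1": [32131, 32131, 32131],
    "P1phone2": [4324, 4324, 4324],
    "person2": ["Bill", "Bill", "Tyler"],
    "P2phone1": [213123, 213123, 213123],
    "P2phone2": [31231, 31231, 31231],
})

# Move the shared columns into the index so only the six per-person columns remain.
wide = df.drop(columns="id").set_index(["city", "state"])
# Give both halves the first person's three column names.
wide.columns = np.tile(wide.columns[:3].values, 2)
# Stack the two halves vertically, then restore city/state as columns.
stacked = pd.concat([wide.iloc[:, :3], wide.iloc[:, 3:]]).reset_index()
print(stacked)
```

The split is positional (`iloc`), so it relies on each person occupying exactly three adjacent columns.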
Answer 1 (score: 2)
You can slice the DataFrame into two parts:
In [20]: df1 = df[['city','state','person1','P1phone1','P1phone2']]
In [21]: df2 = df[['city','state','person2','P2phone1','P2phone2']]
Then make sure both have the same column names:
In [27]: df1.columns = ['city','state','person','phone1','phone2']
In [28]: df2.columns = ['city','state','person','phone1','phone2']
Then append one to the other:
In [29]: df1.append(df2)
Out[29]:
city state person phone1 phone2
0 Ghut TY Jeff 32131 4324
1 Ffhs TY Ron 32131 4324
2 Tyuf TY Jeff 32131 4324
0 Ghut TY Bill 213123 31231
1 Ffhs TY Bill 213123 31231
2 Tyuf TY Tyler 213123 31231
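The result above keeps the duplicated index 0, 1, 2, 0, 1, 2 and lists everyone's first person before the second, whereas the question asks for each city's two people to sit together with ids renumbered 1 to 6. A possible sketch of that last step, using pd.concat in place of append and a stable sort on the original row labels (the variable names here are illustrative, not from the answer):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "city": ["Ghut", "Ffhs", "Tyuf"],
    "state": ["TY", "TY", "TY"],
    "person1": ["Jeff", "Ron", "Jeff"],
    "P1phone1": [32131, 32131, 32131],
    "P1phone2": [4324, 4324, 4324],
    "person2": ["Bill", "Bill", "Tyler"],
    "P2phone1": [213123, 213123, 213123],
    "P2phone2": [31231, 31231, 31231],
})

new_names = ["city", "state", "person", "phone1", "phone2"]

# Slice one frame per person and give both the same column names.
df1 = df[["city", "state", "person1", "P1phone1", "P1phone2"]].copy()
df2 = df[["city", "state", "person2", "P2phone1", "P2phone2"]].copy()
df1.columns = new_names
df2.columns = new_names

# Concatenate, then stable-sort on the original row label so that
# person 1 and person 2 of the same source row end up adjacent.
result = pd.concat([df1, df2]).sort_index(kind="mergesort").reset_index(drop=True)

# Renumber ids 1..n, as in the desired output.
result.insert(0, "id", list(range(1, len(result) + 1)))
print(result)
```

`kind="mergesort"` is chosen because it is stable, which keeps person 1 ahead of person 2 within each original row.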
Answer 2 (score: 1)
Try this:
pd.concat([df.loc[:, ['id', 'city', 'state', 'person1', 'P1phone1', 'P1phone2']]
             .rename(columns={'person1': 'person', 'P1phone1': 'phone1', 'P1phone2': 'phone2'}),
           df.loc[:, ['id', 'city', 'state', 'person2', 'P2phone1', 'P2phone2']]
             .rename(columns={'person2': 'person', 'P2phone1': 'phone1', 'P2phone2': 'phone2'})],
          axis=0)
Answer 3 (score: 1)
This is a typical pd.wide_to_long problem. Try this:
df=df.rename(columns={'P1phone1':'phone1P1','P1phone2':'phone2P1','P2phone1':'phone1P2','P2phone2':'phone2P2'})
pd.wide_to_long(df,['person','phone1P','phone2P'],i=['id','city','state'],j='age').reset_index().drop(columns='age')
Out[364]:
id city state person phone1P phone2P
0 1 Ghut TY Jeff 32131 4324
1 1 Ghut TY Bill 213123 31231
2 2 Ffhs TY Ron 32131 4324
3 2 Ffhs TY Bill 213123 31231
4 3 Tyuf TY Jeff 32131 4324
5 3 Tyuf TY Tyler 213123 31231
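A self-contained sketch of the same idea for current pandas follows. The suffix column is named `slot` here instead of `age` (a made-up name, it only holds the person number), the explicit sort is an addition to make the row order deterministic, and the final rename to `phone1`/`phone2` is an extra step to match the column names requested in the question.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "city": ["Ghut", "Ffhs", "Tyuf"],
    "state": ["TY", "TY", "TY"],
    "person1": ["Jeff", "Ron", "Jeff"],
    "P1phone1": [32131, 32131, 32131],
    "P1phone2": [4324, 4324, 4324],
    "person2": ["Bill", "Bill", "Tyler"],
    "P2phone1": [213123, 213123, 213123],
    "P2phone2": [31231, 31231, 31231],
})

# wide_to_long expects "<stub><number>", so move the person number to the end.
renamed = df.rename(columns={
    "P1phone1": "phone1P1", "P1phone2": "phone2P1",
    "P2phone1": "phone1P2", "P2phone2": "phone2P2",
})

long_df = pd.wide_to_long(
    renamed,
    stubnames=["person", "phone1P", "phone2P"],
    i=["id", "city", "state"],
    j="slot",  # receives the numeric suffix (1 or 2)
)

long_df = (
    long_df.reset_index()
    .sort_values(["id", "slot"])   # keep each id's two people together
    .drop(columns="slot")
    .rename(columns={"phone1P": "phone1", "phone2P": "phone2"})
    .reset_index(drop=True)
)
print(long_df)
```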
Answer 4 (score: 1)
I was too slow on this one, but here is my solution:
import pandas as pd
import numpy as np
#define column names for the result df.
# another way to ensure the same column names: colnames = df.columns.values[:6]
colnames = ['identity', 'city', 'state', 'person', 'phone1', 'phone2']
#split the original df into two.
df1 = df.iloc[:, :6]
df2 = pd.concat([df.iloc[:, :3], df.iloc[:, 6:]], axis=1)
#reset the column names so that both dfs have same colnames
df1.columns, df2.columns = colnames, colnames
#concatenate the two
result = pd.concat([df1, df2]).reset_index(drop=True)
identity city state person phone1 phone2
0 1 Ghut TY Jeff 32131 4324
1 2 Ffhs TY Ron 32131 4324
2 3 Tyuf TY Jeff 32131 4324
3 1 Ghut TY Bill 213123 31231
4 2 Ffhs TY Bill 213123 31231
5 3 Tyuf TY Tyler 213123 31231
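If there were more than two people per row, the same split-rename-concat pattern could be wrapped in a loop. The helper below is a hypothetical sketch, not part of any answer: the name `stack_people` and its parameters are made up, and it assumes the columns follow the `personK`, `PKphone1`, `PKphone2` naming used in the question.

```python
import pandas as pd


def stack_people(df, shared, n_people):
    """Stack per-person column blocks vertically (hypothetical helper)."""
    blocks = []
    for k in range(1, n_people + 1):
        # Map this person's columns onto the common names.
        cols = {
            f"person{k}": "person",
            f"P{k}phone1": "phone1",
            f"P{k}phone2": "phone2",
        }
        blocks.append(df[shared + list(cols)].rename(columns=cols))
    # ignore_index gives a clean 0..n-1 index on the result.
    return pd.concat(blocks, ignore_index=True)


# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "city": ["Ghut", "Ffhs", "Tyuf"],
    "state": ["TY", "TY", "TY"],
    "person1": ["Jeff", "Ron", "Jeff"],
    "P1phone1": [32131, 32131, 32131],
    "P1phone2": [4324, 4324, 4324],
    "person2": ["Bill", "Bill", "Tyler"],
    "P2phone1": [213123, 213123, 213123],
    "P2phone2": [31231, 31231, 31231],
})

result = stack_people(df, shared=["id", "city", "state"], n_people=2)
print(result)
```

Selecting columns by name rather than by position means the helper does not depend on the column order of the input frame.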