If I read in the following data, the repeated column names come back with a .1 or .2 suffix. The data:
import io
import pandas as pd
dfff=io.StringIO("""address,phone,name,website,type,address,phone,name,website,type,address,phone,name,type
123 APPLE STREET,555-5555,APPLE STORE,APPLE.COM,BUSINESS,456 peach ave,777-7777,PEACH STORE,PEACH.COM,BUSINESS,789 banana rd,999-9999,banana store,BUSINESS""")
dfff.seek(0)
newdf2=pd.read_csv(dfff)
This is the output; pandas renamed the duplicate column names by appending .1 or .2.
newdf2
# address phone name website type address.1 phone.1 name.1 website.1 type.1 address.2 phone.2 name.2 type.2
#0 123 APPLE STREET 555-5555 APPLE STORE APPLE.COM BUSINESS 456 peach ave 777-7777 PEACH STORE PEACH.COM BUSINESS 789 banana rd 999-9999 banana store BUSINESS
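How can I fold each repeated group of address columns into its own row to get this output (there is no website.2, so that value would be NaN, 0, or blank):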
# address phone name website type
#0 123 APPLE STREET 555-5555 APPLE STORE APPLE.COM BUSINESS
#1 456 peach ave 777-7777 PEACH STORE PEACH.COM BUSINESS
#2 789 banana rd 999-9999 banana store NaN BUSINESS
Right now I don't really know where to start. I tried stacking the data, which works as expected, but unstacking just brings back the original data:
newdf2.stack().to_frame()
# 0
#0 address 123 APPLE STREET
# phone 555-5555
# name APPLE STORE
# website APPLE.COM
# type BUSINESS
# address.1 456 peach ave
# phone.1 777-7777
# name.1 PEACH STORE
# website.1 PEACH.COM
# type.1 BUSINESS
# address.2 789 banana rd
# phone.2 999-9999
# name.2 banana store
# type.2 BUSINESS
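I'm thinking there must be a way to stack, strip the .1/.2 suffix from the column names, and then unstack into the format I want? Or maybe there is another approach?
Answer 0 (score: 1)
You can use wide_to_long.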
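df = newdf2.copy()  # start from the frame read in the question
# give the unsuffixed columns an explicit ".0" so every column has the form stub.N
df.columns = [f'{x}.0' if '.' not in x else x for x in df.columns]
df['id'] = df.index  # wide_to_long needs a column that identifies each original row
df = pd.wide_to_long(df, stubnames=['address', 'phone', 'name', 'website', 'type'], i='id', j='row', sep='.')
df.reset_index(drop=True)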
Out[1]:
address phone name website type
0 123 APPLE STREET 555-5555 APPLE STORE APPLE.COM BUSINESS
1 456 peach ave 777-7777 PEACH STORE PEACH.COM BUSINESS
2 789 banana rd 999-9999 banana store NaN BUSINESS
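If you would rather not use wide_to_long, here is a rough sketch of another way to do it: strip the numeric suffix from the column names, slice the frame once per suffix, and concatenate the slices. The names `wide`, `stubs`, `groups`, `pieces`, and `long_df` are only illustrative, and the sketch assumes that every duplicated column ends in a `.N` suffix and that no real column name ends that way.

```python
import io
import pandas as pd

csv_text = """address,phone,name,website,type,address,phone,name,website,type,address,phone,name,type
123 APPLE STREET,555-5555,APPLE STORE,APPLE.COM,BUSINESS,456 peach ave,777-7777,PEACH STORE,PEACH.COM,BUSINESS,789 banana rd,999-9999,banana store,BUSINESS"""
wide = pd.read_csv(io.StringIO(csv_text))

# Base name of each column: 'address.1' -> 'address' (assumes a trailing ".N" is always a pandas suffix)
stubs = wide.columns.str.replace(r"\.\d+$", "", regex=True)
# Group number of each column: unsuffixed columns become group 0
groups = wide.columns.str.extract(r"\.(\d+)$", expand=False).fillna("0").astype(int)

pieces = []
for g in sorted(set(groups)):
    mask = groups == g
    # Take the columns of this group and rename them to their base names
    pieces.append(wide.loc[:, mask].set_axis(stubs[mask], axis=1))

# Columns missing from a group (website in the last one) are filled with NaN
long_df = pd.concat(pieces, ignore_index=True)
print(long_df)
```

This keeps the column order of the first group and does not need the list of stub names written out by hand, at the cost of a small explicit loop.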