I'm working on a simple web scraping and DataFrame project. I have a simple 8x1 DataFrame that I'm trying to split into an 8x2 DataFrame. So far, this is what my DataFrame looks like:
from pandas import DataFrame
# data holds the eight scraped team strings shown below
dframe = DataFrame(data, columns=['Active NPGL Teams'], index=[1, 2, 3, 4, 5, 6, 7, 8])
Active NPGL Teams
1 Baltimore Anthem (2015–present)
2 Boston Iron (2014–present)
3 DC Brawlers (2014–present)
4 Los Angeles Reign (2014–present)
5 Miami Surge (2014–present)
6 New York Rhinos (2014–present)
7 Phoenix Rise (2014–present)
8 San Francisco Fire (2014–present)
I want to add a column, "Years Active", and move the "(2014–present)" and "(2015–present)" parts into that "Years Active" column. How do I split the data?
Answer 0 (score: 2)
You can use Series.str.split with a regular expression and expand=True:
dframe['Active NPGL Teams'].str.split(r' (?=\()', expand=True)
                    0               1
1    Baltimore Anthem  (2015–present)
2         Boston Iron  (2014–present)
3         DC Brawlers  (2014–present)
4   Los Angeles Reign  (2014–present)
5         Miami Surge  (2014–present)
6     New York Rhinos  (2014–present)
7        Phoenix Rise  (2014–present)
8  San Francisco Fire  (2014–present)
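The key is the regular expression r' (?=\()', which matches a space only when it is immediately followed by an opening parenthesis (a lookahead assertion). Because the lookahead does not consume the parenthesis, the "(" stays in the second column.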
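To see the lookahead in isolation, here is a minimal sketch using only the standard-library re module on one of the sample strings. The variable names (pattern, team, parts, naive) are illustrative and not part of the original answer.

```python
import re

# Same pattern as in the answer: a space that is immediately followed by "(".
pattern = re.compile(r' (?=\()')

team = 'Los Angeles Reign (2014–present)'

# Only the space before "(" is a split point; the spaces inside the
# team name are not followed by "(" and are therefore left alone.
parts = pattern.split(team)
print(parts)

# For comparison, splitting on every space breaks the team name apart.
naive = team.split(' ')
print(naive)
```

The first call yields two pieces (team name and years), while the naive split yields four.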
Another approach (about 5% slower, but more flexible) is to use Series.str.extract:
dframe['Active NPGL Teams'].str.extract(r'^(?P<Team>.+) (?P<YearsActive>\(.+\))$',
expand=True)
                 Team     YearsActive
1    Baltimore Anthem  (2015–present)
2         Boston Iron  (2014–present)
3         DC Brawlers  (2014–present)
4   Los Angeles Reign  (2014–present)
5         Miami Surge  (2014–present)
6     New York Rhinos  (2014–present)
7        Phoenix Rise  (2014–present)
8  San Francisco Fire  (2014–present)
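To end up with the named two-column frame the question asks for, the split result can simply be given column names. This is a minimal sketch, assuming pandas is installed and using only the first three rows of the sample data; the names teams and parts, and the final parenthesis-stripping step, are illustrative additions rather than part of the original answer.

```python
import pandas as pd

# First three rows of the sample data from the question.
teams = [
    'Baltimore Anthem (2015–present)',
    'Boston Iron (2014–present)',
    'DC Brawlers (2014–present)',
]
dframe = pd.DataFrame(teams, columns=['Active NPGL Teams'], index=[1, 2, 3])

# Split on the space before "(" and expand into two columns.
parts = dframe['Active NPGL Teams'].str.split(r' (?=\()', expand=True)

# Replace the default 0/1 labels with descriptive column names.
parts.columns = ['Team', 'Years Active']
print(parts)

# Optional: drop the surrounding parentheses from the years.
years_plain = parts['Years Active'].str.strip('()')
print(years_plain)
```

Whether to keep the parentheses depends on how the column will be used later; the question's wording keeps them, so the stripping step is shown separately.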