My data:
Page
www.google/somedata/1514
www.google/somedata/8249984
What I want:
Page TBID
www.google/somedata/1514 1514
www.google/somedata/8249984 8249984
My code:
import pandas as pd
# Initialise the data as a dict of lists.
data = {'Page':['www.google/somedata/1514', 'www.google/somedata/8249984']}
# Create DataFrame
df = pd.DataFrame(data)
# Extract the trailing ID into a new column (this is the line that comes back blank).
df['TBID'] = df['Page'].str.extract(r'(\d*)', expand=True)
# Print the output.
df
The TBID column comes back blank, and I'm not sure why.
Answer 0 (score: 1)
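Use \d+ to match one or more digits, and use expand=False to return a Series. The original pattern \d* also accepts zero digits, so it matches the empty string at the very start of each URL (which begins with "w"), and str.extract returns that empty first match: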
df['TBID'] = df['Page'].str.extract(r'(\d+)', expand=False)
print(df)
Page TBID
0 www.google/somedata/1514 1514
1 www.google/somedata/8249984 8249984
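To see why \d* returns blanks while \d+ works, here is a minimal sketch using only the standard re module rather than pandas. It relies on str.extract taking its groups from the first match of the pattern, which is the same rule re.search follows. The helper first_group and the end-anchored pattern (\d+)$ are illustrative additions, not part of the original answer; the anchored form may be safer if a URL could contain digits earlier in the path.

```python
import re

pages = ['www.google/somedata/1514', 'www.google/somedata/8249984']

def first_group(pattern, text):
    """Return group 1 of the first match, or None if nothing matches (hypothetical helper)."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

# \d* allows zero digits, so the first match is the empty string at position 0.
empty = [first_group(r'(\d*)', p) for p in pages]

# \d+ needs at least one digit, so the search moves on to the first run of digits.
digits = [first_group(r'(\d+)', p) for p in pages]

# Anchoring with $ only accepts digits at the very end of the string.
trailing = [first_group(r'(\d+)$', p) for p in pages]

print(empty)     # ['', '']
print(digits)    # ['1514', '8249984']
print(trailing)  # ['1514', '8249984']
```

The extracted values are strings; if numeric IDs are needed, the column can be converted afterwards, for example with astype(int).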