I have a column in a DataFrame that I want to split at the first numeric value. Here is a sample of my data:
col
1 Beb il Gisire, contrata 102
12 Bungemma, territorium 90, 115, 130
13 Territorium Binhise 188
14 Contrata Bir Bahar 205
15 Contrata Bir HaJar 168
16 Bir Kibir, contrata 7
17 Lu Burgu; Suburbium Castri Maris 5, 15, 23, 6...
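Because the number of spaces varies from row to row, I can't simply split on whitespace or on a fixed position. The desired output is: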
1   Beb il Gisire, contrata             102
12  Bungemma, territorium               90, 115, 130
13  Territorium Binhise                 188
14  Contrata Bir Bahar                  205
15  Contrata Bir HaJar                  168
16  Bir Kibir, contrata                 7
17  Lu Burgu; Suburbium Castri Maris    5, 15, 23, 6...
Answer 0 (score: 3)
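Use the regex pattern r'(.*?)(\d.*)' with two capture groups: the lazy first group takes everything before the first digit, and the second group takes everything from that digit onward.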
In [237]: df.col.str.extract(r'(.*?)(\d.*)')
Out[237]:
    0                                   1
1   Beb il Gisire, contrata             102
12  Bungemma, territorium               90, 115, 130
13  Territorium Binhise                 188
14  Contrata Bir Bahar                  205
15  Contrata Bir HaJar                  168
16  Bir Kibir, contrata                 7
17  Lu Burgu; Suburbium Castri Maris    5, 15, 23, 6...
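To see why this pattern splits at the first digit, here is a minimal sketch using only Python's standard `re` module on a few of the sample strings. The helper name `split_at_first_digit` is hypothetical, and both the trailing-whitespace stripping and the fallback for strings without digits are my assumptions (`str.extract` itself would return NaN for a row that does not match).

```python
import re

# Group 1 is lazy, so it stops as soon as group 2 can start at a digit.
PATTERN = re.compile(r'(.*?)(\d.*)')

def split_at_first_digit(text):
    """Return (text before the first digit, text from the first digit on)."""
    match = PATTERN.match(text)
    if match is None:
        # Assumption: a string without digits goes entirely into the first part.
        return text, ''
    # Strip the space that separates the name from the numbers.
    return match.group(1).rstrip(), match.group(2)

samples = [
    'Beb il Gisire, contrata 102',
    'Bungemma, territorium 90, 115, 130',
    'Bir Kibir, contrata 7',
]
for s in samples:
    print(split_at_first_digit(s))
```

Because the first group is lazy, the commas and spaces inside the number list stay in the second part and do not cause further splits.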
Answer 1 (score: 1)
One option is:
df['col1'] = df['col'].str.split(r'(\d)').str[0]
df['col2'] = df['col'].replace(to_replace=r'\b'+df['col1']+r'\b', value='', regex=True)
Output:
   col1                                col2
0  Beb il Gisire, contrata             102
1  Bungemma, territorium               90, 115, 130
2  Territorium Binhise                 188
3  Contrata Bir Bahar                  205
4  Contrata Bir HaJar                  168
5  Bir Kibir, contrata                 7
6  Lu Burgu; Suburbium Castri Maris    5, 15, 23, 6...
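The second line of that answer builds regex patterns from the text in `col1`, which could misbehave if a name contains regex metacharacters. As a hedged alternative, the sketch below gets both columns from a single `str.split` call that splits once, at the whitespace just before the first digit. The small DataFrame is rebuilt from three of the sample rows, and the lookahead pattern is my own choice rather than part of the original answer.

```python
import pandas as pd

# Three rows copied from the question's sample data.
df = pd.DataFrame({'col': [
    'Beb il Gisire, contrata 102',
    'Bungemma, territorium 90, 115, 130',
    'Territorium Binhise 188',
]})

# \s*(?=\d) matches the whitespace directly before a digit without
# consuming the digit; n=1 limits the split to the first such position.
# A multi-character pattern is treated as a regex by str.split.
parts = df['col'].str.split(r'\s*(?=\d)', n=1, expand=True)
df['col1'] = parts[0]
df['col2'] = parts[1]

print(df[['col1', 'col2']])
```

With `n=1` the later numbers in a list such as `90, 115, 130` stay together in `col2`.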