Question

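I'm trying to split/extract part of the "Time" column so that it shows only hours and minutes, e.g. 18:15 rather than 18:15:34.

I've seen many examples online that use the .str.split() function with the colon as the delimiter. But that splits the "Time" column into three columns: hours, minutes, seconds.

Input dataframe: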

df =

Index   Time
0       18:15:21
1       19:15:21
2       20:15:21
3       21:15:21
4       22:15:21

Output dataframe:

df =

Index   Time
0       18:15
1       19:15
2       20:15
3       21:15
4       22:15

Thanks :)

Answer 1

You can use a regular expression:

df.Time.str.replace(r':\d\d$', '', regex=True)

Or a reverse split:

df.Time.str.rsplit(':', n=1).str[0]
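
For anyone who wants to try both approaches end to end, here is a minimal sketch. It assumes a recent pandas and that Time holds plain strings in HH:MM:SS form; the sample frame simply mirrors the question's data, and the names by_regex and by_rsplit are made up for this illustration.

```python
import pandas as pd

# Sample data mirroring the question; the Time values are plain strings.
df = pd.DataFrame(
    {"Time": ["18:15:21", "19:15:21", "20:15:21", "21:15:21", "22:15:21"]}
)

# Regex approach: drop a trailing ":SS". regex=True is passed explicitly
# because pandas 2.0 made literal (non-regex) matching the default.
by_regex = df["Time"].str.replace(r":\d\d$", "", regex=True)

# Reverse split: split once from the right and keep the left-hand piece.
by_rsplit = df["Time"].str.rsplit(":", n=1).str[0]

print(by_regex.tolist())
print(by_rsplit.tolist())
```

Both calls return a new Series, so assign the result back (df["Time"] = ...) if you want to overwrite the column.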

Answer 2

You can use:

df['Time'].apply(lambda x : ':'.join(x.split(':')[0:2]))
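
To make the lambda easier to follow, here is a small sketch on sample data matching the question. The helper name hours_minutes is hypothetical, and the vectorized variant is only one possible equivalent built from the .str accessor, not something the answer itself proposes.

```python
import pandas as pd

df = pd.DataFrame({"Time": ["18:15:21", "19:15:21", "20:15:21"]})


def hours_minutes(value: str) -> str:
    """Keep the first two colon-separated fields (hypothetical helper)."""
    return ":".join(value.split(":")[:2])


# Element-wise version, equivalent to the lambda in the answer.
via_apply = df["Time"].apply(hours_minutes)

# Vectorized version: split into lists, slice each list, join it back.
via_str = df["Time"].str.split(":").str[:2].str.join(":")

print(via_apply.tolist())
print(via_str.tolist())
```

The apply version raises on missing values (a float NaN has no split method), while the .str version propagates them, which may matter if the column is not fully populated.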

Answer 3

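Among the pandas.Series.str methods, replace, extract, or split are all reasonable choices.

First of all, note that this is only a case-specific solution.

The solution below strips the trailing colon and the last two digits from the Time column.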

>>> df['Time'] = df['Time'].str.replace(r':\d{2}$', '', regex=True)
>>> df
    Time
0  18:15
1  19:15
2  20:15
3  21:15
4  22:15

A second approach uses str.extract with a regular expression.

>>> df['Time'] = df['Time'].str.extract(r'(\d{2}:\d{2})', expand=False)
>>> df
    Time
0  18:15
1  19:15
2  20:15
3  21:15
4  22:15
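
One detail worth knowing about str.extract is what it returns. The sketch below uses a made-up Series (including one deliberately malformed value) to show that the default call gives a one-column DataFrame, while expand=False gives a Series; the variable names are assumptions for illustration only.

```python
import pandas as pd

# Two valid times plus one malformed value to show the no-match case.
times = pd.Series(["18:15:21", "19:15:21", "not a time"], name="Time")

# Default (expand=True): one capture group yields a one-column DataFrame.
as_frame = times.str.extract(r"(\d{2}:\d{2})")

# expand=False: one capture group yields a Series instead.
as_series = times.str.extract(r"(\d{2}:\d{2})", expand=False)

print(type(as_frame).__name__, type(as_series).__name__)
print(as_series.tolist())  # the non-matching row comes back as missing
```

Rows that do not match the pattern become missing values rather than raising, so extract doubles as a rough format check.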

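\d{2} matches the first two digits (the hour)

: matches the colon immediately after them

\d{2} matches the next two digits, which follow the colon (the minutes)

$ asserts position at the end of the string; it is used in the replace pattern, so that only the trailing :SS is removed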
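
The pattern pieces can be checked outside pandas with Python's re module, which the pandas string methods rely on for regex work. This is just a sketch on a single sample value; the variable names are made up.

```python
import re

sample = "18:15:21"  # one value in the question's HH:MM:SS format

# Unanchored capture: the first "two digits, colon, two digits" run wins.
match = re.search(r"(\d{2}:\d{2})", sample)
hours_minutes = match.group(1)

# Anchored removal: "$" pins ":\d{2}" to the end, so only the seconds go.
without_seconds = re.sub(r":\d{2}$", "", sample)

# Without "$", re.sub removes every ":dd" group, minutes included.
unanchored = re.sub(r":\d{2}", "", sample)

print(hours_minutes, without_seconds, unanchored)
```

The last line is the reason the $ anchor matters in the replace approach: dropping it would leave only the hour.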

Split/extract part of a column in a dataframe - python
