Question

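I have a string that comes in one of three forms:

XhYmZs or YmZs or Zs

where h, m, and s stand for hours, minutes, and seconds, and X, Y, and Z are the corresponding values.

How can I efficiently convert these strings to seconds in Python 2.7?

I figured I could do something like: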

s="XhYmZs"
if "h" in s:
    hours=s.split("h")
elif "m" in s:
    mins=s.split("m")[0][-1]

...but that doesn't seem very efficient to me :(

Answer 1

Split on the delimiters you care about, then parse each resulting element as an integer and multiply as needed:

import re
def hms(s):
    l = list(map(int, re.split('[hms]', s)[:-1]))
    if len(l) == 3:
        return l[0]*3600 + l[1]*60 + l[2]
    elif len(l) == 2:
        return l[0]*60 + l[1]
    else:
        return l[0]

This produces a duration normalized to seconds.

>>> hms('3h4m5s')
11045
>>> 3*3600+4*60+5
11045
>>> hms('70m5s')
4205
>>> 70*60+5
4205
>>> hms('300s')
300

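You can also turn this into a one-liner by reversing the re.split() result and multiplying each element by 60 raised to a power that grows with the element's position in the list: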

def hms2(s):
    return sum(int(x)*60**i for i,x in enumerate(re.split('[hms]', s)[-2::-1]))
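To make the slicing in that one-liner easier to follow, here is a small sketch that prints the intermediate lists. It restates hms2 from the answer with the steps pulled apart; nothing new is assumed beyond the standard re module.

```python
import re

def hms2(s):
    # re.split leaves a trailing empty string after the final 's';
    # [-2::-1] skips it and walks backwards, so seconds come first.
    parts = re.split('[hms]', s)[-2::-1]
    # Position 0 is seconds (60**0), 1 is minutes (60**1), 2 is hours (60**2).
    return sum(int(x) * 60 ** i for i, x in enumerate(parts))

print(re.split('[hms]', '3h4m5s'))          # ['3', '4', '5', '']
print(re.split('[hms]', '3h4m5s')[-2::-1])  # ['5', '4', '3']
print(hms2('3h4m5s'))                       # 11045
```

Because the weights depend only on position, the same expression handles all three forms without branching on the list length.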

Answer 2

I don't know how efficient this is, but this is how I'd do it:

import re

test_data = [
    '1h2m3s',
    '1m2s',
    '1s',
    '3s1h2m',
]


# Raw strings keep \d from being treated as a string escape.
HMS_REGEX = re.compile(r'^(\d+)h(\d+)m(\d+)s$')
MS_REGEX = re.compile(r'^(\d+)m(\d+)s$')
S_REGEX = re.compile(r'^(\d+)s$')


def total_seconds(hms_string):
    found = HMS_REGEX.match(hms_string)
    if found:
        return 3600 * int(found.group(1)) + \
               60 * int(found.group(2)) + \
               int(found.group(3))

    found = MS_REGEX.match(hms_string)
    if found:
        return 60 * int(found.group(1)) + int(found.group(2))

    found = S_REGEX.match(hms_string)
    if found:
        return int(found.group(1))

    raise ValueError('Could not convert ' + hms_string)

for datum in test_data:
    try:
        print(total_seconds(datum))
    except ValueError as exc:
        print(exc)

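Or, to golf it down in the spirit of TigerhawkT3's one-liner, use a single match and a sum, while keeping the error check for strings that don't match: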

HMS_REGEX = re.compile(r'^(\d+)h(\d+)m(\d+)s$|^(\d+)m(\d+)s$|^(\d+)s$')

def total_seconds(hms_string):
    found = HMS_REGEX.match(hms_string)
    if found:
        # Only the groups of the alternative that matched are not None.
        return sum(
            int(x or 0) * 60 ** i for i, x in enumerate(
                (y for y in reversed(found.groups()) if y is not None)))

    raise ValueError('Could not convert ' + hms_string)
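As a variation that is not from the answer itself, the three alternatives can probably be folded into one pattern with nested optional groups, so the hours part is only allowed when a minutes part follows. The names HMS_OPTIONAL and total_seconds_optional are made up for this sketch.

```python
import re

# Hours are optional, but only inside the optional minutes group, so the
# pattern accepts exactly XhYmZs, YmZs and Zs (not, say, XhZs).
HMS_OPTIONAL = re.compile(r'^(?:(?:(\d+)h)?(\d+)m)?(\d+)s$')

def total_seconds_optional(hms_string):
    found = HMS_OPTIONAL.match(hms_string)
    if not found:
        raise ValueError('Could not convert ' + hms_string)
    # Groups that did not take part in the match are None; treat them as 0.
    hours, minutes, seconds = (int(g or 0) for g in found.groups())
    return 3600 * hours + 60 * minutes + seconds

print(total_seconds_optional('1h2m3s'))  # 3723
print(total_seconds_optional('1m2s'))    # 62
print(total_seconds_optional('1s'))      # 1
```

The groups keep fixed positions here, so there is no need to filter out None values before weighting them.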

Answer 3

>>> import datetime
>>> datetime.datetime.strptime('3h4m5s', '%Hh%Mm%Ss').time()
datetime.time(3, 4, 5)

Since the fields present in the string vary, you may need to build a matching format string.

>>> def parse(s):
...   fmt=''.join('%'+c.upper()+c for c in 'hms' if c in s)
...   return datetime.datetime.strptime(s, fmt).time()

The datetime module is the standard-library way to handle time.
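The parse function above returns a datetime.time rather than a number of seconds. A minimal sketch of the extra step, assuming a helper name parse_seconds that is not part of the answer:

```python
import datetime

def parse_seconds(s):
    # Build a format such as '%Hh%Mm%Ss' from the unit letters present.
    fmt = ''.join('%' + c.upper() + c for c in 'hms' if c in s)
    t = datetime.datetime.strptime(s, fmt)
    # Re-express the clock fields as a duration and return whole seconds.
    delta = datetime.timedelta(hours=t.hour, minutes=t.minute, seconds=t.second)
    return int(delta.total_seconds())

print(parse_seconds('3h4m5s'))  # 11045
print(parse_seconds('4m5s'))    # 245
```

One caveat with this route: strptime enforces clock ranges, so an input like '70m5s' raises ValueError instead of returning 4205.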

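Asking to do this "efficiently" is a bit of a fool's errand. String parsing in an interpreted language isn't fast; aim for clarity instead. Besides, merely looking efficient doesn't mean much: either analyze the algorithm or benchmark it, otherwise you're just guessing.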
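For the benchmarking route, a rough sketch using the standard timeit module might look like this. The two candidates, hms_regex and hms_loop, are stand-ins written for this example, and the timings will differ from machine to machine.

```python
import re
import timeit

def hms_regex(s):
    # Candidate 1: split on the unit letters, weight by position.
    parts = re.split('[hms]', s)[-2::-1]
    return sum(int(x) * 60 ** i for i, x in enumerate(parts))

def hms_loop(s):
    # Candidate 2: accumulate digits, flush them on each unit letter.
    bases = {'h': 3600, 'm': 60, 's': 1}
    secs = num = 0
    for c in s:
        if c.isdigit():
            num = num * 10 + int(c)
        else:
            secs += bases[c] * num
            num = 0
    return secs

for fn in (hms_regex, hms_loop):
    elapsed = timeit.timeit(lambda: fn('6h7m8s'), number=100000)
    print('%s: %.3f s per 100000 calls' % (fn.__name__, elapsed))
```

Checking that both candidates return the same value before timing them is worth the extra line; a fast wrong answer is not an improvement.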

Answer 4

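My fellow Pythonistas, please stop using regular expressions for everything. A simple task like this doesn't need a regex. Python is considered a slow language not because of the GIL or the interpreter, but because of this kind of misuse.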

In [1]: import re
   ...: def hms(s):
   ...:     l = list(map(int, re.split('[hms]', s)[:-1]))
   ...:     if len(l) == 3:
   ...:         return l[0]*3600 + l[1]*60 + l[2]
   ...:     elif len(l) == 2:
   ...:         return l[0]*60 + l[1]
   ...:     else:
   ...:         return l[0]

In [2]: %timeit hms("6h7m8s")
5.62 µs ± 722 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [6]: def ehms(s):
   ...:    bases=dict(h=3600, m=60, s=1)
   ...:    secs = 0
   ...:    num = 0
   ...:    for c in s:
   ...:        if c.isdigit():
   ...:            num = num * 10 + int(c)
   ...:        else:
   ...:            secs += bases[c] * num
   ...:            num = 0
   ...:    return secs

In [7]: %timeit ehms("6h7m8s")
2.07 µs ± 70.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [8]: %timeit hms("8s")
2.35 µs ± 124 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [9]: %timeit ehms("8s")
1.06 µs ± 118 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [10]: bases=dict(h=3600, m=60, s=1)

In [15]: a = ord('0')  # offset of the digit characters, so ord(c) - a is the digit's value

In [16]: def eehms(s):
    ...:    secs = 0
    ...:    num = 0
    ...:    for c in s:
    ...:        if c.isdigit():
    ...:            num = num * 10 + ord(c) - a
    ...:        else:
    ...:            secs += bases[c] * num
    ...:            num = 0
    ...:    return secs

In [17]: %timeit eehms("6h7m8s")
1.45 µs ± 30 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

As you can see, it's almost four times faster.
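One thing to keep in mind with the hand-rolled loop: it raises KeyError on an unknown unit letter and silently drops trailing digits that have no unit. A sketch of a stricter variant, with the hypothetical name ehms_checked, could look like this (it still does not enforce the h, m, s order):

```python
BASES = {'h': 3600, 'm': 60, 's': 1}

def ehms_checked(s):
    secs = 0
    num = None  # None means no digits seen since the last unit letter
    for c in s:
        if c.isdigit():
            num = (num or 0) * 10 + int(c)
        elif c in BASES and num is not None:
            secs += BASES[c] * num
            num = None
        else:
            # Unknown character, or a unit letter with no number before it.
            raise ValueError('Could not convert ' + s)
    if num is not None or not s:
        # Trailing digits without a unit, or an empty string.
        raise ValueError('Could not convert ' + s)
    return secs

print(ehms_checked('6h7m8s'))  # 22028
```

The extra checks cost a little time per character, so it is worth re-running the timings before assuming the speedup carries over.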

Answer 5

There is a library, python-dateutil (pip install python-dateutil), that takes a string and returns a datetime.datetime.

It can parse values such as 5h 30m, 0.5h 30m, and 0.5h, with or without spaces.

from datetime import datetime
from dateutil import parser


time = '5h15m50s'
midnight_plus_time = parser.parse(time)
midnight: datetime = datetime.combine(datetime.today(), datetime.min.time())
timedelta = midnight_plus_time - midnight
print(timedelta.seconds)  # 18950

It can't parse more than 24 hours at a time.
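If durations past 24 hours matter, one possible workaround is to skip dateutil and build a datetime.timedelta directly. This is only a sketch using the standard library; to_timedelta and UNIT_KEYS are names invented for the example.

```python
import re
from datetime import timedelta

UNIT_KEYS = {'h': 'hours', 'm': 'minutes', 's': 'seconds'}

def to_timedelta(s):
    # Collect (number, unit) pairs such as ('30', 'h').
    pairs = re.findall(r'(\d+)([hms])', s)
    # Reject input with leftover characters that findall skipped over.
    if not pairs or ''.join(n + u for n, u in pairs) != s:
        raise ValueError('Could not convert ' + s)
    kwargs = {}
    for number, unit in pairs:
        key = UNIT_KEYS[unit]
        kwargs[key] = kwargs.get(key, 0) + int(number)
    return timedelta(**kwargs)

print(to_timedelta('30h15m50s').total_seconds())  # 108950.0
```

Note the use of total_seconds() rather than the .seconds attribute: .seconds leaves out whole days, so it would under-report anything of 24 hours or more.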

Convert the time string XhYmZs to seconds in Python
