How do I find the date and time in a string using Python's regular expression library?

Asked: 2018-05-21 09:43:30

Tags: python regex python-3.x

Given this text:

start_KA03MM7155_RKMS121MI4-4.21005_NEW_end, 2018-01-02 09:48:23

How can I use Python to extract 2018-01-02 as 020118 and 09:48:23 as 094823, each in its own variable?

2 answers:

Answer 0: (score: 2)

If the date in your string follows the YYYY-MM-DD pattern, this code extracts the date field:

import re
text = 'start_KA03MM7155_RKMS121MI4-4.21005_NEW_end, 2018-01-02 09:48:23'
# Raw string so that \d reaches the regex engine unchanged
result = re.search(r'(\d{4}-\d{2}-\d{2})', text).group(0)
print('result:', result)

result: 2018-01-02
Then you can manipulate the string to get the desired output. In your case:

split_data = result.split('-')  # ['2018', '01', '02']
# last field + middle field + last two digits of the year
date_pattern = split_data[-1] + split_data[-2] + split_data[-3][-2:]
print('date Pattern:', date_pattern)

date Pattern: 020118
With a small change to the regex pattern you can get the time:

time_pattern = re.search(r'(\d{2}:\d{2}:\d{2})', text).group(0).replace(':', '')
print('time_pattern:', time_pattern)

time_pattern: 094823
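As an alternative to slicing the strings by hand, the matched timestamp can be parsed with the standard `datetime` module and reformatted. This is only a sketch: it assumes the date and time are always separated by a single space and that the date is in YYYY-MM-DD order, neither of which the question states explicitly.

```python
import re
from datetime import datetime

text = 'start_KA03MM7155_RKMS121MI4-4.21005_NEW_end, 2018-01-02 09:48:23'

# Grab the whole "YYYY-MM-DD HH:MM:SS" timestamp in one match
# (assumption: exactly one space between date and time).
match = re.search(r'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}', text)
if match is None:
    raise ValueError('no timestamp found in text')

# strptime raises ValueError for impossible values such as month 13.
stamp = datetime.strptime(match.group(0), '%Y-%m-%d %H:%M:%S')

date_pattern = stamp.strftime('%d%m%y')  # day, month, two-digit year
time_pattern = stamp.strftime('%H%M%S')  # hour, minute, second
print(date_pattern, time_pattern)
```

Unlike the pure string approach, this rejects digit runs that look like a date but are not a valid one.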
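Brief explanation:
\d matches a single digit.
\d{4} matches exactly 4 digits.
(\d{4}-\d{2}-\d{2}) captures a group of the form (4 digits)-(2 digits)-(2 digits).
To learn more about regular expressions, see the official Python regular expression documentation.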
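To illustrate how the groups described above can be used directly, here is a minimal sketch with named groups. The function name `extract_parts` and the group names are hypothetical, chosen only for this example, and the pattern assumes a single space between date and time.

```python
import re

# Each (?P<name>...) is a capturing group that can be read back by name.
TIMESTAMP = re.compile(
    r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})'
    r' (?P<hour>\d{2}):(?P<minute>\d{2}):(?P<second>\d{2})'
)

def extract_parts(text):
    """Return (date, time) as compact digit strings, or None if no match."""
    m = TIMESTAMP.search(text)
    if m is None:
        return None
    parts = m.groupdict()
    date_pattern = parts['day'] + parts['month'] + parts['year'][-2:]
    time_pattern = parts['hour'] + parts['minute'] + parts['second']
    return date_pattern, time_pattern

sample = 'start_KA03MM7155_RKMS121MI4-4.21005_NEW_end, 2018-01-02 09:48:23'
print(extract_parts(sample))
```

Returning None instead of raising lets the caller decide how to handle strings that contain no timestamp.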

Answer 1: (score: 1)

A quick way:

# text is the input string from the question
date = re.sub('-', '', re.findall(r'\d{4}-\d{2}-\d{2}', text)[0]) # '20180102'
time = re.sub(':', '', re.findall(r'\d{2}:\d{2}:\d{2}', text)[0]) # '094823'
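Note that the quick version yields the date as 20180102, while the question asks for 020118. One possible way to reorder the fields in the same `re.sub` style is with backreferences. This is a sketch, not part of the original answer, and it assumes the source date is in YYYY-MM-DD order.

```python
import re

a = 'start_KA03MM7155_RKMS121MI4-4.21005_NEW_end, 2018-01-02 09:48:23'

raw_date = re.search(r'\d{4}-\d{2}-\d{2}', a).group(0)
raw_time = re.search(r'\d{2}:\d{2}:\d{2}', a).group(0)

# Skip the century digits, capture year/month/day, then emit day+month+year.
date = re.sub(r'\d{2}(\d{2})-(\d{2})-(\d{2})', r'\3\2\1', raw_date)
time = raw_time.replace(':', '')
print(date, time)
```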