I have a file with the following content:
01009700 Samsung Samsung SGH-N625 GSM 1900,GSM 900
01009800 Motorola Motorola T194 EOTD GSM 1900
01009900 Option International
,GSM 900
01009901 Option International
,GSM 1900,GSM 900 01009902 Option International ,GSM 1900,GSM 900 01009903 Option International ,GSM 1900,GSM 900 01009904 Option International ,GSM 1900,GSM 900 01009905 Option International ,GSM 1900,GSM 900 01009906 Option International ,GSM 1900,GSM 900 01009907 Option International ,GSM 1900,GSM 900 01009908 Option International ,GSM 1900,GSM 900 01009909 Option International ,GSM 1900,GSM 900 01009910 Option International ,GSM 1900,GSM 900 01009911 Option International ,GSM 1900,GSM 900 01009912 Option International ,GSM 1900,GSM 900 01009913 Option International ,GSM 1900,GSM 900 01009914 Option International ,GSM 1900,GSM 900 01009915 Option International ,GSM 1900,GSM 900 01009916 Option International ,GSM 1900,GSM 900 01009917 Option International ,GSM 1900,GSM 900 01009918 Option International ,GSM 1900,GSM 900 01009919 Option International ,GSM 1900,GSM 900
Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800
01010000 Sierra Wireless Sierra Wireless Aircard 710 GSM 1900
01010100 Sierra Wireless Sierra Wireless Aircard 750 GSM 1800,GSM 190
0,GSM 900
I am using a regex to try to extract everything from the 8-digit number up to, but not including, the first occurrence of GSM, for example:
01009700 Samsung Samsung SGH-N625
01009800 Motorola Motorola T194 EOTD
01009900 Option International
01009902 Option International
01009919 Option International
01010000 Sierra Wireless Sierra Wireless Aircard
01010100 Sierra Wireless Sierra Wireless Aircard
I tried \d{8}.+(GSM)?, but it doesn't seem to work.
What is the correct regex?
Answer 0 (score: 4)
You can use
re.findall(r'\b(\d{8}.*?)\W*GSM', s)
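See the regex demo.
Details
\b - a word boundary
(\d{8}.*?) - Group 1: eight digits, then any zero or more characters other than line break characters, as few as possible
\W* - any zero or more non-word characters (this also covers the line break and comma that sometimes precede GSM)
GSM - a GSM substring.

import re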
s="""01009700 Samsung Samsung SGH-N625 GSM 1900,GSM 900
01009800 Motorola Motorola T194 EOTD GSM 1900
01009900 Option International
,GSM 900
01009901 Option International
,GSM 1900,GSM 900 01009902 Option International ,GSM 1900,GSM 900 01009903 Option International ,GSM 1900,GSM 900 01009904 Option International ,GSM 1900,GSM 900 01009905 Option International ,GSM 1900,GSM 900 01009906 Option International ,GSM 1900,GSM 900 01009907 Option International ,GSM 1900,GSM 900 01009908 Option International ,GSM 1900,GSM 900 01009909 Option International ,GSM 1900,GSM 900 01009910 Option International ,GSM 1900,GSM 900 01009911 Option International ,GSM 1900,GSM 900 01009912 Option International ,GSM 1900,GSM 900 01009913 Option International ,GSM 1900,GSM 900 01009914 Option International ,GSM 1900,GSM 900 01009915 Option International ,GSM 1900,GSM 900 01009916 Option International ,GSM 1900,GSM 900 01009917 Option International ,GSM 1900,GSM 900 01009918 Option International ,GSM 1900,GSM 900 01009919 Option International ,GSM 1900,GSM 900
Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800 Option Internati. Globetrotter GSM 1800
01010000 Sierra Wireless Sierra Wireless Aircard 710 GSM 1900
01010100 Sierra Wireless Sierra Wireless Aircard 750 GSM 1800,GSM 190
0,GSM 900 """
print(re.findall(r"\b(\d{8}.*?)\W*GSM", s))
Output:
['01009700 Samsung Samsung SGH-N625', '01009800 Motorola Motorola T194 EOTD', '01009900 Option International', '01009901 Option International', '01009902 Option International', '01009903 Option International', '01009904 Option International', '01009905 Option International', '01009906 Option International', '01009907 Option International', '01009908 Option International', '01009909 Option International', '01009910 Option International', '01009911 Option International', '01009912 Option International', '01009913 Option International', '01009914 Option International', '01009915 Option International', '01009916 Option International', '01009917 Option International', '01009918 Option International', '01009919 Option International', '01010000 Sierra Wireless Sierra Wireless Aircard 710', '01010100 Sierra Wireless Sierra Wireless Aircard 750']
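To see why the pattern from the question fails, it may help to run both patterns on a small sample. The sketch below uses two records copied from the data above; the variable names `sample`, `attempt`, and `fixed` are only illustrative. The greedy `.+` consumes the rest of each line, so the optional `(GSM)?` ends up matching nothing, while the lazy `.*?` followed by a required `GSM` stops at the first occurrence.

```python
import re

# Two records taken from the question's data; the second one is split across lines.
sample = (
    "01009700 Samsung Samsung SGH-N625 GSM 1900,GSM 900\n"
    "01009900 Option International\n"
    ",GSM 900\n"
)

# Pattern from the question: greedy .+ runs to the end of the line,
# and the optional (GSM)? is then free to match the empty string.
attempt = [m.group(0) for m in re.finditer(r"\d{8}.+(GSM)?", sample)]

# Pattern from the answer: lazy .*? stops as soon as \W*GSM can match,
# and \W* absorbs the space, newline, or comma before GSM.
fixed = re.findall(r"\b(\d{8}.*?)\W*GSM", sample)

print(attempt)
# ['01009700 Samsung Samsung SGH-N625 GSM 1900,GSM 900', '01009900 Option International']
print(fixed)
# ['01009700 Samsung Samsung SGH-N625', '01009900 Option International']
```

Note that the second record only looks correct with the question's pattern because `.` does not cross the line break; the first record shows the over-match.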