I am parsing every line of an .m3u file that contains my IPTV playlist data. I want to isolate and print the part of each line that has this form:
tvg-logo="http//somelinkwithapicture.png"
...within a string that looks like this:
#EXTINF:-1 catchup="default" catchup-source="http://someprovider.tv/play/dvr/${start}/2480.m3u8?token=%^%=&duration=3600" catchup-days=5 tvg-name="Sky Sports Action HD" tvg-id="SkySportsAction.uk" tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png" group-title="Sports",Sky Sports Action HD
http://someprovider.tv/play/2480.m3u8?token=465454=
My class looks like this:
import re
class iptv_cleanup():
    filepath = 'C:\\Users\\cg371\\Downloads\\vget.m3u'
    with open(filepath, "r") as text_file:
        a = text_file.read()
        b = re.search(r'tvg-logo="(.*?)"', a)
        c = b.group()
        print c
        text_file.close
iptv_cleanup()
However, all it returns is a string like this:
tvg-logo=""
My regex is a bit rusty, but I can't see any obvious mistakes.
Can anyone help?
Thanks
Answer 0 (score: 0)
Try (?:tvg-logo=")[\w\W]*?(?<=\.png). The quantifier is lazy and the dot is escaped, so the match ends at the first ".png" after tvg-logo= instead of running on to the last one in the input:
import re
# Raw string so the backslashes reach the regex engine unchanged.
reg = r'(?:tvg-logo=")[\w\W]*?(?<=\.png)'
string = '#EXTINF:-1 catchup="default" catchup-source="http://someprovider.tv/play/dvr/${start}/2480.m3u8?token=%^%=&duration=3600" catchup-days=5 tvg-name="Sky Sports Action HD" tvg-id="SkySportsAction.uk" tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png" group-title="Sports",Sky Sports Action HD http://someprovider.tv/play/2480.m3u8?token=465454='
print(re.findall(reg, string)[0])
$python main.py
tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png
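Note that the output above still carries the tvg-logo=" prefix and has no closing quote. If only the URL is wanted, a capture group can pull it out directly. This is a minimal sketch rather than part of the original answer: extract_logo and LOGO_RE are hypothetical names, and the sample line is shortened from the one in the question.

```python
import re

# Hypothetical helper (not from the original answer): return only the URL
# inside tvg-logo="...", or None when the attribute is missing.
LOGO_RE = re.compile(r'tvg-logo="([^"]*)"')

def extract_logo(line):
    match = LOGO_RE.search(line)
    # group(1) is the text captured by the parentheses, without the
    # attribute name or the surrounding quotes.
    return match.group(1) if match else None

line = ('#EXTINF:-1 tvg-id="SkySportsAction.uk" '
        'tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png" '
        'group-title="Sports",Sky Sports Action HD')
print(extract_logo(line))
```

Using [^"]* instead of a lazy .*? makes the stop condition explicit: the capture ends at the next double quote and can never run past it.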
Answer 1 (score: 0)
This is what worked in the end:
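import re
class iptv_cleanup():
    filepath = 'C:\\Users\\cg371\\Downloads\\vget.m3u'
    # The with block closes the file on exit, so no explicit close() is needed.
    with open(filepath, "r") as text_file:
        a = text_file.read()
    # findall returns the captured URL of every tvg-logo attribute, not just the first.
    b = re.findall(r'tvg-logo="(.*?)"', a)
    for i in b:
        print(i)
iptv_cleanup()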
Thanks for your input...
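For anyone wondering why the first version printed tvg-logo="": re.search stops at the first match, and group() with no argument returns the whole matched text, attribute name included. If the first entry in the file has an empty logo, that is exactly what comes back. The real file isn't shown, so the sketch below assumes that situation with a made-up two-entry playlist; the names playlist, all_logos and non_empty are only for illustration.

```python
import re

# Hypothetical sample data standing in for the real vget.m3u file.
# Assumption: the first entry has an empty tvg-logo attribute.
playlist = (
    '#EXTINF:-1 tvg-name="Channel A" tvg-logo="" group-title="News",Channel A\n'
    'http://someprovider.tv/play/1.m3u8\n'
    '#EXTINF:-1 tvg-name="Sky Sports Action HD" '
    'tvg-logo="http://someprovider.tv/logos/sky%20sports%20action%20hd.png" '
    'group-title="Sports",Sky Sports Action HD\n'
    'http://someprovider.tv/play/2480.m3u8\n'
)

pattern = r'tvg-logo="(.*?)"'

first = re.search(pattern, playlist)   # only the first match in the text
whole_match = first.group()            # entire match, attribute name included
captured = first.group(1)              # only the text inside the parentheses

all_logos = re.findall(pattern, playlist)    # captured text of every match
non_empty = [url for url in all_logos if url]  # drop entries with no logo

print(whole_match)
print(non_empty)
```

With one capture group in the pattern, re.findall returns a list of the captured strings, which is why the loop in the working version prints bare URLs.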