import re,urllib
[('clown.gif', 'gif'), ('sleeper.jpg', 'jpg'), ('StarWarsReview.docx', 'docx'), ('wargames.jpg', 'jpg'), ('nothingtoseehere.docx', 'docx'), ('starwars.jpg', 'jpg'), ('logo.jpg', 'jpg'), ('certified.jpg', 'jpg'), ('clown.gif', 'gif'), ('essays.gif', 'gif'), ('big.jpg', 'jpg'), ('Doc100.docx', 'docx'), ('FavRomComs.docx', 'docx'), ('python.bmp', 'bmp'), ('dingbat.jpg', 'jpg')]
After running this code, something is wrong with my regular expression, so each result comes out like this:
('clown.gif', 'gif')
I don't want that. I want the result to look like this: ['clown.gif','sleeper.jpg']
All I want is a plain list of file names like that, and so on.
Is there any way to do that and get rid of the tuples?
Answer 0 (score: 1)
You just need to turn your inner group into a non-capturing group.
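def get_files(page):
    # Python 2 API; on Python 3 use urllib.request.urlopen(page) and decode the bytes.
    a = urllib.urlopen(page)
    b = a.read()
    # (?:...) makes the extension group non-capturing, so findall returns plain strings.
    c = re.findall(r"([a-zA-Z0-9]+\.{1}(?:jpg|bmp|docx|gif))", b)
    return c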
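To see the difference without fetching a page, here is a minimal sketch run against a made-up HTML snippet. The `sample` string and the variable names are assumptions for illustration, not the asker's actual page. When a pattern contains more than one capturing group, `re.findall` returns one tuple per match; with a single group it returns plain strings.

```python
import re

# Hypothetical stand-in for the downloaded page content.
sample = '<a href="clown.gif">clown</a> <img src="sleeper.jpg"> <a href="Doc100.docx">doc</a>'

# Two capturing groups: findall yields one tuple per match.
with_tuples = re.findall(r"([a-zA-Z0-9]+\.(jpg|bmp|docx|gif))", sample)

# Inner group made non-capturing with ?: so only the outer group is returned.
names_only = re.findall(r"([a-zA-Z0-9]+\.(?:jpg|bmp|docx|gif))", sample)

print(with_tuples)
print(names_only)
```

The second call is the one that gives the flat list asked for in the question.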
Answer 1 (score: 0)
You are capturing the extension twice. Try the regex below; ?: marks a non-capturing group.
re.findall(r"([a-zA-Z0-9]+\.{1}(?:jpg|bmp|docx|gif))", b)
I simplified your regex to the one below: the {1} seems redundant, and it uses \w and \d as the word and digit classes.
re.findall(r"([\w\d]+\.(?:jpg|bmp|docx|gif))", b)
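One caveat that may be worth checking before swapping patterns: `\w` already covers digits, and it also matches the underscore, so the simplified pattern is slightly broader than `[a-zA-Z0-9]`. A small sketch with made-up file names (the `sample` string is an assumption, not data from the question):

```python
import re

# Hypothetical file names; the first one contains an underscore.
sample = "my_file.jpg big.jpg python.bmp"

# Original character class: the underscore is not allowed, so the match starts after it.
strict = re.findall(r"([a-zA-Z0-9]+\.(?:jpg|bmp|docx|gif))", sample)

# Simplified class from the answer: \w also accepts the underscore.
simplified = re.findall(r"([\w\d]+\.(?:jpg|bmp|docx|gif))", sample)

# \d adds nothing on top of \w, so this behaves the same as the line above.
shorter = re.findall(r"(\w+\.(?:jpg|bmp|docx|gif))", sample)

print(strict)
print(simplified)
print(shorter)
```

If the file names on the page can contain underscores, the broader pattern is probably the one you want; if not, either version should behave the same.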