Question

wget skips everything after the ampersand (&). I tried escaping it as \& but that does not work properly.

Code:

import threading
import urllib.request
import os
import re
import time
import json
import sys

def take():
    a = ["https://itunes.apple.com/us/genre/ios-games-action/id7001?mt=8&letter=A","https://itunes.apple.com/us/genre/ios-games-action/id7001?mt=8&letter=B"]
    for url_file in a:
        url_file = re.sub(r'\&','\&',url_file)
        data = os.popen('wget -qO- %s'% url_file).read()
        if re.search(r'(?mis)paginate\-more\">next',data):
            print ("hi")


take()

This should print "hi".

But because wget skips everything after the &, the output is blank.

How can I make this work?

Answer 1

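The problem you are facing is that & has a special meaning in the shell (and you are invoking a shell through popen): it runs the job to the left of the ampersand in the background. The shell then treats whatever follows the & as a separate command, so wget only receives the URL up to ?mt=8.

To get around this, you have to escape the special character, or put quotes around the URL: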

 data = os.popen('wget -qO- "%s"' % url_file).read()
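If you would rather not hand-write the quoting, here is a minimal sketch of two alternatives, assuming Python 3.7+ and that wget is on your PATH. The function names (build_wget_command, fetch_with_wget) are made up for illustration. The first lets shlex.quote do the quoting for a shell command string; the second passes an argument list to subprocess.run so no shell is involved at all and the & never gets interpreted.

```python
import shlex
import subprocess


def build_wget_command(url):
    """Return a shell command string with the URL quoted for a POSIX shell.

    Hypothetical helper: shlex.quote wraps the URL in single quotes,
    so the shell passes & through to wget as a literal character.
    """
    return "wget -qO- %s" % shlex.quote(url)


def fetch_with_wget(url):
    """Run wget without a shell and return the page body as text.

    Hypothetical helper: with an argument list, the URL is handed to
    wget as one argument, so no escaping is needed.
    """
    result = subprocess.run(
        ["wget", "-qO-", url],
        capture_output=True,  # requires Python 3.7+
        text=True,
    )
    return result.stdout
```

With the first helper you can keep your existing call as os.popen(build_wget_command(url_file)).read(); with the second you replace the os.popen line with data = fetch_with_wget(url_file). Either way, drop the re.sub line, since the URL no longer needs manual escaping.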

Answer 2

Your code works for me. I am using Python 2.6.x on Linux. Could you mention which Python version you are using? Are you running Windows or Linux?

Output:

hi
hi

I see you have already escaped the '&' in your source.
