Question

I have a small question. There is a website, https://www.regelleistung.net/ip/action/ausschreibung/public, from which I want to parse a link. If you enter values in these three fields:

"Ausschreibung ab", "Produktart", "die nächsten"

you get a table, and I want to extract the download link from it (it is the blue image with an arrow).

What I tried:

import re
from requests import session

# Form fields submitted with the search
payload = {
    "suche.searchFrom": "03.12.2014",
    "suche.produktartSelection": "2",
    "suche.resultLimit": "10",
}

headers = {
    'Referer': 'https://www.regelleistung.net/ip/action/ausschreibung/public',
    'User-Agent': 'Mozilla/5.0 (Windows NT 5.1; rv:33.0) Gecko/20100101 Firefox/33.0',
}

with session() as c:
    # Submit the search form, then request the page again
    c.post('https://www.regelleistung.net/ip/action/ausschreibung/public', data=payload, verify=False, headers=headers)
    request = c.get('https://www.regelleistung.net/ip/action/ausschreibung/public')

    html = str(request.content)
    result = re.search('<a class="button" href="/ip/action/ausschreibung/public?(.*)"><img border="0" src="/ip/img/icon_download.png" ', html)
    result.group(1)

However, the link I am looking for does not seem to be in request.content. So I think my "c.post" is wrong, but I don't know why or how to check it.
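One way to check the POST is to look at the response that c.post itself returns instead of discarding it. The following is only a sketch: it assumes (not confirmed for this site) that the result table is rendered in the POST response, and that the follow-up GET simply reloads the empty search form. The helper name fetch_search_results is hypothetical; the session methods and response attributes are the standard ones from requests.

```python
def fetch_search_results(session, url, payload, headers=None):
    """POST the search form and return the HTML of the POST response itself."""
    response = session.post(url, data=payload, headers=headers)
    # Raise on 4xx/5xx instead of silently parsing an error page.
    response.raise_for_status()
    # Debugging aids: status code, final URL, and any redirects that happened.
    print("status:", response.status_code)
    print("final URL:", response.url)
    print("redirects:", [r.status_code for r in response.history])
    return response.text


def main():
    # Imported here so the helper above can be exercised without requests.
    import requests

    url = "https://www.regelleistung.net/ip/action/ausschreibung/public"
    payload = {
        "suche.searchFrom": "03.12.2014",
        "suche.produktartSelection": "2",
        "suche.resultLimit": "10",
    }
    with requests.Session() as s:
        html = fetch_search_results(s, url, payload)
    # If this prints False, the form probably expects other or additional
    # fields (an assumption; compare with the request the browser sends).
    print("icon_download.png" in html)
```

Calling main() runs the search once. If the icon is still missing, comparing the payload with the form request shown in the browser's network tab (field names, hidden inputs) is a reasonable next step.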

You might wonder why I am not using BeautifulSoup, but I have proxy problems and am trying to stick to the packages that ship with Anaconda.

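It would be cool if someone could give me a hint about what I did wrong in the post method. Second, if anybody knows a more sophisticated way of extracting the link from request.content, I would appreciate that too.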
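For the extraction part, one option that needs no extra packages is html.parser from the standard library. This is a minimal sketch, assuming the markup looks like the pattern in the regex above (an a tag with class "button" wrapping an img whose src ends in icon_download.png). The names DownloadLinkParser and extract_download_links are hypothetical.

```python
from html.parser import HTMLParser


class DownloadLinkParser(HTMLParser):
    """Collect hrefs of <a class="button"> tags that wrap the download icon."""

    def __init__(self):
        super().__init__()
        self.links = []
        # href of the <a class="button"> currently open, or None
        self._current_href = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "button" in (attrs.get("class") or "").split():
            self._current_href = attrs.get("href")
        elif tag == "img" and self._current_href:
            # Only keep the link if the image inside it is the download icon.
            if (attrs.get("src") or "").endswith("icon_download.png"):
                self.links.append(self._current_href)

    def handle_endtag(self, tag):
        if tag == "a":
            self._current_href = None


def extract_download_links(html):
    """Return all download hrefs found in the given HTML string."""
    parser = DownloadLinkParser()
    parser.feed(html)
    return parser.links
```

The function expects a str, so the decoded page text fits better than str(request.content), which yields the repr of a bytes object on Python 3. The returned hrefs are relative; urllib.parse.urljoin can turn them into absolute URLs against the page address.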

Thanks!

Python: get a link from a website after a POST request

0 Answers: