Question

I am trying to get all Instagram posts of a specific user with Python. Running the code below produces an error:

import requests
from bs4 import BeautifulSoup


def get_images(user):
    url = "https://www.instagram.com/" + str(user)
    source_code = requests.get(url)
    plain_text = source_code.text
    soup = BeautifulSoup(plain_text)
    for image in soup.findAll('img'):
        href = image.get('src')
        print(href)

get_images('instagramuser')

So my question is: what am I doing wrong?

Answer 1

You should pass a parser to BeautifulSoup. What you are seeing is not an error, just a warning.

soup = BeautifulSoup(plain_text, "html.parser")
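To illustrate the effect of naming the parser, here is a minimal offline sketch. The markup in `SAMPLE_HTML` and the helper name `image_sources` are made up for the example and are not part of the question's code; the point is only that passing "html.parser" explicitly lets BeautifulSoup parse without the missing-parser warning.

```python
from bs4 import BeautifulSoup

# Hypothetical static markup standing in for a downloaded page
SAMPLE_HTML = """
<html><body>
  <img src="https://example.com/a.jpg">
  <img src="https://example.com/b.jpg">
  <img alt="no source">
</body></html>
"""


def image_sources(html):
    # Naming the parser explicitly avoids the missing-parser warning
    soup = BeautifulSoup(html, "html.parser")
    # Keep only <img> tags that actually carry a src attribute
    return [img.get("src") for img in soup.find_all("img") if img.get("src")]


print(image_sources(SAMPLE_HTML))
```

The same function can be fed `source_code.text` from the question once the page has been fetched.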

Answer 2

soup = BeautifulSoup(plain_text, 'lxml')

I recommend using lxml instead of html.parser.
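Note that lxml is a separate package and may not be installed. A minimal sketch of one way to handle that, assuming a hypothetical helper named `make_soup`: try lxml first and fall back to the built-in html.parser when BeautifulSoup reports that the requested parser is unavailable.

```python
from bs4 import BeautifulSoup, FeatureNotFound


def make_soup(markup):
    # Prefer lxml; bs4 raises FeatureNotFound if it is not installed
    try:
        return BeautifulSoup(markup, "lxml")
    except FeatureNotFound:
        # Fall back to the parser that ships with Python
        return BeautifulSoup(markup, "html.parser")


soup = make_soup("<p>hello</p>")
print(soup.p.get_text())
```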

Also, use urlopen instead of requests.get.

Here is the full code, with the fetch and the parse on one line:

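from urllib import request
from bs4 import BeautifulSoup


def get_images(user):
    # Fetch the profile page and parse it in one expression (requires the lxml package)
    soup = BeautifulSoup(request.urlopen("https://www.instagram.com/" + str(user)), 'lxml')
    # Print the src attribute of every <img> tag found in the page
    for image in soup.findAll('img'):
        href = image.get('src')
        print(href)


get_images('user')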

Scraping an Instagram Feed with Python

2 answers: