Graph.create_png error: TypeError: sequence item 0: expected str instance, bytes found

Date: 2018-08-28 11:44:18

Tags: python python-3.x web-scraping graphviz pydot

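I am trying to build a graph from some links that I collected by scraping. If I look for only one tag, everything works, but if I look for several tags I get the following error: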

File "c:\Users\qnour\Desktop\Programming\Python\GettingStarted\Wiki_Scraping.py", line 89, in <module>
    main()
  File "c:\Users\qnour\Desktop\Programming\Python\GettingStarted\Wiki_Scraping.py", line 32, in main
    drawGraph(graph)
  File "c:\Users\qnour\Desktop\Programming\Python\GettingStarted\Wiki_Scraping.py", line 85, in drawGraph
    graph.write_png('wiki_graph.png', prog='dot')
  File "C:\Users\qnour\AppData\Local\Programs\Python\Python36\lib\site-packages\pydot\__init__.py", line 1807, in <lambda>
    lambda path, f=frmt, prog=self.prog : self.write(path, format=f, prog=prog))
  File "C:\Users\qnour\AppData\Local\Programs\Python\Python36\lib\site-packages\pydot\__init__.py", line 1909, in write
    dot_fd.write(self.create(prog, format))
  File "C:\Users\qnour\AppData\Local\Programs\Python\Python36\lib\site-packages\pydot\__init__.py", line 2013, in create
    stderr_output = ''.join(stderr_output)
TypeError: sequence item 0: expected str instance, bytes found

Here is the code:

import bs4 as bs
import urllib.request
import pydot
import graphviz
from IPython.display import Image, display
import os 



def viewPydot(pdot):
    plt = Image(pdot.create_png())
    display(plt)

global sauce
global soup

def main():
    global sauce
    global soup
    firstElement = input("Please select the first element : ")
    bareLink = "https://en.wikipedia.org/wiki/"
    sectionNumber = calculateSection(bareLink+firstElement)
    if (sectionNumber == -1):
        print("no see also section ! ")
        exit(0)
    url = "https://en.wikipedia.org/w/api.php?action=parse&prop=links&page={}&section={}".format(firstElement, sectionNumber)
    sauce = urllib.request.urlopen(url).read()
    soup = bs.BeautifulSoup(sauce, 'lxml')
    listUrl = gatherLinks()
    fullUrl = createNewLinks(listUrl)
    graph = createGraph(listUrl, firstElement)
    drawGraph(graph)
    #TODO
    #ADD THE NEW LINKS TO THE GRAPH

def createNewLinks(listUrl):
    bareLink = "https://en.wikipedia.org/wiki/"
    fullUrl = []
    for item in listUrl:
        fullUrl.append(bareLink + item)
    return fullUrl


def gatherLinks():
    header = soup.find_all("span", class_="s2")
    found_star = False
    listUrl = []

    for item in header:
        if (found_star):
            print(item.text)
            listUrl.append(item.text.split('"')[1])
            found_star = False
        else:
            if (item.text == '"*"'):
                found_star = True
    return listUrl

def createGraph(listUrl, firstElement):
    graph = pydot.Dot(graph_type='graph')

    for graphEdge in listUrl:
        edge = pydot.Edge(firstElement, graphEdge)
        graph.add_edge(edge)
    return graph


def calculateSection(url):
    source = urllib.request.urlopen(url).read()
    sectionSoup = bs.BeautifulSoup(source, 'lxml')

    sections = sectionSoup.findAll(["h2", "h3", "h4"])

    for number, item in enumerate(sections):
        print(item.text)
        if (item.text == "See also" or item.text == "See also[edit]"):
            print(number)
            return number
    return -1



def drawGraph(graph):
    graph.write_png('wiki_graph.png', prog='dot')
    Image('wiki_graph.png')

if __name__=="__main__":
    main()

What puzzles me is that if I change:

sections = sectionSoup.findAll(["h2", "h3", "h4"])

to:

sections = sectionSoup.findAll("h2")

everything works, but I need to check all three tags.
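
For illustration, the index returned by calculateSection depends on which tags are collected, so the two versions request different sections from the API. A minimal sketch, using a made-up heading list in place of the real page and a plain list in place of BeautifulSoup (the names `headings` and `see_also_index` are hypothetical):

```python
# Hypothetical (tag, text) pairs standing in for the headings of a page.
headings = [
    ("h2", "Contents"),
    ("h2", "History"),
    ("h3", "Early work"),
    ("h3", "Later work"),
    ("h2", "See also"),
    ("h2", "References"),
]


def see_also_index(headings, tags):
    """Mimic calculateSection: enumerate only the headings whose tag is in `tags`."""
    selected = [text for tag, text in headings if tag in tags]
    for number, text in enumerate(selected):
        if text in ("See also", "See also[edit]"):
            return number
    return -1


h2_only = see_also_index(headings, {"h2"})
all_levels = see_also_index(headings, {"h2", "h3", "h4"})
print(h2_only, all_levels)
```

With this made-up list the two calls give different numbers, so the `section` parameter in the API URL, and therefore the links that end up in the graph, differ between the two versions.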

1 answer:

Answer 0: (score: 0)

As far as I can tell, you need something like this:

$('.toggle-canvas-menu').click(function() {
  $('body').attr('id', 'msg-body');
  $('#msg-body').toggleClass('open');
  $('.toggle-canvas-menu').toggleClass('open');
});
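
Separately, the last frame of the traceback, `''.join(stderr_output)`, fails because on Python 3 a subprocess pipe yields bytes, which cannot be joined with a str separator. That line is most likely reached only when Graphviz wrote something to stderr, so the TypeError is probably hiding a message from dot. A minimal sketch of the failure and of decoding the chunks first; `stderr_chunks` and its contents are made up for illustration and are not actual dot output:

```python
# Made-up stand-in for what a subprocess stderr pipe returns on Python 3: bytes.
stderr_chunks = [b"Warning: example message\n", b"Error: another example\n"]

# Joining bytes with a str separator reproduces the TypeError from the traceback.
error_text = ""
try:
    "".join(stderr_chunks)
    join_failed = False
except TypeError as exc:
    join_failed = True
    error_text = str(exc)

# Decoding each chunk first gives readable text instead of an exception.
decoded = "".join(
    chunk.decode("utf-8", errors="replace") for chunk in stderr_chunks
)
print(error_text)
print(decoded)
```

If the hidden message turns out to be a complaint about a node name, wrapping the names in double quotes before passing them to pydot.Edge may help. A newer pydot release may also report the message properly instead of raising this TypeError.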