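Storing BeautifulSoup results from a dictionary in a database

Time: 2017-09-04 15:22:08

Tags: python-3.x dictionary beautifulsoup mysql-python

I am scraping a website and storing the results in a nested dictionary. The dictionary has the same structure as my database. My goal is to write a function that takes one parameter, the table name, and inserts the data from the dictionary into that table.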

The top-level keys of the "band" dictionary are named after the tables in my database, and the nested keys are the columns of each table. The function I wrote inserts the values for whichever table name it is given.

This is my code. Running it raises the error shown further down:

import requests
import pymysql
from bs4 import BeautifulSoup

url = requests.get("http://www.randomurl.com")
data = url.text
soup = BeautifulSoup(data, "html5lib")

cnx = pymysql.connect(host='localhost',
                  user='root',
                  password='',
                  database='mydb')

cursor = cnx.cursor()

band = {    
            "band_info":    {
                            "band_name" : soup.find('h1', {'class': 'band_name'}).get_text(),
                            "band_logo" : soup.find('a', {'id': 'logo'})['href'],
                            "band_img" : soup.find('a', {'id': 'photo'})['href'],
                            "band_comment" : soup2.find('body').get_text().replace('\r', '').replace('\n', '').replace('\t', '').strip()
                            },
            "countries":    {
                            "country" : "value",
                            },
            "locations":    {
                            "location" : "value",
                            },
            "status":       {
                            "status_name" : "value",
                            },
            "formedin":     {
                            "formed_year" : "value",
                            },
            "genres":       {
                            "genre_name" : ["value","value","value"]
                            },
            "lyricalthemes":{
                            "theme_name" : ["value","value","value"]
                            },
            "labels":       {
                            "label_name" : ["value","value","value"]
                            },
            "activeyears":  {
                            "active_year" : "value"
                            },
            "discography":  {
                            "album_name" : ["value","value","value"]
                            },
            "artists":      {
                            "artist_name" : ["value","value","value"]
                            }
        }

def insertData(table):
    placeholders = ', '.join(['%s'] * len(band[table]))
    columns = ', '.join(band[table].keys())
    values = band[table].values()
    sql = "INSERT INTO %s ( %s ) VALUES ( %s )" % (table, columns, placeholders)
    print(sql)
    cursor.execute(sql, values)


insertData("band_info")

cursor.close()
cnx.close()

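I am a bit lost here. I used this as a reference for my code.

My question is: do I need to apply some kind of text encoding to the BeautifulSoup results before they can be stored in the database? If not, how do I insert the data into my MySQL database correctly?

I also have a follow-up question on the same topic. My next step is to insert the relations into the other tables, using the loop shown after the traceback.

This is the traceback from the insertData("band_info") call: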

Traceback (most recent call last):
File "parser.py", line 144, in <module>
insertData("band_info")
File "parser.py", line 141, in insertData
cursor.execute(sql, values)
File "\Python\Python36-32\lib\site-packages\pymysql\cursors.py", line 164, in execute
query = self.mogrify(query, args)
File "\Python\Python36-32\lib\site-packages\pymysql\cursors.py", line 143, in mogrify
query = query % self._escape_args(args, conn)
File "\Python\Python36-32\lib\site-packages\pymysql\cursors.py", line 129, in _escape_args
return conn.escape(args)
File "\Python\Python36-32\lib\site-packages\pymysql\connections.py", line 814, in escape
return escape_item(obj, self.charset, mapping=mapping)
File "\Python\Python36-32\lib\site-packages\pymysql\converters.py", line 27, in escape_item
val = encoder(val, mapping)
File "\Python\Python36-32\lib\site-packages\pymysql\converters.py", line 110, in escape_unicode
return u"'%s'" % _escape_unicode(value)
File "\Python\Python36-32\lib\site-packages\pymysql\converters.py", line 73, in _escape_unicode
return value.translate(_escape_table)
AttributeError: 'dict_values' object has no attribute 'translate'

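For the relation tables I tried the loop below. It fails with a very similar error, and I cannot work out what is wrong with the data types: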

for i in band["artists"]["artist_name"]:
    cursor.execute("""INSERT INTO `band_artists` ( `id_aband` , `id_aartist` ) VALUES (
                (SELECT  `id_band`  from `band_info` WHERE `band_name` = ? AND WHERE band_logo = ? ),
                (SELECT  `id_art`  from `artists` WHERE `artist_name` = ? ) )""",(band["band_info"]["band_name"], band["band_info"]["band_logo"], i))
    cnx.commit()

I also tried wrapping the values in a list, as mentioned earlier, and I get the same error.

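1 Answer:

Answer 0 (score: 1)

The problem is that you are passing a dict_values object as the second argument to execute(), which only accepts a tuple, a list, or a dict. You can try this: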

def insertData(table):
    placeholders = ', '.join(['%s'] * len(band[table]))
    columns = ', '.join(band[table].keys())
    values = list(band[table].values())  # changed line: convert the dict_values view to a list
    sql = "INSERT INTO %s ( %s ) VALUES ( %s )" % (table, columns, placeholders)
    print(sql)
    cursor.execute(sql, values)
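
As a minimal sketch of the fix above, the helper below builds the INSERT statement and its parameter list without touching the database, so the values that would reach cursor.execute() can be inspected first. The names build_insert, RELATION_SQL and sample_row are hypothetical and not part of the original post. The identifier check is an added assumption: table and column names cannot be sent as %s parameters, so they are validated before being formatted into the SQL text. The second part rewrites the relation query from the question with %s placeholders, which is the style PyMySQL uses instead of ?, and with a single WHERE per subquery. Whether that resolves the asker's second error is not confirmed by the source.

```python
import re

# Only plain identifiers are accepted, because table and column names
# are formatted into the SQL text rather than passed as parameters.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_insert(table, row):
    """Return (sql, params) for one row given as a {column: value} dict."""
    for name in [table, *row.keys()]:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    columns = ", ".join(f"`{col}`" for col in row)
    placeholders = ", ".join(["%s"] * len(row))
    sql = f"INSERT INTO `{table}` ( {columns} ) VALUES ( {placeholders} )"
    # list(...) turns the dict_values view into a real sequence.
    return sql, list(row.values())


# Relation query from the question, rewritten with %s placeholders
# and one WHERE clause per subquery (conditions joined by AND).
RELATION_SQL = """
INSERT INTO `band_artists` ( `id_aband`, `id_aartist` ) VALUES (
    (SELECT `id_band` FROM `band_info` WHERE `band_name` = %s AND `band_logo` = %s),
    (SELECT `id_art` FROM `artists` WHERE `artist_name` = %s)
)
"""

# Hypothetical sample data standing in for the scraped values.
sample_row = {"band_name": "Example Band", "band_logo": "logo.png"}
sql, params = build_insert("band_info", sample_row)
print(sql)
print(params)
```

With a live connection this would be used as cursor.execute(sql, params) followed by cnx.commit(). PyMySQL does not autocommit by default, so without the commit the row is not persisted. Columns whose value is a Python list, such as genre_name, would likely need one row per element (for example via cursor.executemany) rather than a single execute call.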