Unable to insert data into an SQL table from a Scrapy project

Time: 2018-06-08 10:08:35

Tags: python postgresql scrapy

These are the errors:

[scrapy.core.scraper] ERROR: Error processing {'level': None,
 'school': 'Some school name',
 'place': None,
 'subject': None}
Traceback (most recent call last):
  File "/home/reducedgosling/.virtualenvs/data/lib/python3.6/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/home/reducedgosling/Programming/schools/pipelines.py", line 28, in process_item
    self.cur.execute(sql, data)
psycopg2.InternalError: current transaction is aborted, commands ignored until end of transaction block

items.py

import scrapy


class SchoolsItem(scrapy.Item):
    subject = scrapy.Field()
    level = scrapy.Field()
    place = scrapy.Field()
    school = scrapy.Field()

spider.py

def parse_school(self, response):
    school = response.css('h1 span.title::text').extract_first()
    table_rows = response.css('tr')
    for x in table_rows:
        # Create a fresh item per row so yielded items do not share state
        item = SchoolsItem()
        item['subject'] = x.css('td.views-field-title a::text').extract_first()
        item['level'] = x.css('td.views-field-field-level').xpath('normalize-space(./text())').extract_first()
        item['place'] = x.css('td.views-field-field-campus').xpath('normalize-space(./text())').extract_first()
        item['school'] = school
        yield item

pipelines.py

def process_item(self, item, spider):
    sql = "INSERT INTO udir_content (subject, level, school, place) VALUES (%s, %s, %s, %s);"
    data = (item['subject'], item['level'], item['school'], item['place'])
    self.cur.execute(sql, data)
    self.connection.commit()
    return item

What am I doing wrong? I suspect the null values, which Python (or psycopg?) converts to None. But PostgreSQL accepts NULL values unless I specify NOT NULL, right?

The first error shown in the PostgreSQL log file is:

ERROR:  relation "udir_content" does not exist at character 13
STATEMENT:  INSERT INTO udir_content (subject, level, school, place) VALUES (NULL, NULL, 'Some school name', NULL);

The subsequent errors just say "current transaction is aborted".

2 answers:

Answer 0 (score: 0)

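Your INSERT is failing. The error you see is reported because an earlier insert failed and you did not roll back the transaction before issuing another statement.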
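The rollback described above could be sketched roughly as follows. This is a minimal sketch, not the asker's actual pipeline: the class name `SchoolsPipeline` and the constructor argument are assumptions for illustration, and `connection` is assumed to be an already open psycopg2 connection.

```python
class SchoolsPipeline:
    """Hypothetical pipeline; `connection` is assumed to be an open psycopg2 connection."""

    def __init__(self, connection):
        self.connection = connection
        self.cur = connection.cursor()

    def process_item(self, item, spider):
        sql = ("INSERT INTO udir_content (subject, level, school, place) "
               "VALUES (%s, %s, %s, %s);")
        data = (item['subject'], item['level'], item['school'], item['place'])
        try:
            self.cur.execute(sql, data)
            self.connection.commit()
        except Exception:
            # Roll back so the failed statement does not leave the
            # transaction aborted for every following item.
            self.connection.rollback()
            raise
        return item
```

With the rollback in place, each failing item should report its own underlying error (here, the missing relation) instead of the generic "current transaction is aborted" message.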

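Note that psycopg2's execute uses %s as the placeholder, not a question mark (?), regardless of the Python type of the value.

So the placeholders in your statement are already correct:

sql = "INSERT INTO udir_content (subject, level, school, place) VALUES (%s, %s, %s, %s);"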

Answer 1 (score: 0)

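After spending two hours learning a bit about how PostgreSQL works, I figured out that I had created the table in the postgres database, not in the database I had created for this project. I then had to grant privileges to my second user, and now everything works.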
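A quick way to check for this kind of mix-up is to ask the server which database and user the connection is actually using, and whether the table is visible from there. The helper below is a sketch under assumptions: the function name `describe_connection` is hypothetical, and `cur` is assumed to be a psycopg2 cursor. It relies on the PostgreSQL functions `current_database()` and `to_regclass()` and on the `current_user` value.

```python
def describe_connection(cur, table="udir_content"):
    """Report the connected database, the user, and whether `table` is visible."""
    cur.execute("SELECT current_database(), current_user;")
    database, user = cur.fetchone()
    # to_regclass returns NULL when the relation cannot be found
    cur.execute("SELECT to_regclass(%s) IS NOT NULL;", (table,))
    (table_exists,) = cur.fetchone()
    return {"database": database, "user": user, "table_exists": table_exists}
```

If `table_exists` comes back false, the table probably lives in a different database or schema. If it exists but inserts are refused, a statement along the lines of `GRANT SELECT, INSERT ON udir_content TO some_user;` (with `some_user` as a placeholder role name) is the usual remedy.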