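I am using Scrapy and Python to scrape data from a site and store it in a CSV file. I then try to read the values from the CSV file and store them in a MySQL database table. The INSERT statement neither raises an error nor inserts any data into the database. I checked the data types of the fields whose values come from the CSV: they are all strings. Every value stored in the CSV is in string format, which is why storing them in the database causes problems for every data type other than string/varchar. What should I do now? Besides varchar, my table also has columns of type int(6) and timestamp.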
import csv
import re
import pymysql
import sys
connection = pymysql.connect (host = "localhost", user = "root", passwd = ".....", db = "city_details")
cursor = connection.cursor ()
def insert_articles2(rows):
    rowcount = 0
    for row in rows:
        if rowcount!= 0:
            sql = "INSERT IGNORE INTO articles2 (country, event_name, md5, date_added, profile_image, banner, sDate, eDate, address_line1, address_line2, pincode, state, city, locality, full_address, latitude, longitude, start_time, end_time, description, website, fb_page, fb_event_page, event_hashtag, source_name, source_url, email_id_organizer, ticket_url) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %d, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
            cursor.execute = (sql, (row[0], row[1], row[2], row[3], row[4], row[5], row[6], row[7], row[8], row[9], row[10], row[11], row[12], row[13], row[14], row[15], row[16], row[17], row[18], row[19], row[20], row[21], row[22], row[23], row[24], row[25], row[26], row[27]))
        rowcount+=1
rows = csv.reader(open("items.csv", "r"))
insert_articles2(rows)
connection.commit()
articles2
CREATE TABLE IF NOT EXISTS `articles2` (
`id` int(6) NOT NULL AUTO_INCREMENT,
`country` varchar(45) NOT NULL,
`event_name` varchar(200) NOT NULL,
`md5` varchar(35) NOT NULL,
`date_added` timestamp NULL DEFAULT NULL,
`profile_image` varchar(350) NOT NULL,
`banner` varchar(350) NOT NULL,
`sDate` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`eDate` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`address_line1` mediumtext,
`address_line2` mediumtext,
`pincode` int(7) NOT NULL,
`state` varchar(30) NOT NULL,
`city` text NOT NULL,
`locality` varchar(50) NOT NULL,
`full_address` varchar(350) NOT NULL,
`latitude` varchar(15) NOT NULL,
`longitude` varchar(15) NOT NULL,
`start_time` time NOT NULL,
`end_time` time NOT NULL,
`description` longtext CHARACTER SET utf16 NOT NULL,
`website` varchar(50) DEFAULT NULL,
`fb_page` varchar(200) DEFAULT NULL,
`fb_event_page` varchar(200) DEFAULT NULL,
`event_hashtag` varchar(30) DEFAULT NULL,
`source_name` varchar(30) NOT NULL,
`source_url` varchar(350) NOT NULL,
`email_id_organizer` varchar(100) NOT NULL,
`ticket_url` mediumtext NOT NULL,
PRIMARY KEY (`id`),
KEY `full_address` (`full_address`),
KEY `full_address_2` (`full_address`),
KEY `id` (`id`),
KEY `event_name` (`event_name`),
KEY `sDate` (`sDate`),
KEY `eDate` (`eDate`),
KEY `id_2` (`id`),
KEY `country` (`country`),
KEY `event_name_2` (`event_name`),
KEY `sDate_2` (`sDate`),
KEY `eDate_2` (`eDate`),
KEY `state` (`state`),
KEY `locality` (`locality`),
KEY `start_time` (`start_time`),
KEY `start_time_2` (`start_time`),
KEY `end_time` (`end_time`),
KEY `id_3` (`id`),
KEY `id_4` (`id`),
KEY `event_name_3` (`event_name`),
KEY `md5` (`md5`),
KEY `sDate_3` (`sDate`),
KEY `eDate_3` (`eDate`),
KEY `latitude` (`latitude`),
KEY `longitude` (`longitude`),
KEY `start_time_3` (`start_time`),
KEY `end_time_2` (`end_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=4182 ;
Answer 0 (score: 1)
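Regardless of this specific SQL-related error, which most likely comes down to some data mismatch, I strongly suggest avoiding the step of exporting to CSV and instead adding a MySQL item pipeline, which will export your scraped items directly into a MySQL table. From there it is easy to move the data to other tables or process it.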
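On the specific symptom in the question, two things in the posted code stand out. `cursor.execute = (...)` assigns a tuple to the attribute instead of calling the method, so no statement is ever sent, which would explain why there is neither an error nor any inserted row. PyMySQL also expects `%s` for every parameter regardless of column type, so the `%d` placeholder would raise once `execute` is actually called. Below is a minimal sketch of a corrected loader, assuming the CSV columns appear in the same order as in the INSERT statement. The `convert_row` and `load_csv` helpers, and the fallback of 0 for a blank pincode, are illustrative choices and not part of the original code.

```python
import csv

# Column order is taken from the INSERT statement in the question.
COLUMNS = [
    "country", "event_name", "md5", "date_added", "profile_image", "banner",
    "sDate", "eDate", "address_line1", "address_line2", "pincode", "state",
    "city", "locality", "full_address", "latitude", "longitude", "start_time",
    "end_time", "description", "website", "fb_page", "fb_event_page",
    "event_hashtag", "source_name", "source_url", "email_id_organizer",
    "ticket_url",
]
PINCODE = COLUMNS.index("pincode")

# PyMySQL uses %s for every parameter, whatever the column type.
SQL = "INSERT IGNORE INTO articles2 ({}) VALUES ({})".format(
    ", ".join(COLUMNS), ", ".join(["%s"] * len(COLUMNS))
)


def convert_row(row):
    """Return one CSV row as a parameter tuple, with pincode as an int."""
    values = list(row[:len(COLUMNS)])
    try:
        values[PINCODE] = int(values[PINCODE])
    except ValueError:
        # Assumed policy: blank or non-numeric pincodes become 0,
        # because the column is declared NOT NULL.
        values[PINCODE] = 0
    return tuple(values)


def insert_articles2(cursor, rows):
    """Send one INSERT per data row; return how many rows were sent."""
    next(rows, None)  # skip the CSV header line
    count = 0
    for row in rows:
        if len(row) < len(COLUMNS):
            continue  # skip blank or truncated lines
        cursor.execute(SQL, convert_row(row))  # a call, not an assignment
        count += 1
    return count


def load_csv(path, cursor):
    """Read the CSV at `path` and insert its rows through `cursor`."""
    with open(path, newline="") as handle:
        return insert_articles2(cursor, csv.reader(handle))
```

With the connection from the question, this would be called as `load_csv("items.csv", cursor)` followed by `connection.commit()`. The timestamp columns can stay as strings if the CSV already stores them in a layout MySQL parses, such as `2016-05-01 10:00:00`. Because `INSERT IGNORE` downgrades many errors to warnings, it may be worth using a plain `INSERT` while debugging.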
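If you cannot use that pipeline, or you want something more customizable, you can look at scrapy-mysql-pipeline, where you will find useful information on how to write a custom MySQL pipeline.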
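A custom pipeline along those lines could look roughly like the sketch below. It is not the code of scrapy-mysql-pipeline. The class name `MySQLArticlesPipeline`, the `MYSQL_*` setting names, and the assumption that item fields are named exactly like the table columns are all hypothetical. The method names (`from_crawler`, `open_spider`, `close_spider`, `process_item`) follow Scrapy's item pipeline interface.

```python
# Column names are taken from the articles2 table in the question; the
# sketch assumes each scraped item uses the same names for its fields.
COLUMNS = [
    "country", "event_name", "md5", "date_added", "profile_image", "banner",
    "sDate", "eDate", "address_line1", "address_line2", "pincode", "state",
    "city", "locality", "full_address", "latitude", "longitude", "start_time",
    "end_time", "description", "website", "fb_page", "fb_event_page",
    "event_hashtag", "source_name", "source_url", "email_id_organizer",
    "ticket_url",
]

SQL = "INSERT IGNORE INTO articles2 ({}) VALUES ({})".format(
    ", ".join(COLUMNS), ", ".join(["%s"] * len(COLUMNS))
)


class MySQLArticlesPipeline:
    """Hypothetical pipeline that writes each scraped item to articles2."""

    def __init__(self, connect_kwargs, connect=None):
        self.connect_kwargs = connect_kwargs
        self._connect = connect  # injectable, so the class can be tested
        self.connection = None
        self.cursor = None

    @classmethod
    def from_crawler(cls, crawler):
        # The MYSQL_* setting names are assumptions made for this sketch.
        settings = crawler.settings
        return cls({
            "host": settings.get("MYSQL_HOST", "localhost"),
            "user": settings.get("MYSQL_USER", "root"),
            "password": settings.get("MYSQL_PASSWORD", ""),
            "database": settings.get("MYSQL_DATABASE", "city_details"),
        })

    def open_spider(self, spider):
        if self._connect is None:
            import pymysql  # imported lazily; only needed for a real run
            self._connect = pymysql.connect
        self.connection = self._connect(**self.connect_kwargs)
        self.cursor = self.connection.cursor()

    def close_spider(self, spider):
        self.connection.close()

    def process_item(self, item, spider):
        # Missing fields become None, which PyMySQL sends as NULL.
        params = tuple(item.get(name) for name in COLUMNS)
        self.cursor.execute(SQL, params)
        self.connection.commit()
        return item
```

To enable such a pipeline, it would be registered under `ITEM_PIPELINES` in the project's settings.py, using the module path where the class lives. Because items keep their Python types until they reach the pipeline, a field such as `pincode` can be stored as an `int` in the spider, which sidesteps the string-only limitation of the CSV round trip.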