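I am processing images and converting each one into about 400 data values. I want to store each of these values in its own column. My MySQL table has columns like this:
MYID, WIDTH, HEIGHT, P1, P2, P3 ... P400
I could easily save them to a CSV file, but since the processing runs over about 3 million files, I thought I would write the output straight into the MySQL table instead of creating many CSV files.
This is what I have written so far: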
    for (i, imagePath) in enumerate(imagePaths):
        filename = imagePath[imagePath.rfind("/") + 1:]
        image = cv2.imread(imagePath)
        rows, cols, channels = image.shape
        if not image is None:
            features = detail.describe(image)
            features = [str(x) for x in features]
            fileparam = [filename,cols,rows]
            sqldata = fileparam+features
            var_string = ', '.join('?' * len(sqldata))
            query_string = 'INSERT INTO lastoneweeknew VALUES (%s)' % var_string
            y.execute(query_string, sqldata)
If I print sqldata, it looks like this:
    ['120546506.jpg',650, 420, '0.0', '0.010269055',........., '0.8539078']
The MySQL table has the following data types:
+----------+----------------+------+-----+---------+----------------+
| Field    | Type           | Null | Key | Default | Extra          |
+----------+----------------+------+-----+---------+----------------+
| image_id | int(11)        | NO   | PRI | NULL    | auto_increment |
| MYID     | int(10)        | YES  |     | NULL    |                |
| WIDTH    | decimal(6,2)   | YES  | MUL | NULL    |                |
| HEIGHT   | decimal(6,2)   | YES  | MUL | NULL    |                |
| P1       | decimal(22,20) | YES  |     | NULL    |                |
| P2       | decimal(22,20) | YES  |     | NULL    |                |
When I insert the data into the MySQL table, I get the following error:
    TypeError: not all arguments converted during string formatting
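However, when I write the output to a CSV file and load the CSV data into MySQL with R, the insert works without problems.
I thought the row and column values were integers while the rest looked like text in the output, so I converted them to text:
    row = str(rows)
    col = str(cols)
But I still get the same error.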
Answer 0 (score: 0)
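About your error: the TypeError most likely comes from the placeholders rather than from the int values. Assuming y is a MySQLdb cursor, the parameter marker is %s, not ?, and the driver substitutes the parameters with Python's % operator. A query that contains only ? marks has no %s slots to consume the arguments, hence "not all arguments converted during string formatting". The driver converts ints and strings on its own, which is why casting rows and cols to str did not change anything.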
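To make the placeholder point concrete, here is a minimal sketch that has not been run against your database. It assumes `y` is a MySQLdb (or PyMySQL) cursor; `build_insert` and `reproduce_error` are hypothetical helper names, and the column names are taken from the table in the question. Naming the columns explicitly also leaves the auto-increment `image_id` for MySQL to fill in.

```python
def build_insert(table, n_features):
    """Build an INSERT with named columns and one %s marker per value."""
    columns = ["MYID", "WIDTH", "HEIGHT"]
    columns += ["P%d" % n for n in range(1, n_features + 1)]
    markers = ", ".join(["%s"] * len(columns))
    return "INSERT INTO %s (%s) VALUES (%s)" % (table, ", ".join(columns), markers)


def reproduce_error():
    """Mimic what the driver effectively does with '?' markers."""
    try:
        "INSERT INTO t VALUES (?, ?)" % ("a", 1)
    except TypeError as exc:
        return str(exc)
    return None


query = build_insert("lastoneweeknew", 2)
# With a real cursor (assumed to be the `y` from the question):
#     y.execute(query, sqldata)
```

The values themselves still travel separately as the second argument to `execute`, so the driver takes care of quoting and escaping them.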
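It looks like you are trying to build a data frame and upload it to a MySQL database. Fortunately that is a common task, and the pandas library can do all of it for you. Create a list of dictionaries in which each dictionary holds the ColumnName: Value pairs for one image, then turn the list into a DataFrame and write it to the table: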
    import os

    import cv2
    import pandas as pd
    from sqlalchemy import create_engine  # current pandas needs an SQLAlchemy engine for MySQL

    def handlePaths(imagePaths):
        imageDataList = []
        for imagePath in imagePaths:
            filename = os.path.basename(imagePath)
            image = cv2.imread(imagePath)
            if image is None:  # cv2.imread returns None for unreadable files
                continue
            rows, cols, channels = image.shape
            features = detail.describe(image)  # `detail` is the descriptor object from the question
            # Assumed mapping, following the order of sqldata in the question
            imageData = {"MYID": filename, "WIDTH": cols, "HEIGHT": rows}
            # Build the P1, P2, ... columns iteratively instead of typing them out
            for n, x in enumerate(features, start=1):
                imageData["P%d" % n] = float(x)
            imageDataList.append(imageData)
        imageDataFrame = pd.DataFrame(imageDataList)
        # Placeholder URL: fill in your own user, password, host and database
        engine = create_engine("mysql+mysqldb://user:password@localhost/dbname")
        # 'append' keeps the existing table and its column types; 'replace' would drop it
        imageDataFrame.to_sql(name='lastoneweeknew', con=engine, if_exists='append', index=False)
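The dictionary-per-row idea can be checked without a database or real images. The following is a small sketch with made-up values; `make_row` is a hypothetical helper, and only three feature values are used instead of 400.

```python
import pandas as pd


def make_row(filename, width, height, features):
    """Return one {column name: value} dict for a single image."""
    row = {"MYID": filename, "WIDTH": width, "HEIGHT": height}
    for n, value in enumerate(features, start=1):
        row["P%d" % n] = float(value)
    return row


# Made-up values, only to show the shape of the resulting frame
rows = [
    make_row("a.jpg", 650, 420, [0.0, 0.25, 0.5]),
    make_row("b.jpg", 640, 480, [0.1, 0.2, 0.3]),
]
frame = pd.DataFrame(rows)
print(frame)
```

Each dictionary becomes one row, and the dictionary keys become the column names that `to_sql` matches against the table.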
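I think this is a very CPU-intensive process, so you could hand one image to each CPU to make it run faster. If each worker uploads its own entry individually, the database handles the race conditions for you.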
    import multiprocessing
    import os

    import cv2
    import pandas as pd
    from sqlalchemy import create_engine

    def analyzeImages(imagePaths):  # imagePaths is a list of image paths
        pool = multiprocessing.Pool(multiprocessing.cpu_count())
        pool.map(handleSinglePath, imagePaths)
        pool.close()  # close() must come before join()
        pool.join()

    def handleSinglePath(imagePath):
        filename = os.path.basename(imagePath)
        image = cv2.imread(imagePath)
        if image is None:  # skip unreadable files before touching image.shape
            return
        rows, cols, channels = image.shape
        features = detail.describe(image)  # `detail` is the descriptor object from the question
        # Assumed mapping, following the order of sqldata in the question
        imageData = {"MYID": filename, "WIDTH": cols, "HEIGHT": rows}
        for n, x in enumerate(features, start=1):  # P1, P2, ... built iteratively
            imageData["P%d" % n] = float(x)
        imageDataFrame = pd.DataFrame([imageData])  # wrap in a list to get a one-row frame
        # Placeholder URL; each worker process opens its own connection
        engine = create_engine("mysql+mysqldb://user:password@localhost/dbname")
        # 'append' adds the row; 'replace' would drop the table on every call
        imageDataFrame.to_sql(name='lastoneweeknew', con=engine, if_exists='append', index=False)
        engine.dispose()
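With roughly 3 million images, one upload per image means roughly 3 million round trips to the database. One possible variation, offered here as an assumption rather than something measured, is to give each worker a batch of paths so it can build and upload one DataFrame per batch. `chunked` is a hypothetical helper and the batch size is arbitrary.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Made-up paths for illustration
paths = ["img%d.jpg" % n for n in range(7)]
batches = list(chunked(paths, 3))
print(batches)
```

Each batch is itself a list of paths, so it could be passed to a function like `handlePaths` through `pool.map(handlePaths, batches)`.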