如何通过mysqldb将pandas数据帧插入数据库?

时间:2013-05-10 06:29:11

标签: python mysql pandas mysql-python

我可以从python连接到我的本地mysql数据库,我可以创建,选择和插入单独的行。

我的问题是:我可以直接指示mysqldb获取整个数据帧并将其插入现有表中,还是需要迭代行?

在任何一种情况下,对于一个包含ID和两个数据列以及匹配数据帧的非常简单的表,python脚本会是什么样子?

9 个答案:

答案 0 :(得分:73)

更新

现在有to_sql方法,这是执行此操作的首选方法,而不是write_frame

df.to_sql(con=con, name='table_name_for_df', if_exists='replace', flavor='mysql')

另请注意:pandas 0.14 ...

中的语法可能会发生变化

您可以使用MySQLdb设置连接:

from pandas.io import sql
import MySQLdb

con = MySQLdb.connect()  # may need to add some other options to connect

flavor的{​​{1}}设置为write_frame意味着您可以写入mysql:

'mysql'

参数sql.write_frame(df, con=con, name='table_name_for_df', if_exists='replace', flavor='mysql') 告诉pandas如果表已存在则如何处理:

  

if_exists,默认if_exists: {'fail', 'replace', 'append'}
      'fail':如果表存在,则不执行任何操作       fail:如果表存在,请删除它,重新创建它,然后插入数据       replace:如果表存在,则插入数据。如果不存在则创建。

虽然write_frame docs目前建议它只适用于sqlite,但似乎支持mysql,实际上有很多mysql testing in the codebase

答案 1 :(得分:9)

Andy Hayden提到了正确的功能(to_sql)。在这个答案中,我将给出一个完整的示例,我使用Python 3.5进行了测试,但也适用于Python 2.7(和Python 3.x):

首先,让我们创建数据框:

# Create dataframe
import pandas as pd
import numpy as np

np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
    'feature1': np.random.random(number_of_samples),
    'feature2': np.random.random(number_of_samples),
    'class':    np.random.binomial(2, 0.1, size=number_of_samples),
    },columns=['feature1','feature2','class'])

print(frame)

给出了:

   feature1  feature2  class
0  0.548814  0.791725      1
1  0.715189  0.528895      0
2  0.602763  0.568045      0
3  0.544883  0.925597      0
4  0.423655  0.071036      0
5  0.645894  0.087129      0
6  0.437587  0.020218      0
7  0.891773  0.832620      1
8  0.963663  0.778157      0
9  0.383442  0.870012      0

要将此数据框导入MySQL表:

# Import dataframe into MySQL
import sqlalchemy
database_username = 'ENTER USERNAME'
database_password = 'ENTER USERNAME PASSWORD'
database_ip       = 'ENTER DATABASE IP'
database_name     = 'ENTER DATABASE NAME'
database_connection = sqlalchemy.create_engine('mysql+mysqlconnector://{0}:{1}@{2}/{3}'.
                                               format(database_username, database_password, 
                                                      database_ip, database_name))
frame.to_sql(con=database_connection, name='table_name_for_df', if_exists='replace')

一个诀窍是MySQLdb不能与Python 3.x一起使用。因此,我们使用mysqlconnector,可能是installed,如下所示:

pip install mysql-connector==2.1.4  # version avoids Protobuf error

输出:

enter image description here

请注意,to_sql会创建表以及列,如果数据库中尚不存在这些列。

答案 2 :(得分:2)

您可以使用pymysql:

来完成

例如,假设您有一个包含下一个用户,密码,主机和端口的MySQL数据库,并且您希望在数据库'data_2'中写入,如果它已经存在则

import pymysql
user = 'root'
passw = 'my-secret-pw-for-mysql-12ud'
host =  '172.17.0.2'
port = 3306
database = 'data_2'

如果您已创建数据库

conn = pymysql.connect(host=host,
                       port=port,
                       user=user, 
                       passwd=passw,  
                       db=database,
                       charset='utf8')

data.to_sql(name=database, con=conn, if_exists = 'replace', index=False, flavor = 'mysql')

如果您没有创建数据库,当数据库已经存在时也有效:

conn = pymysql.connect(host=host, port=port, user=user, passwd=passw)

conn.cursor().execute("CREATE DATABASE IF NOT EXISTS {0} ".format(database))
conn = pymysql.connect(host=host,
                       port=port,
                       user=user, 
                       passwd=passw,  
                       db=database,
                       charset='utf8')

data.to_sql(name=database, con=conn, if_exists = 'replace', index=False, flavor = 'mysql')

类似的主题:

  1. Writing to MySQL database with pandas using SQLAlchemy, to_sql
  2. Writing a Pandas Dataframe to MySQL

答案 3 :(得分:1)

您可以将DataFrame输出为csv文件,然后使用mysqlimport将csv导入mysql

修改

似乎pandas's build-in sql util提供了write_frame函数,但仅适用于sqlite。

我发现了一些有用的内容,您可以尝试this

答案 4 :(得分:1)

The to_sql method works for me.

However, keep in mind that the it looks like it's going to be deprecated in favor of SQLAlchemy:

FutureWarning: The 'mysql' flavor with DBAPI connection is deprecated and will be removed in future versions. MySQL will be further supported with SQLAlchemy connectables. chunksize=chunksize, dtype=dtype)

答案 5 :(得分:1)

Python 2 + 3

Prerequesites

  • 熊猫
  • MySQL服务器
  • SQLAlchemy的
  • pymysql:纯python mysql客户端

代码

"sport_events": [
    {
        "id": "sr:match:12606716",
        "scheduled": "2017-09-27T10:00:00+00:00",
        "start_time_tbd": false,
        "status": "closed",
        "tournament_round": {
            "type": "group",
            "number": 1,
            "group": "Gr. 4"
        },
        "season": {
            "id": "sr:season:45960",
            "name": "U17 European Ch.ship QF 17/18",
            "start_date": "2017-09-27",
            "end_date": "2018-04-30",
            "year": "17/18",
            "tournament_id": "sr:tournament:755"
        },
        "tournament": {
            "id": "sr:tournament:755",
            "name": "U17 European Ch.ship QF",
            "sport": {
                "id": "sr:sport:1",
                "name": "Soccer"
            },
            "category": {
                "id": "sr:category:392",
                "name": "International Youth"
            }
        },
        "competitors": [
            {
                "id": "sr:competitor:22646",
                "name": "Russia",
                "country": "Russia",
                "country_code": "RUS",
                "abbreviation": "RUS",
                "qualifier": "home"
            },
            {
                "id": "sr:competitor:22601",
                "name": "Faroe Islands",
                "country": "Faroe Islands",
                "country_code": "FRO",
                "abbreviation": "FRO",
                "qualifier": "away"
            }
        ]
    },

答案 6 :(得分:0)

这对我有用。最初,我只创建了数据库,没有创建任何预定义表。

from platform import python_version
print(python_version())
3.7.3

path='glass.data'
df=pd.read_csv(path)
df.head()


!conda install sqlalchemy
!conda install pymysql

pd.__version__
    '0.24.2'

sqlalchemy.__version__
'1.3.20'

安装后重新启动内核。

from sqlalchemy import create_engine
engine = create_engine('mysql+pymysql://USER:PASSWORD@HOST:PORT/DATABASE_NAME', echo=False)

try:
df.to_sql(name='glasstable',con=engine,index=False, if_exists='replace')
print('Sucessfully written to Database!!!')

except Exception as e:
    print(e)

答案 7 :(得分:0)

这应该可以解决问题:

import sys
import csv

# Create a dict (where the values are a list) to store the data in memory
database = {}

# Open the csv file and read the contents into memory
filename = sys.argv[1]
with open(filename, "r") as file:
    reader = csv.reader(file)
    # Ignore the header
    next(reader)
    for row in reader:
        # Read the first column of the csv (name) as the key, then read the remaining columns as a list for the values
        database[row[0]] = [int(row[x]) for x in range (1, len (row))]
    print(database)

答案 8 :(得分:-1)

  

df.to_sql(name =“ owner”,con = db_connection,schema ='aws',if_exists ='replace',index => True,index_label ='id')