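I have a table named 'CompanyData' that contains data for various companies. I need to store the data in files, one per company, each named after the company.
The columns are:
c_emp_id, name, ph, email, company_name, country
How can I solve this using Python?
My attempt: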
import os
import pymysql
user = '***'
password = '***'
host = '***'
db= '***'
connection = pymysql.connect(host=host, user=user, password=password, database=db)
cursor = connection.cursor()
query = "select * from CompanyData"
cursor.execute(query)
results = cursor.fetchall()
for value in results:
    filename = "{}.txt".format(value[4])
    if os.path.isfile(filename):
        fh = open(filename, 'w')
    string1 = "{}-{}-{}\n".format(value[1], value[2], value[3])
    if 'fh' in locals():
        fh.write(string1)
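I have simplified my problem here so that people can understand it.
Answer 0: (score: 3)
Here is a solution using pandas. The key is to group the data by company name and then save each group to a different file.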
import pandas as pd
# Example data; columns= pins the column order shown below
df = pd.DataFrame({'name': ['A', 'B', 'C'], 'company': ['AAA', 'BBB', 'AAA']},
                  columns=['company', 'name'])
#   company name
# 0     AAA    A
# 1     BBB    B
# 2     AAA    C
groups = df.groupby('company')
# One file per company, named after the group key
for company, group in groups:
    group.to_csv('{0}.txt'.format(company), sep='-')
In this example, two files will be created: AAA.txt and BBB.txt. The contents of these files will be:
-company-name
0-AAA-A
2-AAA-C
and
-company-name
1-BBB-B
To load your MySQL table into a pandas DataFrame, you can do the following:
import mysql.connector as sql
import pandas as pd
db_connection = sql.connect(host='hostname', database='db_name', user='username', password='password')
df = pd.read_sql('SELECT * FROM table_name', con=db_connection)
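As a rough sketch of how the two pieces above might be combined for the columns in the question, the following writes one file per company containing only name, ph and email, without the index column or header line. The helper name write_company_files, the sample rows and the temporary output directory are assumptions made for illustration; in practice df would come from pd.read_sql as shown above.

```python
import os
import tempfile

import pandas as pd


def write_company_files(df, out_dir):
    """Write one '<company_name>.txt' per company with name-ph-email lines."""
    paths = []
    # groupby sorts the group keys by default, so files are written in name order
    for company, group in df.groupby("company_name"):
        path = os.path.join(out_dir, "{}.txt".format(company))
        # index=False and header=False drop the row labels and the header line
        group[["name", "ph", "email"]].to_csv(
            path, sep="-", index=False, header=False
        )
        paths.append(path)
    return paths


# Made-up rows standing in for the result of pd.read_sql(...)
df = pd.DataFrame(
    {
        "c_emp_id": [1, 2, 3],
        "name": ["Ann", "Bob", "Cid"],
        "ph": ["111", "222", "333"],
        "email": ["ann@example.com", "bob@example.com", "cid@example.com"],
        "company_name": ["AAA", "BBB", "AAA"],
        "country": ["US", "UK", "US"],
    }
)

out_dir = tempfile.mkdtemp()  # hypothetical output location
written = write_company_files(df, out_dir)
```

Note that to_csv quotes any value that itself contains the separator, so a phone number such as 555-0100 would be written inside double quotes when sep='-' is used.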
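Answer 1: (score: 1)
I'm not sure whether this will help, but I can try to help fix the code.
First collect all the data for each company in a dict, then do the writing. Also try using the "with open" statement, which takes care of closing the file.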
import os
import pymysql
user = '***'
password = '***'
host = '***'
db= '***'
connection = pymysql.connect(host=host, user=user, password=password, database=db)
cursor = connection.cursor()
query = "select * from CompanyData"
cursor.execute(query)
results = cursor.fetchall()
company_data = {}
# collect data into a dict mapping company name -> text for its file
for value in results:
    company = value[4]
    try:
        current_data = company_data[company]
        current_data += "\n" + "-".join([value[1], value[2], value[3]])
        company_data[company] = current_data
    except KeyError:
        # first row for this company: store it, otherwise the dict stays empty
        company_data[company] = "-".join([value[1], value[2], value[3]])
# write the data into the file
for company, data in company_data.items():  # iteritems() exists only on Python 2
    filename = "%s.txt" % company
    with open(filename, 'w') as fh:
        fh.write(data)
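Saving strings in the dict may create a lot of intermediate strings (current_data += "\n" + "-".join([value[1], value[2], value[3]])); I'm not sure whether the following version, which uses lists, is a better implementation.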
company_data = {}
# collect data into a dict mapping company name -> list of lines
for value in results:
    company = value[4]
    try:
        current_data = company_data[company]
        # since lists are mutable we do not need to re-assign this back to dict
        current_data.append("-".join([value[1], value[2], value[3]]))
    except KeyError:
        # first row for this company: start a new list and store it
        company_data[company] = ["-".join([value[1], value[2], value[3]])]
# write the data into the file
for company, data in company_data.items():
    filename = "%s.txt" % company
    with open(filename, 'w') as fh:
        for line in data:
            fh.write(line + "\n")
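The try/except bookkeeping in the list version could probably be avoided with collections.defaultdict, which creates the empty list on first access. The sketch below is one possible way to do that, not a tested drop-in for the code above: it uses an in-memory SQLite table with made-up rows as a stand-in for the MySQL table so that it runs on its own, and the function names and the temporary output directory are assumptions for illustration.

```python
import os
import sqlite3
import tempfile
from collections import defaultdict


def group_rows_by_company(rows):
    """Map each company name to its list of 'name-ph-email' lines."""
    lines_by_company = defaultdict(list)
    for row in rows:
        # Column order assumed from the question:
        # c_emp_id, name, ph, email, company_name, country
        _, name, ph, email, company, _ = row
        lines_by_company[company].append("-".join([name, ph, email]))
    return lines_by_company


def write_company_files(lines_by_company, out_dir):
    """Write one '<company>.txt' file per company and return the paths."""
    paths = []
    for company, lines in lines_by_company.items():
        path = os.path.join(out_dir, "{}.txt".format(company))
        with open(path, "w") as fh:
            fh.write("\n".join(lines) + "\n")
        paths.append(path)
    return paths


# Stand-in for the MySQL table: an in-memory SQLite table with made-up rows.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE CompanyData "
    "(c_emp_id INTEGER, name TEXT, ph TEXT, email TEXT, "
    "company_name TEXT, country TEXT)"
)
connection.executemany(
    "INSERT INTO CompanyData VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Ann", "111", "ann@example.com", "AAA", "US"),
        (2, "Bob", "222", "bob@example.com", "BBB", "UK"),
        (3, "Cid", "333", "cid@example.com", "AAA", "US"),
    ],
)
rows = connection.execute(
    "SELECT * FROM CompanyData ORDER BY c_emp_id"
).fetchall()
connection.close()

out_dir = tempfile.mkdtemp()  # hypothetical output location
grouped = group_rows_by_company(rows)
written = write_company_files(grouped, out_dir)
```

With a real MySQL connection, rows would simply be the result of cursor.fetchall() from the code above, and the two functions could be used unchanged as long as the column order matches.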