I load a query into a pandas DataFrame, then build a result DataFrame called hindex
to write into a table in my database, as follows:
import sqlite3
import numpy as np
import pandas as pd
#access the database created
db = sqlite3.connect('test/publications')
df = pd.read_sql("select AuthorID as ID, citations from publications as p join authors_publications as a on p.ID=a.PaperID order by AuthorID, citations desc", db)
df2 = df.sort(['ID','Citations'],ascending=['Citations','ID'])
groups = df2.groupby('ID')
ind2 = np.array([np.arange(len(g))+1 for g in groups.groups.itervalues()])
df2['newindex'] = np.hstack(ind2)
df2['condition'] = df2['Citations']>=df2['newindex']
hindex = df2.groupby('ID').sum()['condition']
hindex.to_sql('authors_hindex', db, flavor='sqlite', if_exists='replace', index=True)
I have used to_sql before and it worked, so I don't know why it fails here. I get the following error:
AttributeError Traceback (most recent call last)
<ipython-input-4-0748af5dad1d> in <module>()
43
44 print hindex
---> 45 hindex.to_sql('authors_hindex', db, flavor='sqlite', if_exists='replace', index=True)
46
/usr/lib/python2.7/dist-packages/pandas/core/generic.pyc in __getattr__(self, name)
1813 return self[name]
1814 raise AttributeError("'%s' object has no attribute '%s'" %
-> 1815 (type(self).__name__, name))
1816
1817 def __setattr__(self, name, value):
AttributeError: 'Series' object has no attribute 'to_sql'
Answer 0 (score: 2)
Try this:
hindex = df2.groupby('ID').sum()[['condition']]
Indexing with double brackets [[ ]]
returns a DataFrame rather than a Series.
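Your original line, hindex = df2.groupby('ID').sum()['condition'],
returned a Series. A Series does have a to_sql method in current pandas,
but the traceback shows that the Series in your installed version does not, so an older pandas install is the likely (though unconfirmed) cause.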
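To see the difference between single and double brackets, here is a minimal sketch. The sample data and the in-memory database are made up for illustration and only stand in for df2 and test/publications from the question. It also assumes a recent pandas, so the flavor argument is left out because newer releases no longer accept it.

```python
import sqlite3
import pandas as pd

# Hypothetical sample data standing in for df2 from the question.
df2 = pd.DataFrame({
    'ID': [1, 1, 1, 2, 2],
    'condition': [True, True, False, True, False],
})

summed = df2.groupby('ID').sum()

as_series = summed['condition']    # single brackets select one column as a Series
as_frame = summed[['condition']]   # double brackets select a one-column DataFrame

print(type(as_series).__name__)
print(type(as_frame).__name__)

# In-memory SQLite database used only for this illustration.
db = sqlite3.connect(':memory:')
as_frame.to_sql('authors_hindex', db, if_exists='replace', index=True)

# Read the table back; the ID index was written as a regular column.
rows = db.execute('select * from authors_hindex order by ID').fetchall()
print(rows)
```

Calling to_sql on the DataFrame works whether or not the installed Series has that method, which is why the double-bracket form sidesteps the error.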