Problem converting a Python list for MongoDB import

Time: 2011-10-12 22:09:09

Tags: python sql-server mongodb

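I am new to Python, MongoDB, and SQL. I am using Eclipse 3.7.1 on Mac OS X 10.7. I connect to an MSSQL database with the pyodbc driver (and FreeTDS), and I am writing the script in Python 2.7. I want to query the MSSQL database and write the results to a Mongo database.

Where I am stumbling is that the query output is a Python list of tuples without field names, and I am looking for a way to convert this list of tuples into a form that MongoDB will import.

Current script: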

    ############
    # Query mssql
    import pyodbc
    import json
    url = 'DSN=myServer;UID=myUserName;PWD=myPassword;PORT=1433;DATABASE=mydb'
    pyodbccon = pyodbc.connect(url)
    cursor = pyodbccon.cursor()

    numusersQ = "SELECT COUNT(users.userid) FROM users"; 
    cursor.execute(numusersQ); numusers = cursor.fetchall()
    nummembsQ = "SELECT COUNT(memberships.membernumber) FROM memberships"; 
    cursor.execute(nummembsQ); nummembs = cursor.fetchall()
    userclientQ = "SELECT users.userid, users.client, users.industry FROM users"
    cursor.execute(userclientQ); userclient = cursor.fetchall()

    #format key value tuples
    output = []
    for row in userclient:
        tuplenew = {'userid': row[0], 'client': row[1], 'industry': row[2], 'numusers': numusers, 'nummembs': nummembs}
        output = [output, tuplenew]


    #output to mongo
    from pymongo.connection import Connection ;
    conmongo = Connection('localhost') 
    db = conmongo.mypymongodb

    for key, value in output():
        temp = [key,value]
        mongooutput.append(temp)

    db.pymongocollection.save(mongooutput)
    cursor = db.pymongocollection.find() 

    ############

The OUTPUT looks like:

[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[], {'numusers': [(103068, )], 'industry': 'npwild', 'client': 'cmh', 'userid': 1, 'nummembs': [(10519, )]}], {'numusers': [(103068, )], 'industry': 'npwild', 'client': 'cmh', 'userid': 2, 'nummembs': [(10519, )]}], {'numusers': [(103068, )], 'industry': 'npwild', 'client': 'cmh', 'userid': 3, 'nummembs': [(10519, )]}], {'numusers': [(103068, )], 'industry': 'npwild', 'client': 'cmh', 'userid': 5, 'nummembs': [(10519, )]}]

Error message:

Traceback (most recent call last):
  File "/Users/eclipse/workspace/pymongo/pymongopkg.py", line 34, in <module>
    for key, value in output():
TypeError: 'list' object is not callable

If anyone could suggest a function or point me toward a solution, that would be great.

2 Answers:

Answer 0: (score: 1)

This is how I got it working:

  1. pyodbc queries the MSSQL database
  2. Python's dict and zip convert the list of tuples into key-value dictionaries
  3. pymongo saves them as a collection

    #===============================================================================
    # 1. MSSQL QUERY WITH PYODBC
    #===============================================================================
    import pyodbc
    url = 'DSN=myserver;UID=myusername;PWD=mypassword;PORT=1433;DATABASE=mydatabase;'
    pyodbccon = pyodbc.connect(url); cursor = pyodbccon.cursor()
    
    userdataQ = "SELECT users.userid, users.client, users.industry FROM users"
    cursor.execute(userdataQ); userdata = cursor.fetchall()
    
    ##===============================================================================
    ## 2. convert tuple list to key-value dictionary
    ## 3. export to mongodb
    ##===============================================================================        
    from pymongo import Connection; conmongo = Connection('localhost') 
    db = conmongo.mypymongodb #mypymongodb = dbname
    headers = ['userid','client','industry'] 
    
    for tup in userdata:
        nextdoc = dict(zip(headers, tup))
        db.usercollection.save(nextdoc)
    print "usercollection in mypymongodb updated with " + str(db.usercollection.count()) + " docs"
    
  4. Output:

    > db.usercollection.find()
    { "_id" : ObjectId("4ef000000"), "industry" : "npwild", "client" : "cmh", "userid" : 1 }
    { "_id" : ObjectId("4ef000001"), "industry" : "npwild", "client" : "cmh", "userid" : 2 }
    { "_id" : ObjectId("4ef000002"), "industry" : "npwild", "client" : "cmh", "userid" : 3 }
    etc.
    

Thanks for your help! -d
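As a variation on step 2, the hard-coded headers list could be derived from the cursor instead. The sketch below assumes the DB-API convention that cursor.description is a sequence of tuples whose first item is the column name, which pyodbc follows. The helper name rows_to_docs and the sample data are hypothetical, and plain lists stand in for a real cursor so the snippet runs without a database.

```python
def rows_to_docs(description, rows):
    """Pair each row with the column names found in a DB-API description."""
    headers = [column[0] for column in description]
    return [dict(zip(headers, row)) for row in rows]


# Stand-in for cursor.description: DB-API uses 7-item tuples, name first.
fake_description = [
    ('userid', int, None, None, None, None, None),
    ('client', str, None, None, None, None, None),
    ('industry', str, None, None, None, None, None),
]
# Stand-in for cursor.fetchall() (made-up values).
fake_rows = [(1, 'cmh', 'npwild'), (2, 'cmh', 'npwild')]

docs = rows_to_docs(fake_description, fake_rows)
```

With a live connection, the call would presumably be rows_to_docs(cursor.description, cursor.fetchall()), and each resulting dictionary could be saved in the same loop as above.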

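Answer 1: (score: 0)

Disclaimer: I have no experience with MongoDB, but since it appears to import data in JSON format, it looks like you are almost there.

The problem I see lies in this part: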

    output = []
    for row in userclient:
        tuplenew = {'userid': row[0], 'client': row[1], 'industry': row[2], 'numusers': numusers, 'nummembs': nummembs}
        output = [output, tuplenew] # <--- problematic line

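What you are doing here is rebinding output to a new list that contains the old output list and the new dictionary, so the result nests one level deeper on every iteration. If you simply append the item instead, by replacing the problematic line with output.append(tuplenew), you will get a list of dictionaries, which is similar to the JSON format that MongoDB needs.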
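To make the difference concrete, here is a small sketch with made-up rows. The sample values and the names rows, nested, and flat are illustrative only and do not come from the question's database. It contrasts rebinding the list with appending to it:

```python
# Illustrative rows standing in for the pyodbc result (hypothetical values).
rows = [(1, 'cmh', 'npwild'), (2, 'cmh', 'npwild'), (3, 'cmh', 'npwild')]

# Rebinding: each pass wraps the previous list inside a new two-element list.
nested = []
for row in rows:
    doc = {'userid': row[0], 'client': row[1], 'industry': row[2]}
    nested = [nested, doc]

# Appending: the same list grows by one dictionary per row.
flat = []
for row in rows:
    doc = {'userid': row[0], 'client': row[1], 'industry': row[2]}
    flat.append(doc)
```

After the loops, nested always has exactly two elements (the previous nesting and the last dictionary), whereas flat holds one dictionary per row.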

By iterating over that list, you can then export the items to the database. I think it would look something like this:

    for item in output:
        mongooutput.append(item)

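That is all I can say for now, because a few things in your question are unclear: the traceback mentions a call that I cannot find; the mongooutput list is never defined; and your iteration over the output list uses parentheses, as if it were calling a function. All in all, the code is unlikely to run in this state.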
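Putting those observations together, one possible corrected version of the document-building step might look like the sketch below. It assumes each COUNT query returns a single row with a single column, so the scalar sits at [0][0], which is consistent with the [(103068, )] shown in the question's output. The helper name build_docs is hypothetical and the sample values are made up. The MongoDB write is left as a comment because it needs a running server.

```python
def build_docs(userclient, numusers, nummembs):
    """Turn (userid, client, industry) rows into a flat list of dictionaries.

    numusers and nummembs are fetchall() results of COUNT queries,
    assumed to look like [(103068,)], so the scalar sits at [0][0].
    """
    user_count = numusers[0][0]
    memb_count = nummembs[0][0]
    output = []
    for row in userclient:
        output.append({
            'userid': row[0],
            'client': row[1],
            'industry': row[2],
            'numusers': user_count,
            'nummembs': memb_count,
        })
    return output


# Made-up sample data in the same shape as the question's query results.
docs = build_docs(
    [(1, 'cmh', 'npwild'), (2, 'cmh', 'npwild')],
    [(103068,)],
    [(10519,)],
)

# With a MongoDB server available, each dictionary could then be saved
# one at a time using the collection from the question:
#     for doc in docs:
#         db.pymongocollection.save(doc)
```

Iterating with a single loop variable (for doc in docs, with no parentheses after the list name) avoids the 'list' object is not callable error, and no separate mongooutput list is needed.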