Non-bulk upload to the App Engine datastore

Time: 2013-07-02 02:17:10

Tags: google-app-engine

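I have a very large SQLite database. Very large for a SQLite database, anyway: 1.13 GB. I'm trying to bulk upload it, except that I can't dump the database to CSV. I've tried a number of times and have basically given up.

A more practical approach seems to be adapting the code I found at https://developers.google.com/appengine/docs/python/datastore/#Python_Datastore_API so that it uploads to the datastore one record at a time. Let it run overnight. That sort of thing.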

#-------------------------------------------------------------------------------
# Name:        App Data Store
# Purpose:      Move chess database to the app engine datastore
#               from c:\\PGNSDB
#               Comcast is the worst company in the world
# Created:     22/06/2013
# Copyright:   (c) Administrator 2013
#-------------------------------------------------------------------------------

from google.appengine.ext import db
from google.appengine.api import users
import sqlite3
import google
import logging

class game(db.Model):
        Event = db.StringProperty(required=False)
        Site = db.StringProperty(required=False)
        EventDate = db.StringProperty(required=False, indexed=True)
        Round = db.StringProperty(required=False)
        White = db.StringProperty(required=True, indexed=True)
        Black = db.StringProperty(required=True, indexed=True)
        Result = db.StringProperty(required=True,
                            choices=set(["1-0","0-1","1/2-1/2"]), indexed=True)
        ECO = db.StringProperty(required=False)
        WhiteELO = db.StringProperty(required=False)
        BlackELO = db.StringProperty(required=False)
        PlyCount = db.StringProperty(required=False)
        PGN = db.StringProperty(required=True)
        email = db.StringProperty()

def main():
    logging.info('Beginning upload')
    conn = sqlite3.connect('C:\\PGNSDB')
    c = conn.cursor()
    games = c.execute("select Event, Site, Date, Round, White, Black, Result, ECO, WhiteELO, BlackELO, PGN from games")

    logging.info('Local database is now open on C drive.')

    for agame in games:
        logging.info('Uploading a PGN.')
        thisgame = game(Event = agame[0],
                        Site = agame[1],
                        EventDate = agame[2],
                        Round = agame[3],
                        White = agame[4],
                        Black = agame[5],
                        Result = agame[6],
                        ECO = agame[7],
                        WhiteELO = agame[8],
                        BlackELO = agame[9],
                        PGN = agame[10],
                        #email = users.get_current_user().email())
                        email = "xxx@gmail.com")
        logging.info('About to put.')
        thisgame.put()

if __name__ == '__main__':
    main()

So I ran it in the Google App Engine Launcher with the following app.yaml:

application: pgnhelper
version: 1
runtime: python27
api_version: 1
threadsafe: true

handlers:
- url: /
  script: home.app

- url: /index\.html
  script: home.app

- url: /stylesheets
  static_dir: stylesheets

- url: /(.*\.(gif|png|jpg))
  static_files: static/\1
  upload: static/(.*\.(gif|png|jpg))

- url: /admin/.*
  script: admin.app
  login: admin

- url: /.*
  script: not_found.app

builtins:
- remote_api: on

...and it produced the following output:

2013-07-01 21:40:59 Running command: "['C:\\Python27\\pythonw.exe', 'C:\\Program Files (x86)\\Google\\google_appengine\\dev_appserver.py', '--skip_sdk_update_check=yes', '--port=8080', '--admin_port=8000', u'C:\\Code\\uploadpgns']"
INFO     2013-07-01 21:41:06,479 devappserver2.py:528] Skipping SDK update check.
WARNING  2013-07-01 21:41:06,530 api_server.py:314] Could not initialize images API; you are likely missing the Python "PIL" module.
WARNING  2013-07-01 21:41:06,546 simple_search_stub.py:955] Could not read search indexes from c:\users\jj\appdata\local\temp\appengine.pgnhelper\search_indexes
INFO     2013-07-01 21:41:06,612 api_server.py:138] Starting API server at: http : //XXX:58254
INFO     2013-07-01 21:41:06,621 dispatcher.py:164] Starting server "default" running at: http : //XXX:8080
INFO     2013-07-01 21:41:06,627 admin_server.py:117] Starting admin server at: http : //XXX:8000
INFO     2013-07-01 21:46:46,724 api_server.py:509] Applying all pending transactions and saving the datastore
INFO     2013-07-01 21:46:46,724 api_server.py:512] Saving search indexes
2013-07-01 21:46:46 (Process exited with code 0)

2013-07-01 21:53:53 Running command: "['C:\\Python27\\pythonw.exe', 'C:\\Program Files (x86)\\Google\\google_appengine\\dev_appserver.py', '--skip_sdk_update_check=yes', '--port=8080', '--admin_port=8000', u'C:\\Code\\uploadpgns']"
INFO     2013-07-01 21:53:54,956 devappserver2.py:528] Skipping SDK update check.
WARNING  2013-07-01 21:53:54,963 api_server.py:314] Could not initialize images API; you are likely missing the Python "PIL" module.
INFO     2013-07-01 21:53:54,974 api_server.py:138] Starting API server at: http : //XXX:58311
INFO     2013-07-01 21:53:54,980 dispatcher.py:164] Starting server "default" running at: http : //XXX:8080
INFO     2013-07-01 21:53:54,984 admin_server.py:117] Starting admin server at: http : //XXX:8000
ERROR    2013-07-02 01:54:46,207 wsgi.py:219] 

Traceback (most recent call last):

  File "C:\Program Files (x86)\Google\google_appengine\google\appengine\runtime\wsgi.py", line 196, in Handle

    handler = _config_handle.add_wsgi_middleware(self._LoadHandler())

  File "C:\Program Files (x86)\Google\google_appengine\google\appengine\runtime\wsgi.py", line 255, in _LoadHandler

    handler = __import__(path[0])

ImportError: No module named home

INFO     2013-07-01 21:54:46,223 server.py:593] default: "GET / HTTP/1.1" 500 -
ERROR    2013-07-02 01:54:46,325 wsgi.py:219] 

Traceback (most recent call last):

  File "C:\Program Files (x86)\Google\google_appengine\google\appengine\runtime\wsgi.py", line 196, in Handle

    handler = _config_handle.add_wsgi_middleware(self._LoadHandler())

  File "C:\Program Files (x86)\Google\google_appengine\google\appengine\runtime\wsgi.py", line 255, in _LoadHandler

    handler = __import__(path[0])

ImportError: No module named not_found

INFO     2013-07-01 21:54:46,332 server.py:593] default: "GET /favicon.ico HTTP/1.1" 500 -

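...where XXX stands for a link to localhost... I think I'm missing something very basic about this WSGI business. And a module named not_found? It can't be found!

The examples I've found don't mention the gateway interface. How do I incorporate it?

Thanks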

1 answer:

Answer 0 (score: 1)

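You don't need to use the development app server (or a local app.yaml, etc.) at all.

Put your code in a module (no handlers, etc.) and import it into remote_api_shell. https://developers.google.com/appengine/articles/remote_api

You can then import anything you want, because you aren't running in the sandbox; you're talking directly to the App Engine datastore.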
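A rough sketch of what such a handler-free module might look like is below. The file name upload_games.py, the function names read_games and upload, and the choice to pass the model class in as a parameter are all assumptions made for illustration; only the table and column names come from the question.

```python
# upload_games.py (hypothetical module name): no handlers, no app.yaml.
import sqlite3

# Columns selected in the question's query, in the same order.
COLUMNS = ("Event", "Site", "Date", "Round", "White", "Black",
           "Result", "ECO", "WhiteELO", "BlackELO", "PGN")


def read_games(db_path):
    """Yield one dict per row of the local `games` table."""
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute("select %s from games" % ", ".join(COLUMNS))
        for row in cursor:
            record = dict(zip(COLUMNS, row))
            # The model in the question calls this property EventDate.
            record["EventDate"] = record.pop("Date")
            yield record
    finally:
        conn.close()


def upload(db_path, model_class):
    """Build one entity per row, put() it, and return the number stored."""
    count = 0
    for record in read_games(db_path):
        model_class(**record).put()
        count += 1
    return count
```

Inside the shell, assuming the game model from the question is importable, this would be used along the lines of `import upload_games` followed by `upload_games.upload('C:\\PGNSDB', game)`. Passing the model class in keeps the module free of any App Engine import, so the SQLite-reading part can be tried locally first.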

You can also speed things up by batching the game objects. For example, collect them in a list and, every 100 objects, call db.put(the_list).
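The batching idea could be sketched roughly as follows. The helper name put_in_batches and its parameters are hypothetical; the put argument is meant to be db.put when run in the remote shell, and is passed in here only so that the helper has no App Engine dependency of its own.

```python
def put_in_batches(entities, put, batch_size=100):
    """Call put() on lists of up to batch_size entities; return the total stored.

    `put` is assumed to accept a list, as db.put does for model instances.
    """
    batch = []
    total = 0
    for entity in entities:
        batch.append(entity)
        if len(batch) >= batch_size:
            put(batch)
            total += len(batch)
            batch = []  # start a fresh list for the next batch
    if batch:
        # Flush whatever is left over after the last full batch.
        put(batch)
        total += len(batch)
    return total
```

With the question's code, the loop body would build each game instance without calling put() on it, and the whole iterable would be handed to `put_in_batches(instances, db.put)`. Each batch then costs one round trip to the datastore instead of one per record.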