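Reading an appengine backup_info file raises EOFError

Time: 2014-01-10 02:30:00

Tags: python google-app-engine backup google-cloud-storage gsutil

I am trying to inspect my appengine backup files so that I can investigate a data corruption issue. I used gsutil to locate and download the file: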

gsutil ls -l gs://my_backup/ > my_backup.txt
gsutil cp gs://my_backup/LongAlphaString.Mymodel.backup_info file://1.backup_info

I then wrote a small Python program that tries to read the file and parse it using the appengine libraries.

#!/usr/bin/python

APPENGINE_PATH='/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/'
ADDITIONAL_LIBS = ['lib/yaml/lib']

import sys
sys.path.append(APPENGINE_PATH)
for l in ADDITIONAL_LIBS:
  sys.path.append(APPENGINE_PATH+l)

import cStringIO
import logging
from google.appengine.api import datastore  # used by parse_backup_info_file
from google.appengine.api.files import records

def parse_backup_info_file(content):
  """Returns entities iterator from a backup_info file content."""
  reader = records.RecordsReader(cStringIO.StringIO(content))
  version = reader.read()
  if version != '1':
    raise IOError('Unsupported version')
  return (datastore.Entity.FromPb(record) for record in reader)


INPUT_FILE_NAME='1.backup_info'

f=open(INPUT_FILE_NAME, 'rb')
f.seek(0)
content=f.read()
records = parse_backup_info_file(content)
for r in records:
  logging.info(r)

f.close()

The code for parse_backup_info_file was copied from backup_handler.py.

When I run the program, I get the following output:

./view_record.py 
Traceback (most recent call last):
  File "./view_record.py", line 30, in <module>
    records = parse_backup_info_file(content)
  File "./view_record.py", line 19, in parse_backup_info_file
    version = reader.read()
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/files/records.py", line 335, in read
    (chunk, record_type) = self.__try_read_record()
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/files/records.py", line 307, in __try_read_record
    (length, len(data)))
EOFError: Not enough data read. Expected: 24898 but got 2112

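I tried half a dozen different backup_info files, and they all show the same error (with different numbers). I noticed that they all had the same expected length, but at that point I was looking at different versions of the same model; it does not hold when I look at the backup files of other modules.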

EOFError: Not enough data read. Expected: 24932 but got 911
EOFError: Not enough data read. Expected: 25409 but got 2220

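Is there anything obviously wrong with my approach?

I suppose the other possibility is that the appengine backup utility is not creating valid backup files. Any other suggestions would be very welcome. Thanks in advance.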

1 answer:

Answer 0 (score: 2)

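When you run an AppEngine datastore backup, several metadata files are created:

LongAlphaString.backup_info is created once. It contains metadata about all of the entity kinds and backup files created in the datastore backup.

LongAlphaString.[EntityType].backup_info is created once per entity kind. It contains metadata about the specific backup files created for [EntityType], along with schema information for [EntityType].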
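To tell the two file types apart before choosing a parser, a small helper along these lines may be useful. It is only a sketch based on the two naming patterns described above; the function name `classify_backup_info` and the assumption that the backup ID itself contains no dots are mine, not part of the SDK.

```python
import posixpath
import re

# Assumed naming convention, taken from the two patterns described above:
#   <backup_id>.backup_info          -> one per backup
#   <backup_id>.<kind>.backup_info   -> one per entity kind
# Assumption: the backup ID contains no dots.
_BACKUP_INFO_RE = re.compile(
    r'^(?P<backup_id>[^.]+)(?:\.(?P<kind>.+))?\.backup_info$')


def classify_backup_info(path):
    """Returns (backup_id, kind); kind is None for the top-level file."""
    name = posixpath.basename(path)  # also handles gs://bucket/name paths
    match = _BACKUP_INFO_RE.match(name)
    if match is None:
        raise ValueError('not a backup_info file: %r' % path)
    return match.group('backup_id'), match.group('kind')
```

A file for which `kind` is `None` would go to the records-based parser, and any other file to the per-kind parser.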

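Your code works for reading the contents of LongAlphaString.backup_info, but it looks like you are trying to read the contents of LongAlphaString.[EntityType].backup_info. Here is a script that prints the contents of each file type in a human-readable format: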

import cStringIO
import os
import sys

sys.path.append('/usr/local/google_appengine')
from google.appengine.api import datastore
from google.appengine.api.files import records
from google.appengine.ext.datastore_admin import backup_pb2

ALL_BACKUP_INFO = 'long_string.backup_info'
ENTITY_KINDS = ['long_string.entity_kind.backup_info']


def parse_backup_info_file(content):
    """Returns entities iterator from a backup_info file content."""
    reader = records.RecordsReader(cStringIO.StringIO(content))
    version = reader.read()
    if version != '1':
        raise IOError('Unsupported version')
    return (datastore.Entity.FromPb(record) for record in reader)


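# The top-level file is read with RecordsReader, as in the question.
print "*****" + ALL_BACKUP_INFO + "*****"
with open(ALL_BACKUP_INFO, 'rb') as myfile:  # binary mode: the content is not text
    parsed = parse_backup_info_file(myfile.read())
    for record in parsed:
        print record

# Each per-kind file is parsed as a backup_pb2.Backup protocol buffer instead.
for entity_kind in ENTITY_KINDS:
    print os.linesep + "*****" + entity_kind + "*****"
    with open(entity_kind, 'rb') as myfile:
        backup = backup_pb2.Backup()
        backup.ParseFromString(myfile.read())
        print backup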
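
As background on where the "Expected: ... but got ..." numbers in the question probably come from, here is a rough sketch. It assumes the records format starts each record with a 7-byte LevelDB-log-style header (masked CRC, payload length, record type); that layout and the helper name `peek_record_header` are assumptions for illustration, not something confirmed by the SDK code quoted in this thread. If a per-kind file is fed to a reader like this, its first bytes would be decoded as a header and produce an arbitrary length, which would be consistent with the EOFError.

```python
import struct

# Assumption: little-endian uint32 masked CRC, uint16 payload length,
# uint8 record type, i.e. 7 bytes in total.
HEADER_FORMAT = '<IHB'
HEADER_LENGTH = struct.calcsize(HEADER_FORMAT)


def peek_record_header(data):
    """Decodes the first bytes of `data` as a record header.

    Returns (declared_length, available_bytes, record_type). The CRC is
    not verified; this only shows how a length mismatch can arise.
    """
    if len(data) < HEADER_LENGTH:
        raise ValueError('need at least %d bytes' % HEADER_LENGTH)
    _crc, length, record_type = struct.unpack(
        HEADER_FORMAT, data[:HEADER_LENGTH])
    return length, len(data) - HEADER_LENGTH, record_type
```

Whenever the declared length is larger than the number of bytes that follow the header, a strict reader has no choice but to give up with an end-of-file error.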