Reading a protobuf in Python and extracting data

Asked: 2013-03-02 18:51:25

Tags: python protocol-buffers

I am trying to work with data from spinn3r. The data is returned as a protobuf. In Python, when I print the protobuf object, I get this:

print data
source {
  link {
    href: ""
    resource: ""
  }
  canonical_link {
    href: "http://twitter.com/_PattiShaw/statuses/28167079857225728"
    resource: ""
  }
  title: ""
  hashcode: ""
  lang {
    code: "en"
    probability: -1.0
  }
  generator: ""
  description: ""
  last_posted: ""
  last_published: ""
  date_found: ""
  publisher_type: "MICROBLOG"
}
feed {
  link {
    href: ""
    resource: ""
  }
  canonical_link {
    href: ""
    resource: ""
  }
  title: ""
  hashcode: ""
  lang {
    code: "en"
    probability: -1.0
  }
  generator: ""
  description: ""
  last_posted: ""
  last_published: ""
  date_found: ""
  etag: ""
  channel_link {
    href: ""
    resource: ""
  }
}
feed_entry {
  link {
    href: "http://twitter.com/_PattiShaw/statuses/28167079857225728"
    resource: "http://twitter.com/_PattiShaw/statuses/28167079857225728"
  }
  canonical_link {
    href: "http://twitter.com/_PattiShaw/statuses/28167079857225728"
    resource: "http://twitter.com/_PattiShaw/statuses/28167079857225728"
  }
  title: "The value of a man resides in what he gives and not in what he is capable of receiving. ~ Albert Einstein"
  hashcode: "8WhKLK9Lyng"
  lang {
    code: "en"
    probability: -1.0
  }
  author {
    name: "_PattiShaw (Patti Shaw)"
    email: ""
    link {
      href: "http://twitter.com/_PattiShaw"
    }
  }
  spam_probability: 0.0
  last_published: "2011-01-20T19:08:49Z"
  date_found: "2011-01-20T19:08:49Z"
  identifier: 1295550574016007548
  content {
    mime_type: "text/html"
    data: "x\332M\214\301\r\2000\014\304V\271\t`\201\n\211\007\033\260@B\003\215TR\324\226\362cv\020/\276\266\3459\010\032\305S\220V\020v2d)\352\245@\rW\240\212\267\330\264\275\300\361@\346]\317\003,\325\277\327\202\205\016\342\370m\262,\242Mm\353pc\214,\271bR+U\324\036\200\236&\363"
    encoding: "zlib"
  }
}
permalink_entry {
  link {
    href: "http://twitter.com/_PattiShaw/statuses/28167079857225728"
    resource: "http://twitter.com/_PattiShaw/statuses/28167079857225728"
  }
  canonical_link {
    href: "http://twitter.com/_PattiShaw/statuses/28167079857225728"
    resource: "http://twitter.com/_PattiShaw/statuses/28167079857225728"
  }
  title: "The value of a man resides in what he gives and not in what he is capable of receiving. ~ Albert Einstein"
  hashcode: "8WhKLK9Lyng"
  lang {
    code: "en"
    probability: -1.0
  }
  author {
    name: "_PattiShaw (Patti Shaw)"
    email: ""
    link {
      href: "http://twitter.com/_PattiShaw"
    }
  }
  spam_probability: 0.0
  last_published: "2011-01-20T19:08:49Z"
  date_found: "2011-01-20T19:09:34Z"
  identifier: 1295550574016007548
  content {
    mime_type: "text/html"
    data: ""
  }
  content_extract {
    mime_type: "text/html"
    data: ""
  }
  generator: ""
}

I want to extract the author's name from the "feed_entry" object. I tried this:

print data.feed_entry.author.name

I get this error:

AttributeError: 'RepeatedCompositeFieldContainer' object has no attribute 'name'

I tried printing the author object to see what it contains. This is what I got:

print u.feed_entry.author
[<spinn3rApi_pb2.Author object at 0x362e6d0>]

How can I extract the author's name?

1 answer:

Answer 0: (score: 4)

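It looks like u.feed_entry.author is a list: the RepeatedCompositeFieldContainer named in the error is the container protobuf uses for a repeated message field, and it behaves like a list. Note the square brackets: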

[<spinn3rApi_pb2.Author object at 0x362e6d0>]

This should solve your problem (assuming there is at least one author):

print data.feed_entry.author[0].name
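
If an entry might have no authors, or more than one, indexing with [0] can raise an IndexError or silently drop names. Below is a minimal sketch of a more defensive approach. The Author and FeedEntry classes are hypothetical stand-ins for the generated spinn3rApi_pb2 types (which are not available here), and the helper names are made up for illustration. It relies only on the repeated field supporting len(), indexing, and iteration, and it uses the print() call form rather than the print statement used above.

```python
class Author(object):
    """Hypothetical stand-in for spinn3rApi_pb2.Author."""
    def __init__(self, name):
        self.name = name


class FeedEntry(object):
    """Hypothetical stand-in for the feed_entry message."""
    def __init__(self, authors):
        # Assumption: the real repeated field behaves like this list
        # for len(), indexing, and iteration.
        self.author = list(authors)


def author_names(entry):
    """Return the names of all authors on the entry (possibly empty)."""
    return [a.name for a in entry.author]


def first_author_name(entry, default=None):
    """Return the first author's name, or `default` if there is none."""
    if len(entry.author) > 0:
        return entry.author[0].name
    return default


entry = FeedEntry([Author("_PattiShaw (Patti Shaw)")])
print(first_author_name(entry))
print(author_names(entry))
print(first_author_name(FeedEntry([])))
```

With the real message, the same two helpers would be called as first_author_name(data.feed_entry), provided feed_entry.author is the repeated field shown in the question.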