Question

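I'm learning Python so that I can do data mining on social media data. I've written code that fetches information about the most-liked Facebook pages, and I've stored that information in a text file named "pages.txt". Here is a snapshot of the file's contents: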

{
 "paging": {
  "next": "https://graph.facebook.com/search?limit=1&type=page&q=%26&locale=ar_AR&access_token=CAACEdEose0cBAFxVPV6lJ43O6MABoxVrrHlb01rBNmpVf8ZCK0M1QlsEJ6yRZBWlzjf0vA1eX6YdwNHF2TLZBsECdg6Q8mI3BH3n5QTMsi55KtkCtOCd36AVxjZA7PXBL3mZA6FsLZCNp9IZCItCI4YVhCeikubnwCLpE0nSTOcKXR8DUzcZA4qZCBW92yoCDFk2z0eZBNSUU6lgZDZD&offset=1&__after_id=6127898346"
 }, 
 "data": [
  {
   "category": "\u0627\u0644\u062a\u0639\u0644\u064a\u0645", 
   "name": "The London School of Economics and Political Science - LSE", 
   "category_list": [
    {
     "id": "108051929285833", 
     "name": "\u0627\u0644\u0643\u0644\u064a\u0629 \u0648\u0627\u0644\u062c\u0627\u0645\u0639\u0629"
    }, 
    {
     "id": "187751327923426", 
     "name": "\u0645\u0646\u0638\u0645\u0629 \u062a\u0639\u0644\u064a\u0645\u064a\u0629"
    }
   ], 
   "id": "6127898346"
  }
 ]
}

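Now I'd like to know how to get a specific field out of it (for example "id": "6127898346"). I've tried a lot of things but can't find a way. This is what I've written so far: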

ins = open( "pages.txt", "r" )
values = []
for line in ins:   
    values.append(line) 

ins.close()
print values

But this just gives me every whole line of the file. Any help?

Answer 1

This is JSON. You can load the data with the json module:

import json
# your_file is the path to the file, e.g. "pages.txt"
with open(your_file) as f:
    data = json.load(f)
    # manipulate your data

data will be ordinary Python data structures, such as nested lists, dicts, strings and ints, so you can manipulate them in the usual way.
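
As a minimal sketch of what that looks like for the file in the question: the inline `sample` string below is a trimmed copy of the JSON shown above (the "paging" block and the Arabic names are left out for brevity), standing in for the contents of pages.txt so the snippet runs on its own. The variable names are only illustrative.

```python
import json

# Trimmed copy of the JSON from the question; with the real file you would
# use json.load() on the opened pages.txt instead of json.loads() on a string.
sample = '''
{
 "data": [
  {
   "name": "The London School of Economics and Political Science - LSE",
   "category_list": [
    {"id": "108051929285833"},
    {"id": "187751327923426"}
   ],
   "id": "6127898346"
  }
 ]
}
'''

data = json.loads(sample)

# "data" holds a list of page dicts; take the first page and read its "id".
page = data["data"][0]
page_id = page["id"]
print(page_id)

# The nested category ids are reached the same way, one level deeper.
category_ids = [category["id"] for category in page["category_list"]]
print(category_ids)
```

If the file contains several pages, looping over data["data"] instead of indexing [0] should give one id per page.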

Answer 2

Try this:

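INFILE = open("pages.txt","r")
file = INFILE.readlines()
INFILE.close()

listA = []
ID_List = []
# Keep the lines whose characters 6-7 spell "id" (this depends on the file's indentation)
for line in file:
    if (line[6:8] =="id"):
        line = line.strip()
        listA.append(line)
# Slice the value out of each kept line, skipping duplicates
for id in listA:
    item = id[7:-2]
    item = item.strip()
    if item not in ID_List:
        ID_List.append(item)
print "List of all IDS:",ID_List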

After running it on the file, I get:

>>> 
List of all IDS: ['108051929285833', '187751327923426', '612789834']
>>>
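
Note that the last entry in that output, '612789834', is one digit short of the "6127898346" in the file: the slice [7:-2] expects every line to end in a quote plus a comma, and the final "id" line has no trailing comma. For comparison, here is a rough sketch that collects the same ids by walking the parsed JSON instead of slicing lines. The helper name `collect_ids` and the trimmed `sample` string are assumptions made for illustration, not part of the original answer.

```python
import json

def collect_ids(node, found=None):
    """Recursively gather every value stored under an "id" key, without duplicates."""
    if found is None:
        found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "id":
                if value not in found:
                    found.append(value)
            else:
                collect_ids(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_ids(item, found)
    return found

# Trimmed stand-in for the contents of pages.txt from the question.
sample = '''
{
 "data": [
  {
   "category_list": [
    {"id": "108051929285833"},
    {"id": "187751327923426"}
   ],
   "id": "6127898346"
  }
 ]
}
'''

all_ids = collect_ids(json.loads(sample))
print(all_ids)
```

Because this works on the parsed structure, it should not be affected by how the file is indented or whether a line ends with a comma.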

Getting a specific field from a text file
