How do I access data nested inside data in Python?

Time: 2018-03-23 04:37:40

Tags: python mongodb python-2.7 analysis

I have a Python dict in this format:

test_scr = { 
    "visited_pages" : [ { 
          "visited_page_id" : { 
              "$oid" : "57d01dd3f1a475f7307b23d9" 
          }, "url" : "google.com", 
         "page_height" : "3986", 
         "visited_on" : { 
             "$date" : 1473256915000 
          }, "visited_page_clicks" : [ { 
                "x" : "887", 
                "y" : "35", 
                "page_height" : "3986", 
                "created" : { 
                    "$date" : 1473256920000 
                 } 
            } ], 
         "total_clicks" : 1, 
         "total_time_spent_in_minutes" : "0.10", 
         "total_mouse_moves" : 0 
      }, { 
          "visited_page_id" : { 
              "$oid" : "57d01dddf1a475a6377b23d4" 
          }, "url" : "google.com", 
         "page_height" : "3088", 
         "visited_on" : { 
             "$date" : 1473256925000 
          }, "visited_page_clicks" : [ {
                "x" : "888", 
                "y" : "381", 
                "page_height" : "3088", 
                "created" : { 
                    "$date" : 1473256934000 
                 } 
             },{
                "x" : "888", 
                "y" : "381", 
                "page_height" : "3088",
                "created" : { 
                    "$date" : 1473256935000 
                 } 
             },{ 
                 "x" : "875", 
                 "y" : "364",
                 "page_height" : "3088",
                  "created" : { 
                     "$date" : 1473256936000 
                 } 
             },{ 
                 "x" : "875",
                 "y" : "364",
                 "page_height" : "3088",
                 "created" : { 
                      "$date" : 1473256936000 
                  } 
             }, {
                 "x" : "875", 
                 "y" : "364",
                 "page_height" : "3088",
                 "created" : {
                      "$date" : 1473256937000 
                  } 
             },{ 
                 "x" : "1347",
                 "y" : "445", 
                 "page_height" : "3088", 
                 "created" : { 
                      "$date" : 1473256942000 
                  } 
             },{ 
                  "x" : "259", 
                  "y" : "798", 
                  "page_height" : "3018", 
                  "created" : { 
                       "$date" : 1473257244000 
                  } 
             },{ 
                  "x" : "400", 
                  "y" : "98", 
                  "page_height" : "3088",
                  "created" : { 
                       "$date" : 1473257785000 
                  } 
             }],"total_clicks" : 8, 
                "total_time_spent_in_minutes" : "14.26", 
                "total_mouse_moves" : 0 
         }, { 
            "visited_page_id" : { 
                    "$oid" : "57d0213ff1a475a6377b23d5" 
            },"url" : "google.com",
            "page_height" : "3088",
            "visited_on" : { 
                    "$date" : 1473257791000 
            },"visited_page_clicks" : [ { 
                  "x" : "805", 
                  "y" : "425", 
                  "page_height" : "3088", 
                  "created" : { 
                        "$date" : 1473257826000 
                  } 
              }, {
                  "x" : "523", 
                  "y" : "100", 
                  "page_height" : "3088", 
                  "created" : { 
                        "$date" : 1473257833000 
                  } 
            } ], "total_clicks" : 2, 
            "total_time_spent_in_minutes" : "0.47", 
            "total_mouse_moves" : 0 
        } ]
    }

I need to extract the x and y values from the dict and store them in matrix form in a data frame. The output should look like this:

X       Y
887     35
888     381
888     381
875     364
.        .
.        .
.        .

How can I do this?

2 Answers:

Answer 0 (score: 1)

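The dictionary in your post is very badly formatted, but I wrote a quick little script that loops through it and collects the x and y values.
You can access dictionary values with the dictionary["key"] syntax. It returns the value or object stored under that key.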
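
As a minimal sketch of that lookup syntax, the snippet below chains key and index access on a small nested structure. The `sample` dict is a hypothetical, trimmed-down stand-in with the same shape as `test_scr`, not the full data from the question.

```python
# Hypothetical, trimmed-down sample with the same shape as test_scr
sample = {
    "visited_pages": [
        {
            "url": "google.com",
            "visited_page_clicks": [{"x": "887", "y": "35"}],
        }
    ]
}

# dict["key"] returns the stored value; list[index] picks one element
first_page = sample["visited_pages"][0]
first_click = first_page["visited_page_clicks"][0]
print("{} {}".format(first_click["x"], first_click["y"]))

# dict.get() returns a default instead of raising KeyError for a missing key
print(first_page.get("total_clicks", 0))
```

The script below applies the same lookups to every page and every click in the full dictionary.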

# Two lists to store the x and y values in    
x = []
y = []

# Store the visited_pages object in a list
visited_pages = test_scr["visited_pages"]

# Loop through all the pages
for page in visited_pages:
    page_clicks = page["visited_page_clicks"]
    # Loop through all the clicks for the page
    for click in page_clicks:
        # Add the x and y values to the lists
        x.append(click["x"])
        y.append(click["y"])
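
The values collected this way are still strings, because that is how the source data stores them. A possible follow-up step, sketched here under the assumption that integer coordinates are wanted, converts them and pairs each x with its y. The short `x` and `y` lists are stand-ins for the lists built by the loop above.

```python
# Stand-ins for the x and y lists built by the loop above
x = ["887", "888", "888", "875"]
y = ["35", "381", "381", "364"]

# The source data stores coordinates as strings; convert them to integers
x_int = [int(value) for value in x]
y_int = [int(value) for value in y]

# zip() pairs the i-th x with the i-th y, giving one (x, y) row per click
rows = list(zip(x_int, y_int))

# Print the rows in the two-column layout asked for in the question
print("X\tY")
for row_x, row_y in rows:
    print("{}\t{}".format(row_x, row_y))
```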

Answer 1 (score: 0)

You can do this with a list comprehension:

coords = [[click['x'],click['y']] for page in test_scr['visited_pages'] for click in page['visited_page_clicks']]

You can convert this into a data frame using various techniques, or reshape it into whatever format you need.

Also, please format your code properly.

Output:

[['887', '35'],
['888', '381'],
['888', '381'],
['875', '364'],
['875', '364'],
['875', '364'],
['1347', '445'],
['259', '798'],
['400', '98'],
['805', '425'],
['523', '100']]
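
One way the data frame conversion mentioned above might look, assuming pandas is installed, is sketched below. The column names `X` and `Y` come from the desired output in the question; the short `coords` list holds only the first four pairs from the output above, and the integer cast is an assumption about what the asker wants.

```python
import pandas as pd

# First four pairs from the output above, standing in for the full coords list
coords = [["887", "35"], ["888", "381"], ["888", "381"], ["875", "364"]]

# Each inner list becomes one row; columns= names the two columns
df = pd.DataFrame(coords, columns=["X", "Y"])

# The values are strings in the source data, so cast them to integers
df = df.astype(int)

print(df)
```

Passing the full `coords` list built by the comprehension, instead of the shortened one, should give one row per click across all pages.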