Question

I have a large JSON file that looks like this:

{
  "data" : [
    {"album": "I Look to You", "writer": "Leon Russell", "artist": "Whitney Houston", "year": "2009", "title": "\"A Song for You\""},
    {"album": "Michael Zager Band", "writer": "Michael Zager", "artist": "Whitney Houston", "year": "1983", "title": "\"Life's a Party\""},
    {"album": "Paul Jabara & Friends", "writer": "Paul Jabara", "artist": "Whitney Houston", "year": "1978", "title": "\"Eternal Love\""},
    ...

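...and I'm trying to build a very simple API to fetch different values. Right now I can easily request localhost/data/1/title, for example, to get the first title value, but I'd like to get all the titles through localhost/titles or something similar. How can I modify the do_GET method here to add that kind of functionality?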

def do_GET(self):
    self.send_response(200)
    self.send_header('Content-type', 'application/json')
    self.end_headers()

    path = self.path[1:]
    components = string.split(path, '/')

    node = content
    for component in components:
        if len(component) == 0 or component == "favicon.ico":
            continue

        if type(node) == dict:
            node = node[component]

        elif type(node) == list:
            node = node[int(component)]

    self.wfile.write(json.dumps(node))

    return

Answer 1

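Here is an answer that fits your current dynamic URL format without requiring major architectural changes.

Here I use "all" in place of a numeric index in your URL pattern, since I feel it better represents the data/[item(s)]/[attribute] paradigm.

Here are some URLs and sample output: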

/data/1/album => "Michael Zager Band"
/data/0/title => "\"A Song for You\""
/data/all/title => ["\"A Song for You\"", "\"Life's a Party\"", "\"Eternal Love\""]
/data/all/year => ["2009", "1983", "1978"]
/data/1 => {"album": "Michael Zager Band", "title": "\"Life's a Party\"", "writer": "Michael Zager", "year": "1983", "artist": "Whitney Houston"}

PS - I changed the structure to use recursion, which I think better follows what you are trying to do.

def do_GET(self):
    self.send_response(200)
    self.send_header('Content-type', 'application/json')
    self.end_headers()

    path = self.path[1:]
    # Drop empty segments so "/" and trailing slashes are ignored
    components = [c for c in path.split('/') if c]

    node = parse_node(content, components)

    # wfile expects bytes on Python 3; encoding is harmless on Python 2
    self.wfile.write(json.dumps(node).encode('utf-8'))

    return

def parse_node(node, components):
    # For a valid node and component list:
    if node and len(components) and components[0] != "favicon.ico":
        # Dicts will return parse_node of the top-level node component found, 
        # reducing the component list by 1
        if type(node) == dict:
            return parse_node(node.get(components[0], None), components[1:])

        elif type(node) == list:
            # A list with an "all" argument will return a full list of sub-nodes matching the rest of the URL criteria
            if components[0] == "all":
                return [parse_node(n, components[1:]) for n in node]
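            # A normal list node request will work as it did previously,
            # except that a non-numeric or out-of-range index counts as a bad URL
            else:
                try:
                    return parse_node(node[int(components[0])], components[1:])
                except (ValueError, IndexError):
                    return None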
    else:
        return node

    # Handle bad URL
    return None
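To try the recursive lookup without starting a server, something like the following may help. It is only a sketch, assuming Python 3: `lookup` is a hypothetical compact variant of `parse_node` written for this illustration, `resolve` is a made-up helper name, and `content` holds the three sample rows from the question.

```python
import json

# Sample rows copied from the question; `content` is the name the
# question's handler uses for the parsed JSON file.
content = {
    "data": [
        {"album": "I Look to You", "writer": "Leon Russell", "artist": "Whitney Houston", "year": "2009", "title": "\"A Song for You\""},
        {"album": "Michael Zager Band", "writer": "Michael Zager", "artist": "Whitney Houston", "year": "1983", "title": "\"Life's a Party\""},
        {"album": "Paul Jabara & Friends", "writer": "Paul Jabara", "artist": "Whitney Houston", "year": "1978", "title": "\"Eternal Love\""},
    ]
}

def lookup(node, components):
    """Hypothetical variant of parse_node: walk `node` along `components`."""
    if not components:
        return node
    head, rest = components[0], components[1:]
    if isinstance(node, dict):
        return lookup(node[head], rest) if head in node else None
    if isinstance(node, list):
        if head == "all":
            # Apply the rest of the path to every item in the list
            return [lookup(item, rest) for item in node]
        try:
            return lookup(node[int(head)], rest)
        except (ValueError, IndexError):
            return None  # non-numeric or out-of-range index
    return None  # scalar reached while path segments remain

def resolve(path):
    """Split a URL path into non-empty segments and look it up."""
    return lookup(content, [c for c in path.split("/") if c])

for url in ["/data/all/year", "/data/1/album", "/data/7/title"]:
    print(url, "=>", json.dumps(resolve(url)))
# /data/all/year => ["2009", "1983", "1978"]
# /data/1/album => "Michael Zager Band"
# /data/7/title => null
```

Returning None for a bad path means the client sees JSON `null`; whether that or an HTTP error status is preferable is a design choice this sketch leaves open.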

Answer 2

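I think you're running into trouble because you're trying to iterate over the path components to work out what to do. That is a somewhat complicated way to solve the problem.

I would start by defining the "routes" or "actions" you want your API to support, and then write code to handle each one. This is how most web frameworks operate (e.g. Django's URL patterns or Flask's routes). It is simple enough to use the same pattern in your own code.

So, based on your description, it looks like you want two routes: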

/data/{id}/{attr} - look up the value of `attr` for the given `id`
/{attr} - search all items for `attr`

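I'm also going to simplify "title" vs. "titles" and use only the singular form, since pluralization may be more trouble than it's worth. But if you really want to do that, there are libraries that can help (e.g. this one).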
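If plural URLs such as /titles matter, one low-tech option that avoids a pluralization library is an explicit alias table. This is a sketch only: `PLURAL_ALIASES` and `attr_for_route` are hypothetical names, and the entries simply mirror the attribute names in the question's data.

```python
# Hypothetical alias table: plural route name -> attribute name in the data.
PLURAL_ALIASES = {
    "titles": "title",
    "albums": "album",
    "writers": "writer",
    "artists": "artist",
    "years": "year",
}

def attr_for_route(name):
    """Map a route segment to an attribute; unknown names pass through unchanged."""
    return PLURAL_ALIASES.get(name, name)

print(attr_for_route("titles"))  # title
print(attr_for_route("year"))    # year
```

An explicit table has to be kept in sync with the data's keys, but it never guesses wrong on irregular plurals.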

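Once we've established that URLs will follow these two patterns, it's easy to check whether the components match them. Note that I've simplified your code here to get it to run, since I'm not sure how do_GET is called or what self is: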

import json

JSON = {
    "data" : [
        {"album": "I Look to You", "writer": "Leon Russell", "artist": "Whitney Houston", "year": "2009", "title": "\"A Song for You\""},
        {"album": "Michael Zager Band", "writer": "Michael Zager", "artist": "Whitney Houston", "year": "1983", "title": "\"Life's a Party\""},
        {"album": "Paul Jabara & Friends", "writer": "Paul Jabara", "artist": "Whitney Houston", "year": "1978", "title": "\"Eternal Love\""},
    ]
}

def do_GET(path):
    path = path[1:]
    components = path.split('/')

    if components[0] == 'favicon.ico':
        return "favicon response"
    elif len(components) == 0 or not path:
        return "error response"
    elif len(components) == 3 and components[0] == "data":
        #/data/{id}/{attr} - look up the value of `attr` for the given `id`
        key, item_id, attr = components
        item_id = int(item_id)
        return json.dumps(JSON[key][item_id][attr])
    elif len(components) == 1:
        #/{attr} - search all items for `attr`
        attr = components[0]
        out = []
        for k in JSON:
            for d in JSON[k]:
                if attr in d:
                    out.append(d[attr])
        return json.dumps(out)
    else:
        return "unknown response"

if __name__ == "__main__":
    urls = [
        "/data/1/title",
        "/title",
        "/some_missing_attr",
        "/favicon.ico",
        "/",
    ]
    for u in urls:
        # Parenthesized form prints the same text on Python 2 and Python 3
        print("%s -> %s" % (u, do_GET(u)))

Output:

/data/1/title -> "\"Life's a Party\""
/title -> ["\"A Song for You\"", "\"Life's a Party\"", "\"Eternal Love\""]
/some_missing_attr -> []
/favicon.ico -> favicon response
/ -> error response

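This should work fine unless you really want arbitrarily nested lookups in arbitrary JSON. If that's the case, I don't think the URLs you proposed will work: how would you know that "/titles" should search all elements while "/data" should look up a single element? If you really want to go down that road, I'd search Google for "JSON query language" and see which projects you can reuse or borrow ideas from.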
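The simplified do_GET(path) function above takes a plain path, so it still has to be wired into a real request handler. A minimal sketch of that wiring, assuming Python 3's `http.server`: the names `route`, `Handler` and `run` are made up for this example, the records are trimmed to two fields from the question's data, and returning 404 for bad paths is an assumption rather than something the question asked for.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Records trimmed from the question's sample data (title and year only).
JSON = {
    "data": [
        {"title": "\"A Song for You\"", "year": "2009"},
        {"title": "\"Life's a Party\"", "year": "1983"},
        {"title": "\"Eternal Love\"", "year": "1978"},
    ]
}

def route(path):
    """Return (status_code, value) for a request path."""
    components = [c for c in path.split("/") if c]
    if len(components) == 3 and components[0] == "data":
        # /data/{id}/{attr}
        _, item_id, attr = components
        try:
            return 200, JSON["data"][int(item_id)][attr]
        except (ValueError, IndexError, KeyError):
            return 404, {"error": "not found"}
    if len(components) == 1:
        # /{attr}: collect the attribute from every item that has it
        attr = components[0]
        return 200, [d[attr] for d in JSON["data"] if attr in d]
    return 404, {"error": "unknown route"}

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Decide the status first, then send headers and the JSON body
        status, value = route(self.path)
        body = json.dumps(value).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def run(port=8000):
    """Start the server on localhost; blocks until interrupted."""
    HTTPServer(("localhost", port), Handler).serve_forever()
```

Calling `run()` starts the server. Keeping the routing in a plain function means it can be checked without any HTTP traffic, for example `route("/data/1/title")` returns a 200 status together with the second title. Note that /favicon.ico falls into the single-attribute route here and yields an empty list.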

Answer 3

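This is a very rough idea, but I hope the concept comes across.

In this example, if the URL does not start with "data", the data collection is mapped to the component named in the URL (such as "title").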

def do_GET(self):
    self.send_response(200)
    self.send_header('Content-type', 'application/json')
    self.end_headers()

    path = self.path[1:]
    components = path.split('/')

    if components and components[0] != 'data':
        # Collect the requested attribute from every item in the "data" list
        node = [item.get(components[0]) for item in content['data']]
    else:
        node = content
        for component in components:
            if len(component) == 0 or component == "favicon.ico":
                continue

            if type(node) == dict:
                node = node[component]

            elif type(node) == list:
                node = node[int(component)]

    # wfile expects bytes on Python 3; encoding is harmless on Python 2
    self.wfile.write(json.dumps(node).encode('utf-8'))

    return
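The mapping step is the heart of this approach, and it can be looked at on its own. A short sketch, assuming Python 3: `pluck` and `pluck_present` are hypothetical helper names, and `records` is trimmed from the question's data. The two helpers differ only in how they treat items that lack the key.

```python
# Records trimmed from the question's sample data.
records = [
    {"title": "\"A Song for You\"", "year": "2009"},
    {"title": "\"Life's a Party\"", "year": "1983"},
    {"title": "\"Eternal Love\"", "year": "1978"},
]

def pluck(items, key):
    """Values of `key` for every item, with None where the key is missing."""
    return [item.get(key) for item in items]

def pluck_present(items, key):
    """Values of `key`, skipping items that do not have it."""
    return [item[key] for item in items if key in item]

print(pluck(records, "year"))           # ['2009', '1983', '1978']
print(pluck(records, "genre"))          # [None, None, None]
print(pluck_present(records, "genre"))  # []
```

Using `.get` keeps the output aligned with the item positions, while the filtered form returns an empty list for an attribute that no item has.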

How do I get all the values of one key from a list of dicts in Python?
