Question

所以我的任务基本上是读取一个文件（记事本文件），该文件有一堆列车停靠点和从一站到另一站的时间。例如，它看起来像：

Stop A     15
Stop B     12
Stop C     9

现在我需要返回并访问这些站点及其时间。我正在考虑阅读文件并将其存储为字典。我的问题是，字典是否最适合这个？或者是否有一些其他python工具会更有用？任何想法都将不胜感激！

Answer 1

我会反对这种说法 - 并说直言不讳是不是最好的。

假设您有100个停靠点和多个非字母和非数字路线。想想巴黎地铁：

Paris Subway

现在尝试使用直接的Python dict来计算FDR和La Fourche之间的时间？这涉及两个或更多不同的路线和多个选项。

tree或某种形式的graph是一种更好的结构。对于1对1的映射，dict非常棒;树对于彼此相关的节点的丰富描述更好。然后，您可以使用类似Dijkstra's Algorithm的内容进行导航。

由于nested dict of dicts or dict of lists是一个图表，因此很容易想出一个递归的例子：

def find_all_paths(graph, start, end, path=[]):
        path = path + [start]
        if start == end:
            return [path]
        if start not in graph:
            return []
        paths = []
        for node in graph[start]:
            if node not in path:
                newpaths = find_all_paths(graph, node, end, path)
                for newpath in newpaths:
                    paths.append(newpath)
        return paths       

def min_path(graph, start, end):
    paths=find_all_paths(graph,start,end)
    mt=10**99
    mpath=[]
    print '\tAll paths:',paths
    for path in paths:
        t=sum(graph[i][j] for i,j in zip(path,path[1::]))
        print '\t\tevaluating:',path, t
        if t<mt: 
            mt=t
            mpath=path

    e1=' '.join('{}->{}:{}'.format(i,j,graph[i][j]) for i,j in zip(mpath,mpath[1::]))
    e2=str(sum(graph[i][j] for i,j in zip(mpath,mpath[1::])))
    print 'Best path: '+e1+'   Total: '+e2+'\n'  

if __name__ == "__main__":
    graph = {'A': {'B':5, 'C':4},
             'B': {'C':3, 'D':10},
             'C': {'D':12},
             'D': {'C':5, 'E':9},
             'E': {'F':8},
             'F': {'C':7}}
    min_path(graph,'A','E')
    min_path(graph,'A','D')
    min_path(graph,'A','F')

打印：

    All paths: [['A', 'C', 'D', 'E'], ['A', 'B', 'C', 'D', 'E'], ['A', 'B', 'D', 'E']]
        evaluating: ['A', 'C', 'D', 'E'] 25
        evaluating: ['A', 'B', 'C', 'D', 'E'] 29
        evaluating: ['A', 'B', 'D', 'E'] 24
Best path: A->B:5 B->D:10 D->E:9   Total: 24

    All paths: [['A', 'C', 'D'], ['A', 'B', 'C', 'D'], ['A', 'B', 'D']]
        evaluating: ['A', 'C', 'D'] 16
        evaluating: ['A', 'B', 'C', 'D'] 20
        evaluating: ['A', 'B', 'D'] 15
Best path: A->B:5 B->D:10   Total: 15

    All paths: [['A', 'C', 'D', 'E', 'F'], ['A', 'B', 'C', 'D', 'E', 'F'], ['A', 'B', 'D', 'E', 'F']]
        evaluating: ['A', 'C', 'D', 'E', 'F'] 33
        evaluating: ['A', 'B', 'C', 'D', 'E', 'F'] 37
        evaluating: ['A', 'B', 'D', 'E', 'F'] 32
Best path: A->B:5 B->D:10 D->E:9 E->F:8   Total: 32

Answer 2

字典完全适合查找特定停靠点的值。但是，这并不存储有关停靠顺序的任何信息，因此您可能还需要单独的列表。我假设这些时间只是相邻停靠点之间的延迟，所以如果你需要计算在任意一对停止之间获得的总时间，你实际上可能会找到比列表和字典更方便的2元组列表：

train_times = [("A", 0), ("B", 15), ("C", 12), ("D", 9)]

注意：第一次总是大概是零，因为之前没有时间停止。或者，您可以将最终时间设为零并将值存储在上一个停止位置，但我假设这是第一个案例。

这使您可以非常简单地计算从A到C的总时间：

def get_total_time(start_stop, end_stop):
    total_time = None
    for stop, this_time in train_times:
        if total_time is None and stop == start_stop:
            total_time = 0
        if total_time is not None:
            total_time += this_time
            if stop == end_stop:
                return total_time
    raise Exception("stop not found")

print "Time from A to C: %d" % (get_total_time("A", "C"),)

这可以通过组合列表和字典来提高效率，但除非列表很长（至少数百次停止），否则它不会产生很大的差异。

此外，一旦你引入了许多相互连接的火车线路，事情会变得更加复杂。在这种情况下，您可以使用任意数量的数据结构 - 一个简单的示例是字典，其中键是停止，值是所有行上相邻停靠点的列表以及它们的时间。但是，找到一条路径需要一些稍微不重要（但很有名）的算法，这些算法超出了这个问题的范围。

在任何一种情况下，如果您需要快速查找这些值，您可以考虑预先计算所有停靠点之间的时间 - 这被称为图形的transitive closure（其中“图形”） sense是一个节点网络，而不是折线图或类似网站。

Answer 3

在这种情况下，dict是完全可以接受的。事实上，如果您无法使用某个数据库并且您有兴趣在将来重用该字典，那么您可以挑选该对象并在另一个脚本中使用它：

import pickle

output = open('my_pickled_dict', 'wb')
pickle.dumb(my_dict, output)
output.close

然后在你的下一个剧本中：

import pickle

my_pickled_file = open('my_pickled_dict', 'rb')
my_dict = pickle.load(my_pickled_file)

Answer 4

字典可能对此有好处，是的。但是如果你需要跟踪另一个旁边的停止，你可能需要其他东西，比如ordered dict，或者一个用于定义下一个停止的小类（以及你需要的其他数据）喜欢保持停止），甚至停止列表，因为它保持秩序。

根据您需要的访问权限类型或您希望对此数据执行的操作，其中任何一种都可以为您提供帮助。

祝你好运。

Answer 5

字典适合这样做。

您可以轻松访问给定公交车站（=钥匙）的时间（=值）。

time_per_stop = {"Stop A": 15, "Stop B": 12, "Stop C": 9}

当然，如果你有一个有限的少量巴士站，你也可以保留一个停止时间的列表或元组。

time_per_stop_list = [15, 12, 9]

Answer 6

字典是由一组密钥的哈希值存储的一组值。使用字典的优点是定位特定记录所需的时间，无论大小是否固定（通常称为O(1)）。或者，如果使用列表，则查找记录所需的时间将等于读取每个元素所需的时间（通常称为O(n)，其中n等于列表中元素的数量）。

火车路线的字典最佳数据结构？

6 个答案: