Question

I want to store data in text files and build dictionaries from those files, which I will later pass to a function.

Here is my code:

from collections import OrderedDict

def lesson_dictionary(filename):
    print "Reading file ", filename
    with open(filename) as f:
        mylist = f.read().strip().split() 
        dictionary = OrderedDict(zip(mylist[::2], mylist[1::2])) #keep keys/values in same order as declared in mylist
        print dictionary
    return dictionary

With a sample file named sample.txt that contains two columns of key/value pairs separated by whitespace, it works fine. For example,

a b
c d
e f

produces the following result:

OrderedDict([('a', 'b'), ('c', 'd'), ('e', 'f')])

But if I change the code and the contents of the .txt file, it breaks. For example, if sample2.txt contains:

a:b
c:d
e:f

and my code is

def lesson_dictionary(filename):
    print "Reading file ", filename
    with open(filename) as f:
        mylist = f.read().strip().split(':') #CHANGED: split string at colon!
        dictionary = OrderedDict(zip(mylist[::2], mylist[1::2]))
        print dictionary
    return dictionary

I get the following output:

OrderedDict([('a', 'b \nc'), ('d\ne', 'f')])

What is going on? Why does strip() work for the first .txt file but not for the second? Thanks in advance for any help.

Answer 1

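The original split() splits on whitespace, and \n counts as whitespace. By changing to split(':') you removed the split at the line endings, so the end of one line is merged with the start of the next, with a stray newline in between. I don't think there is a simple way to fix that other than reading the file one line at a time.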
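To make the difference concrete, here is a small sketch that applies both calls to a string mimicking sample2.txt. The string literal, including the trailing space after b, is an assumption inferred from the output shown in the question.

```python
# Assumed contents of sample2.txt, inferred from the output in the question.
text = "a:b \nc:d\ne:f"

# No argument: split on any run of whitespace, newlines included.
by_whitespace = text.strip().split()

# Explicit ':' separator: newlines are no longer treated as separators.
by_colon = text.strip().split(':')

print(by_whitespace)  # ['a:b', 'c:d', 'e:f']
print(by_colon)       # ['a', 'b \nc', 'd\ne', 'f']
```

Pairing up the even and odd elements of the second list gives exactly the ('a', 'b \nc') and ('d\ne', 'f') entries from the question.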

Edit: some code to demonstrate.

from collections import OrderedDict

dictionary = OrderedDict()
with open(filename) as f:
    for line in f:
        # split each line into key and value at the colon
        key, value = line.split(':')
        dictionary[key.strip()] = value.strip()

Or, more in the spirit of the original:

with open(filename) as f:
    mylist = [line.strip().split(':') for line in f]
    dictionary = OrderedDict(mylist)

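The drawback of the second form is that it does not automatically strip whitespace from around the individual words. Judging by your example, you may need that.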
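If both the compact shape and the whitespace stripping matter, one possible middle ground is sketched below. The helper name read_pairs and its sep parameter are hypothetical, and skipping blank lines is an added assumption about how the file should be treated.

```python
from collections import OrderedDict

def read_pairs(filename, sep=':'):
    """Hypothetical helper: build an OrderedDict from 'key<sep>value' lines."""
    pairs = []
    with open(filename) as f:
        for line in f:
            if not line.strip():
                continue  # assumption: blank lines carry no data, so skip them
            # split at the first separator only, so values may contain sep
            key, value = line.split(sep, 1)
            pairs.append((key.strip(), value.strip()))
    return OrderedDict(pairs)
```

Calling read_pairs('sample2.txt') should then return the keys and values without stray spaces or newlines. A non-blank line without the separator still raises a ValueError, which may or may not be what you want.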

Answer 2

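split() with no separator splits on whitespace, which covers newlines as well as tabs and spaces. When you split on a colon, that rule no longer applies, so the newlines show up in your output. Try: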

dictionary = OrderedDict(l.strip().split(':') for l in f)

Answer 3

If you are creating the input files yourself, I believe json would be a better fit for this problem.

You can use it like this:

import json

# write the dictionary to a file; the with block closes it before it is read back
with open(filename, 'w') as outfile:
    json.dump(someDictionary, outfile)

#read the data back in
with open(filename) as infile:
    newDictionary = json.load(infile)
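Since the question cares about key order, here is a hedged sketch of a JSON round trip that hands back an OrderedDict. The scratch path and the variable names are made up for the demonstration; object_pairs_hook is a standard parameter of json.load.

```python
import json
import os
import tempfile
from collections import OrderedDict

original = OrderedDict([('a', 'b'), ('c', 'd'), ('e', 'f')])

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'lesson.json')

with open(path, 'w') as outfile:
    json.dump(original, outfile)

with open(path) as infile:
    # object_pairs_hook rebuilds each JSON object as an OrderedDict,
    # keeping the keys in the order they appear in the file
    restored = json.load(infile, object_pairs_hook=OrderedDict)

print(restored == original)  # True
```

Keep in mind that JSON object keys are always strings, so non-string keys will not survive the round trip unchanged.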

Answer 4

Have you tried printing the contents of myList?

myList = ["a", "b \nc", "d\ne", "f"]

If you want both files to behave the same way, replace the colons with spaces first:

myList = f.read().replace(":", " ").split()

Or, if you want to turn them into key/value pairs, just use list slicing to zip the even and odd elements together:

s = f.read().split()
myDict = dict(zip(s[::2], s[1::2]))

Answer 5

If you want your code to be delimiter-neutral, i.e. to accept a:b, a-b, a#b and the like, use re.split() instead of the regular split().

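import re
from collections import OrderedDict

pattern = re.compile(r"[^\w]")     # any single non-word character
def read_pairs(filename):          # hypothetical wrapper so that return is valid
    with open(filename, "rt") as fr:
        return OrderedDict(pattern.split(l.strip()) for l in fr)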
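One caveat with a single-character pattern is that a line such as "c - d" splits into more than two pieces, which OrderedDict rejects. The sketch below is one possible variation, not the answer's own code: the pattern also swallows whitespace around the delimiter and splits only once per line. The names DELIMITER and parse_lines are hypothetical.

```python
import re
from collections import OrderedDict

# One non-word character, plus any whitespace on either side of it.
DELIMITER = re.compile(r"\s*\W\s*")

def parse_lines(lines):
    """Hypothetical helper: split each non-blank line at its first delimiter."""
    pairs = (DELIMITER.split(line.strip(), maxsplit=1)
             for line in lines if line.strip())
    return OrderedDict(pairs)

result = parse_lines(["a:b\n", "c - d\n", "e#f\n", "g h\n"])
print(list(result.items()))  # [('a', 'b'), ('c', 'd'), ('e', 'f'), ('g', 'h')]
```

An open file object can be passed in place of the list, since both yield lines. A line with no delimiter at all still raises a ValueError, and the underscore counts as a word character, so a_b is not split.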

Python: need help creating a dictionary from a text file and splitting a list
