Question

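I have a Python function that takes a txt file of the following form:

My resulting JSON file is the following:

So what I would like to know is:

1) Why is the \n character removed for numbers with 2 digits, while for 1-digit numbers the \n is still there? Is there a solution for that? Keep in mind that I want to remove all newline characters.

2) Why do I get a warning that my line variable is unused (I have marked it with a comment near the start of the code)? I think it is doing what it is supposed to do.

Here is my code: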

def create_dict_from_index_txt(file_name):

    # Create a dynamical dictionary from the input file
    num_of_lines = 0
    words = []

    # find how many lines there are in the files
    with open(file_name, 'r') as f:
        for line in f: # I get a warning that line is unused
            num_of_lines += 1
    print("Number of lines: ", num_of_lines)
    f.close()

    f1 = open(file_name, 'r')
    # find how many arguments each line has
    for i in range(num_of_lines):
        words_per_line = f1.readline().split(" ")
        words.append(len(words_per_line))
    print("Number of columns per line: ", words)

    # Initialize the saving space of lines I want
    a = [0] * num_of_lines
    # Initialize the saving space of columns in each line
    for i in range(num_of_lines):
        a[i] = [0] * words[i]
    print("Initialized a: ", a)
    f1.close()

    f1 = open(file_name, 'r')
    # Getting the info from each line and fill in the a 2d list
    for i in range(num_of_lines):
        ln = f1.readline().split(" ")
        for j in range(words[i]):
            a[i][j] = ln[j]
    print("First tokenize of index.txt: ", a)
    f1.close()

    # Delete the new line delimiter parsing only the last element of each row
    for i in range(num_of_lines):
        inner_index = words[i]-1
        tok = a[i][inner_index]
        if "\n" in tok:
            a[i][inner_index] = a[i][inner_index][:2] # <------ HERE IS THE [:2]
    print("Attempt to delete new lines", a)

    # Initialize the saving space for keys and Extract only the keys of the 2d list (a)
    keys = [0] * num_of_lines
    for i in range(num_of_lines):
        keys[i] = a[i][0]
    print("The keys are: ", keys)

    # Initialize the saving space for the ids
    ids = [0] * num_of_lines
    for i in range(num_of_lines):
        ids[i] = [0] * (words[i]-1)
    print("Initialized ids: ", ids)

    # extract the ids of the 2d list (a)
    for i in range(num_of_lines):
        for j in range(1, words[i]):
            ids[i][j-1] = a[i][j]

    print("Only ids of each word: ", ids)

    dictionary = {}
    # create a dictionary dynamically
    for i in range(num_of_lines):
        dictionary.update({keys[i]: ids[i]})

    print("The final dictionary of the input text file is: ", dictionary)
    # End of creating a dynamical dictionary

    return dictionary

Keep in mind that I am new to Python and still learning the basics.

Answer 1

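OK, you are new to Python, but you are approaching this the wrong way. Python's default sequence and mapping classes (list and dict) are dynamic: they grow as you add to them, so there is no need to count lines or pre-allocate anything. The Pythonic way here is:

Initialize an empty dictionary
Read the file one line at a time, strip the line ending, and tokenize the line
- the first word is the key
- the remaining words are the ids
- add the processed line to the dictionary

So the code can be as simple as: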

dictionary = {}
with open(file_name, 'r') as f1:      # with ensures that the file is closed at the end of the block
    for line in f1:
        words = line.strip().split()  # trim whitespace (including the end of line) from both ends,
                                      # then split on whitespace
        if words:                     # skip blank lines, which would have no key
            dictionary[words[0]] = words[1:]  # first word is the key, the rest are the ids

print(dictionary)                     # check that the processing is correct
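
As for the two questions themselves, here is a minimal sketch. The sample tokens, the in-memory sample file, and the helper name `strip_newline` are made up for illustration and are not from the original post. The slice `[:2]` keeps the first two characters of a string, whatever they are. For a two-digit token such as `"12\n"` those are the two digits, so the newline disappears. For a one-digit token such as `"5\n"` the first two characters are the digit and the newline itself, so the newline survives. A three-digit token would even lose its last digit. `rstrip("\n")` removes a trailing newline regardless of the token's length. The warning about `line` is most likely a linter pointing out that the loop variable is never read inside the counting loop. The code still works, and naming the variable `_` is the usual way to signal that it is intentionally unused.

```python
import io


def strip_newline(token):
    """Remove a trailing newline regardless of the token's length (hypothetical helper)."""
    return token.rstrip("\n")


# Slicing keeps the first two characters, whatever they are.
two_digits = "12\n"[:2]      # newline dropped
one_digit = "5\n"[:2]        # newline still there: the slice is "5" plus "\n"
three_digits = "123\n"[:2]   # newline dropped, but so is the last digit

# rstrip("\n") gives the intended result for any length.
cleaned = [strip_newline(t) for t in ["5\n", "12\n", "123\n"]]

# Counting lines without a named but unused loop variable.
# io.StringIO stands in for a real file here (made-up sample content).
sample_file = io.StringIO("apple 1 2\nbanana 5\n")
num_of_lines = sum(1 for _ in sample_file)

print(repr(two_digits), repr(one_digit), repr(three_digits))
print(cleaned, num_of_lines)
```

With `line.strip().split()` as in the answer above, neither the slice nor the line count is needed at all.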
