Why doesn't [:2] work for numbers less than 10 in Python?

Time: 2018-04-24 07:28:50

Tags: python-3.x

I have a function in Python that takes a txt file of the following form: [image]

The resulting json file looks like this: [image]

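So what I want to know is:

1) Why is the \n character removed for numbers with 2 or more digits, while for 1-digit numbers the \n is still there? Is there a solution for that? Keep in mind that I want every newline character removed.

2) Why do I get a warning that my line variable is unused (I have marked it with a comment near the start of the code)? I think it is doing what it is supposed to do.

Here is my code: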

def create_dict_from_index_txt(file_name):

    # Create a dynamical dictionary from the input file
    num_of_lines = 0
    words = []

    # find how many lines there are in the files
    with open(file_name, 'r') as f:
        for line in f: # I get a warning that line is unused
            num_of_lines += 1
    print("Number of lines: ", num_of_lines)
    f.close()

    f1 = open(file_name, 'r')
    # find how many arguments each line has
    for i in range(num_of_lines):
        words_per_line = f1.readline().split(" ")
        words.append(len(words_per_line))
    print("Number of columns per line: ", words)

    # Initialize the saving space of lines I want
    a = [0] * num_of_lines
    # Initialize the saving space of columns in each line
    for i in range(num_of_lines):
        a[i] = [0] * words[i]
    print("Initialized a: ", a)
    f1.close()

    f1 = open(file_name, 'r')
    # Getting the info from each line and fill in the a 2d list
    for i in range(num_of_lines):
        ln = f1.readline().split(" ")
        for j in range(words[i]):
            a[i][j] = ln[j]
    print("First tokenize of index.txt: ", a)
    f1.close()

    # Delete the new line delimiter parsing only the last element of each row
    for i in range(num_of_lines):
        inner_index = words[i]-1
        tok = a[i][inner_index]
        if "\n" in tok:
            a[i][inner_index] = a[i][inner_index][:2] # <------ HERE IS THE [:2]
    print("Attempt to delete new lines", a)

    # Initialize the saving space for keys and Extract only the keys of the 2d list (a)
    keys = [0] * num_of_lines
    for i in range(num_of_lines):
        keys[i] = a[i][0]
    print("The keys are: ", keys)

    # Initialize the saving space for the ids
    ids = [0] * num_of_lines
    for i in range(num_of_lines):
        ids[i] = [0] * (words[i]-1)
    print("Initialized ids: ", ids)

    # extract the ids of the 2d list (a)
    for i in range(num_of_lines):
        for j in range(1, words[i]):
            ids[i][j-1] = a[i][j]

    print("Only ids of each word: ", ids)

    dictionary = {}
    # create a dictionary dynamically
    for i in range(num_of_lines):
        dictionary.update({keys[i]: ids[i]})

    print("The final dictionary of the input text file is: ", dictionary)
    # End of creating a dynamical dictionary

    return dictionary

Keep in mind that I am new to Python and still learning the basics.

1 Answer:

Answer 0: (score: 2)

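OK, you are new to Python, but you are approaching this the wrong way. Python's default sequence and mapping types (list and dict) are dynamic: they grow as you append or assign items, so there is no need to count lines and pre-allocate space first. The Pythonic way here is:

  • initialize an empty dictionary
  • read the file one line at a time, strip the line ending, and tokenize the line
    • the first word is the key
    • the remaining words are the ids
    • add the processed line to the dictionary

So the code can be as simple as: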

dictionary = {}
with open(file_name, 'r') as f1:      # with ensures that the file is closed at the end of the block
    for line in f1:
        words = line.strip().split()  # trim whitespace (including the end of line) from both ends,
                                      # then split on whitespace
        if words:                     # skip blank lines, which would otherwise raise IndexError
            dictionary[words[0]] = words[1:]  # add to the final dictionary

print(dictionary)                     # check that the processing is correct
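
As for the two questions themselves, a small sketch may help. The sample tokens, the file contents and the helper name `count_lines` are made up for illustration and do not come from the post. The slice `[:2]` keeps the first two characters of a string, so it only drops the newline when exactly two characters come before it, whereas `rstrip("\n")` removes a trailing newline whatever the length of the token. The "unused variable" warning most likely comes from the editor's code inspection, because the loop body never reads `line`; naming the variable `_` is the usual convention for a value that is deliberately ignored.

```python
import os
import tempfile

# Hypothetical last tokens of three rows, each still carrying its newline.
tokens = ["7\n", "12\n", "123\n"]

# [:2] keeps at most the first two characters, whatever they are.
sliced = [tok[:2] for tok in tokens]
print(sliced)    # ['7\n', '12', '12']

# rstrip("\n") removes only trailing newlines, regardless of token length.
stripped = [tok.rstrip("\n") for tok in tokens]
print(stripped)  # ['7', '12', '123']


def count_lines(file_name):
    """Count lines; the loop variable is named _ because it is never read."""
    with open(file_name, "r") as f:
        return sum(1 for _ in f)


# Write a small throwaway file so the example runs on its own.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("apple 1 2\nbanana 12\ncherry 7\n")
line_count = count_lines(tmp.name)
os.remove(tmp.name)
print(line_count)  # 3
```

Note that `[:2]` also silently truncates a three-digit id such as `123` to `12`, which is another reason to strip the newline instead of slicing to a fixed length.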