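So, I have to write code that reads from a txt file and then converts the text to a list in order to analyze it (convert units). What I want to do is remove the punctuation from specific words in the list so that I can analyze them, and then put it back in the same position as before. The list can change at any time, since the code has to work for every txt file I give it.
How can I put the symbols back in exactly the position they were in before? I can't use any packages for this.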
punct = ['?', ':', ';', ',', '.', '!', '"', '/']
size = {'mm': 1, 'cm': 10, 'm': 100, 'km': 1000}
with open('random_text', 'r') as x:
    LIST = x.read().split()
for item in LIST:
    if item[:-1] in size.keys() and item[-1] in punct:
        punct_item = item
        non_punct_item = item[:-1]
        symbol = item[-1]
Answer 0 (score: 2)
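Reading a file does not change it. So you can read the file once and make all the modifications you need (in this case, removing the punctuation). Then, when you need the punctuation again, simply read the file a second time, and everything will be in its original place.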
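A minimal sketch of the re-read idea, using only built-ins. The helper names `read_words` and `strip_trailing_punct` are hypothetical and introduced here for illustration; they are not part of the question's code.

```python
def read_words(path):
    """Return the whitespace-separated tokens of the file at `path`."""
    with open(path, 'r') as f:
        return f.read().split()


def strip_trailing_punct(words, punct):
    """Return a new list with one trailing punctuation mark removed per word."""
    return [w[:-1] if len(w) > 1 and w[-1] in punct else w for w in words]


# Hypothetical usage, assuming a file named 'file1.txt' exists:
# punct = ['?', ':', ';', ',', '.', '!', '"', '/']
# working = strip_trailing_punct(read_words('file1.txt'), punct)
# original = read_words('file1.txt')  # second read: punctuation intact
```

The file on disk is never modified, so the second call returns the words exactly as they were before any stripping.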
A faster way is to keep a copy in memory:
punct = ['?', ':', ';', ',', '.', '!', '"', '/']
size = {'mm': 1, 'cm': 10, 'm': 100, 'km': 1000}
with open('file1.txt', 'r') as file:
    words = file.read().split()
# Keep an untouched copy; a plain assignment would only alias the same list
words_temp = list(words)
# Do all modifications you need
for item in words:
    if item[:-1] in size and item[-1] in punct:
        punct_item = item
        non_punct_item = item[:-1]
        symbol = item[-1]
# Restore the original words, punctuation included
words = words_temp
del words_temp
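That is the simpler approach. Another is to build a dictionary whose keys are the indices of the characters to remove and whose values are the characters themselves. With this approach you iterate over the whole text once to build the dictionary, then iterate again to add the characters back. Sample code: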
tracker = dict()
punct = ['?', ':', ';', ',', '.', '!', '"', '/']
words = list("If it walks like a duck, and it quacks like a duck, then it must be a duck. I love python!")
print("".join(words))
# If it walks like a duck, and it quacks like a duck, then it must be a duck. I love python!
# Removing the punct.
i = 0
while i < len(words):
    if words[i] in punct:
        # i + len(tracker) is the character's index in the original text
        tracker[i + len(tracker)] = words[i]
        words.pop(i)
    else:
        # Advance only when nothing was popped, so consecutive marks are not skipped
        i += 1
print("".join(words))
# If it walks like a duck and it quacks like a duck then it must be a duck I love python
# Adding the punct back (keys are in ascending order, so each insert lands at its original index)
for k, v in tracker.items():
    words.insert(k, v)
print("".join(words))
# If it walks like a duck, and it quacks like a duck, then it must be a duck. I love python!
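The same index-tracking idea can be applied per word rather than per character, which is closer to the question's split list. The following is only a sketch under assumptions: the function names `split_trailing_punct` and `restore_punct` and the sample sentence are made up for illustration, and it handles a single trailing symbol on unit words, as the question's code does.

```python
PUNCT = ['?', ':', ';', ',', '.', '!', '"', '/']
SIZE = {'mm': 1, 'cm': 10, 'm': 100, 'km': 1000}


def split_trailing_punct(words):
    """Strip one trailing punctuation mark from unit words.

    Returns (cleaned, removed), where removed maps a word's list index
    to the symbol that was stripped from it.
    """
    cleaned = list(words)
    removed = {}
    for i, item in enumerate(words):
        if item[:-1] in SIZE and item[-1] in PUNCT:
            cleaned[i] = item[:-1]
            removed[i] = item[-1]
    return cleaned, removed


def restore_punct(words, removed):
    """Re-append each stored symbol to the word at its recorded index."""
    restored = list(words)
    for i, symbol in removed.items():
        restored[i] = restored[i] + symbol
    return restored


words = "The road is 5 km, the desk is 120 cm.".split()
cleaned, removed = split_trailing_punct(words)
# Stand-in for the real unit conversion: 5 km becomes 5000 m
cleaned[3], cleaned[4] = '5000', 'm'
print(" ".join(restore_punct(cleaned, removed)))
# The road is 5000 m, the desk is 120 cm.
```

Because the symbol is stored against the word's index and not against the word itself, the word can be replaced during analysis and the punctuation still lands in the right spot. Note that `split()` discards the original whitespace, so joining with single spaces will not reproduce tabs or line breaks.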