This is a file:
APPLE: toronto, 2018, garden, tasty, 5
apple is a tasty fruit
>>>end
apple is a sour fruit
>>>end
grapes: america, 24, organic, sweet, 4
grapes is a sweet fruit
>>>end
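That is the file; it also contains newline characters. I want to build a dictionary from it, like this.
The function is def f(file_to: TextIO) -> Dict[str, List[tuple]]
file_to is the input file, and the function should return a dictionary, for example: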
{'apple': [('apple is a tasty fruit', 2018, 'garden', 'tasty', 5), ('apple is a sour fruit',)], 'grapes': [('grapes is a sweet fruit', 24, 'organic', 'sweet', 4)]}
Each fruit name is a key, and its descriptions are the values, formatted as shown. Each entry ends with >>>end.
I tried:
with open(file_to, "r") as myfile:
    data = myfile.readlines()
return data
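This returns the lines of the file as a list of strings that still end in \n. I figured I could use strip() to remove the newline and take the element before the ':' as the key.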
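A minimal sketch of that idea on a single line, assuming the header line from the sample above. The variable names are only for illustration, and str.partition is used here in place of a plain split so that only the first ':' matters:

```python
# One raw line, as readlines() would return it (the trailing newline is still there).
line = "APPLE: toronto, 2018, garden, tasty, 5\n"

stripped = line.strip()                  # remove the trailing newline
key, _, rest = stripped.partition(':')   # text before the first ':' becomes the key

print(key.lower())    # apple
print(rest.strip())   # toronto, 2018, garden, tasty, 5
```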
The code I tried is:
from pprint import pprint
import re

def main():
    fin = open('f1.txt', 'r')
    data = {}
    key = ''
    parsed = []
    for line in fin:
        line = line.rstrip()
        if line.startswith('>'):
            data[key] = parsed
            parsed = []
        elif ':' in line:
            parts = re.split(r'\W+', line)
            key = parts[0].lower()
            parsed += parts[2:]
        else:
            parsed.insert(0, line)
    fin.close()
    pprint(data)

main()
It does not give the expected result :(
Answer 0 (score: 1)
I don't think you really need re and pprint. I tried a simple list comprehension and a few if statements.
def main():
    data = {}
    with open('f1.txt', 'r') as fin:  # same file name as in the question
        for line in fin:
            line = line.rstrip()
            if line.startswith('>'):
                continue  # If we get a line which starts with a '>', we can skip that line.
            elif ':' in line:
                parts = line.split(":")
                key = parts[0].lower()
                firstInfo = parts[1].split(",")  # What we have to add to the value, after reading the next line.
                firstInfo.pop(0)  # Removing the first element, the place name (as it is not required).
                secondInfo = next(fin, '').strip()  # Reading the next line. It will be the first value in the list.
                value = [secondInfo]
                value.extend([x.strip() for x in firstInfo])  # Extending the value list to add the other elements.
                data[key] = value
            # Any other line (an extra description) is ignored by this version.
    print(data["apple"])
    return data
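If you run into any problems with this implementation, we will be happy to help. (Although it is self-explanatory :P)
Answer 1 (score: 1)
I made a few adjustments to your code (the code I gave you in the previous post). I think this gives you the updated data you want.
Data: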
APPLE: toronto, 2018, garden, tasty, 5
apple is a tasty fruit
>>>end
apple is a sour fruit
apple is ripe
>>>end
apple is red
>>>end
grapes: america, 24, organic, sweet, 4
grapes is a sweet fruit
>>>end
Here is the updated code:
import re

def main():
    fin = open('f1.txt', 'r')
    data = {}
    for line in fin:
        line = line.rstrip()
        if line.startswith('>'):
            # End marker: store the first block (description + header fields) once per key.
            if key not in data:
                data[key] = [tuple(parts)]
        elif re.match(r'^\w+:\s', line):
            # Header line: keep the key, drop the place, keep the remaining fields.
            key, _, *parts = re.split(r'[:,]\s+', line)
        else:
            if key in data:
                data[key].append(line)
            else:
                parts.insert(0, line)
    fin.close()
    # Collapse all later descriptions of a key into a single tuple.
    for key in data:
        if len(data[key]) > 1:
            data[key][1] = tuple(data[key][1:])
            del data[key][2:]
    print(data)

main()
The output for this revised data and code is:
{'APPLE': [('apple is a tasty fruit', '2018', 'garden', 'tasty', '5'), ('apple is a sour fruit', 'apple is ripe', 'apple is red')], 'grapes': [('grapes is a sweet fruit', '24', 'organic', 'sweet', '4')]}
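For comparison, here is a rough sketch that follows the signature from the question. It is not a definitive implementation. The name parse_fruits is hypothetical, and the sketch assumes that the key should be lowercased, that the first header field (the place) is dropped, that purely numeric fields become int, and that every block closed by >>>end becomes its own tuple. io.StringIO stands in for the real file so the example runs on its own.

```python
import io
import re
from typing import Dict, List, TextIO


def parse_fruits(file_to: TextIO) -> Dict[str, List[tuple]]:
    """Hypothetical helper: group each fruit's blocks under its lowercased name."""
    data: Dict[str, List[tuple]] = {}
    key = None    # fruit currently being read
    fields = []   # header fields waiting to be attached to the first block
    block = []    # description lines collected since the last '>>>end'
    for raw in file_to:
        line = raw.strip()
        if not line:
            continue
        if line.startswith('>>>'):
            # End marker: store the collected block as one tuple.
            if key is not None and block:
                data.setdefault(key, []).append(tuple(block + fields))
            block, fields = [], []
            continue
        header = re.match(r'(\w+):\s+(.*)', line)
        if header:
            key = header.group(1).lower()
            # Assumption: drop the first field (the place) and convert digits to int.
            rest = [p.strip() for p in header.group(2).split(',')][1:]
            fields = [int(p) if p.isdigit() else p for p in rest]
        else:
            block.append(line)
    return data


sample = """APPLE: toronto, 2018, garden, tasty, 5
apple is a tasty fruit
>>>end
apple is a sour fruit
>>>end
grapes: america, 24, organic, sweet, 4
grapes is a sweet fruit
>>>end
"""

result = parse_fruits(io.StringIO(sample))
print(result)
```

With the sample from the question, the second apple description ends up as a one-element tuple. With the revised data above, 'apple is red' would become a separate tuple instead of being merged with the two lines before it, because each >>>end closes a block in this sketch.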