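I'm using Python and trying to read a file line by line and add those lines to a JSON object. I need to check whether a line starts with one of a few specific names, and then put the text that follows that name into the JSON until a later line starts with one of those names again.
I have a list of these specific names:
names_array = ['Filan Fisteku', 'Fisteku Filan']
So, for example, the txt file looks like this:
And the JSON I want to build from this txt is: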
{
"Filan Fisteku":["Said something about this , blla blla blla",
"then the Filan Fisteku speech goes on on the next line,",
"plus some other text."],
"Fisteku Filan":["This is another text from another guy which",
"i am trying to put in a json"]
}
I'd like to know whether this can be done with recursion, or how else I should approach it.
Answer 0: (score: 1)
You can do this easily:
res = {}
with open('file.txt', 'r') as f:
    for line in f.readlines():
        for name in names_array:
            if line.startswith(name):
                if name not in res:
                    res[name] = [line]
                else:
                    res[name].append(line)
You may also need to strip extra characters (spaces and so on) from the start of each line, though that may not be necessary.
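Note that the loop above only keeps lines that themselves start with a name, so continuation lines are skipped. Here is a minimal sketch of one way to collect them as well, assuming every speaker line has the form `Name: text`. The `group_by_speaker` helper and the sample lines are made up for illustration and are not the asker's actual file.

```python
def group_by_speaker(lines, names):
    """Group lines under the most recent speaker seen at a line start."""
    res = {}
    current = None  # speaker whose text is currently being collected
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        for name in names:
            prefix = name + ':'  # assumed "Name: text" format
            if line.startswith(prefix):
                current = name
                # slice off the prefix, then trim the space after the colon
                line = line[len(prefix):].strip()
                break
        # lines before the first speaker are dropped
        if current is not None and line:
            res.setdefault(current, []).append(line)
    return res


# Hypothetical sample input
sample = [
    "Filan Fisteku: Said something about this\n",
    "then the speech goes on on the next line\n",
    "Fisteku Filan: This is another text\n",
]
result = group_by_speaker(sample, ['Filan Fisteku', 'Fisteku Filan'])
print(result)
```

Requiring the colon after the name means a continuation line that merely begins with a speaker's name is not mistaken for a new speaker. No recursion is needed; a single pass with one "current speaker" variable is enough.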
Answer 1: (score: 1)
You can build the dict with something like this:
names = {}
with open('yourfile') as fin:
    lines = (line.strip().partition(': ') for line in fin)
    for fst, sep, snd in lines:
        if sep:
            name = fst
        names.setdefault(name, []).append(snd or fst)
This gives:
{'Filan Fisteku': ['Said something about this , blla blla blla then',
'the Filan Fisteku speech goes on on the next line, plus some other text.'],
'Fisteku Filan': ['This is another text from another guy which i am trying to put in a json.']}
Then call json.dumps(names).
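For the serialization step, here is a short sketch of what calling `json.dumps` on such a dict might look like. The dict contents below are placeholder values, not the real file's text.

```python
import json

# Hypothetical dict, shaped like the one built from the file
names = {
    'Filan Fisteku': ['first line', 'second line'],
    'Fisteku Filan': ['another line'],
}

# indent=2 pretty-prints the output; ensure_ascii=False keeps
# any non-ASCII characters readable instead of escaping them
text = json.dumps(names, indent=2, ensure_ascii=False)
print(text)
```

To write directly to a file instead of producing a string, `json.dump(names, fout)` takes an open file object as its second argument.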
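Answer 2: (score: 0)
You can use a flag to keep track of the current speaker. When a line starts with a new speaker, update the flag. When a line does not start with a speaker, it goes into the current speaker's list. I've put together a demo; check whether it works for you: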
speaker = ''
Filan_Fisteku = []
Fisteku_Filan = []
with open('yourfile.txt', 'r') as f:
    for line in f.readlines():
        if line.startswith('Filan Fisteku:'):
            # Slice off the prefix: str.lstrip() removes a set of characters, not a prefix
            line = line[len('Filan Fisteku:'):]
            Filan_Fisteku.append(line.strip())
            speaker = 'Filan Fisteku'
        elif line.startswith('Fisteku Filan:'):
            line = line[len('Fisteku Filan:'):]
            Fisteku_Filan.append(line.strip())
            speaker = 'Fisteku Filan'
        elif speaker == 'Filan Fisteku':
            Filan_Fisteku.append(line.strip())
        elif speaker == 'Fisteku Filan':
            Fisteku_Filan.append(line.strip())
mydict = {'Filan Fisteku': Filan_Fisteku, 'Fisteku Filan': Fisteku_Filan}
From the data, mydict will look like this:
{'Filan Fisteku': ['Said something about this , blla blla blla then',
'the Filan Fisteku speech goes on on the next line, plus some other text.',
'plus some other text.'],
'Fisteku Filan': ['This is another text from another guy which',
'i am trying to put in a json.']}