Merging multiple lines from a for loop into one list

Time: 2019-06-27 17:11:11

Tags: json python-3.x list

Basically, my script reads and parses a JSON file.

JSON file:

{
"messages": 
[
    {"timestamp": "123456789", "timestampIso": "2019-06-26 09:51:00", "agentId": "2001-100001", "skillId": "2001-20000", "agentText": "That customer was great"},
    {"timestamp": "123456789", "timestampIso": "2019-06-26 09:55:00", "agentId": "2001-100001", "skillId": "2001-20001", "agentText": "That customer was stupid\nI hope they don't phone back"},
    {"timestamp": "123456789", "timestampIso": "2019-06-26 09:57:00", "agentId": "2001-100001", "skillId": "2001-20002", "agentText": "Line number 3"},
    {"timestamp": "123456789", "timestampIso": "2019-06-26 09:59:00", "agentId": "2001-100001", "skillId": "2001-20003", "agentText": ""}
]
}

I have a Python script that pulls out "agentText", and a for loop prints each object line by line:

import json

# Load the JSON file and parse it into a dict
with open('20190626-101200-text-messages.json') as f:
    data = json.load(f)

for message in data['messages']:
    # Trim the text and turn embedded line breaks into spaces
    splittext = message['agentText'].strip().replace('\n', ' ').replace('\r', ' ')
    if len(splittext) > 0:
        print(splittext)

This gives me:

That customer was great
That customer was stupid I hope they don't phone back
Line number 3

I need to join these separate lines together so that the result reads:

That customer was great That customer was stupid I hope they don't phone back Line number 3

That way I can apply stop word removal / nltk to it. How can I do this?

2 answers:

Answer 0: (score: 2)

You can concatenate all the lines into a single string variable:

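res = ""
for message in data['messages']:
    splittext = message['agentText'].strip().replace('\n', ' ').replace('\r', ' ')
    if len(splittext) > 0:
        # Append each non-empty text followed by a separator space
        res += splittext + " "
# Note: res ends with a trailing space; use res.rstrip() to drop it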

Or collect the pieces in a list and join them with a string method:

res = []
for message in data['messages']:
    splittext = message['agentText'].strip().replace('\n', ' ').replace('\r', ' ')
    if len(splittext) > 0:
        res.append(splittext)
print(" ".join(res))
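The list-and-join approach can also be wrapped in a small function so it is easy to reuse. The sketch below is only an illustration: the function name join_agent_text is hypothetical, and the JSON is a trimmed inline copy of the question's data (only the "agentText" field is kept) so that the example runs without the original file.

```python
import json

# Hypothetical helper name, not part of the original answer.
def join_agent_text(messages):
    """Return all non-empty agentText values joined by single spaces."""
    parts = []
    for message in messages:
        # Same cleanup as the loop above: trim, then turn line breaks into spaces.
        text = message["agentText"].strip().replace("\n", " ").replace("\r", " ")
        if text:
            parts.append(text)
    return " ".join(parts)

# Trimmed copy of the question's JSON; a raw string keeps the \n escape for json.loads.
raw = r'''
{"messages": [
    {"agentText": "That customer was great"},
    {"agentText": "That customer was stupid\nI hope they don't phone back"},
    {"agentText": "Line number 3"},
    {"agentText": ""}
]}
'''

data = json.loads(raw)
combined = join_agent_text(data["messages"])
print(combined)
```

Because str.join only puts the separator between items, the result has no trailing space, unlike the string concatenation version.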

Answer 1: (score: 1)

Use a comprehension with str.join and str.splitlines

For example:

data = {
"messages": 
[
    {"timestamp": "123456789", "timestampIso": "2019-06-26 09:51:00", "agentId": "2001-100001", "skillId": "2001-20000", "agentText": "That customer was great"},
    {"timestamp": "123456789", "timestampIso": "2019-06-26 09:55:00", "agentId": "2001-100001", "skillId": "2001-20001", "agentText": "That customer was stupid\nI hope they don't phone back"},
    {"timestamp": "123456789", "timestampIso": "2019-06-26 09:57:00", "agentId": "2001-100001", "skillId": "2001-20002", "agentText": "Line number 3"},
    {"timestamp": "123456789", "timestampIso": "2019-06-26 09:59:00", "agentId": "2001-100001", "skillId": "2001-20003", "agentText": ""}
]
}

print(" ".join(j for msg in data["messages"] for j in msg["agentText"].splitlines()))

Output:

That customer was great That customer was stupid I hope they don't phone back Line number 3
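
Since the stated goal is stop word filtering, here is a rough sketch of what the next step might look like once the text is a single string. The STOP_WORDS set is a tiny hand-written stand-in, not a real stop word list; NLTK's stopwords corpus could replace it if NLTK and its data are installed. Splitting on whitespace is also a simplification of real tokenization.

```python
# Hypothetical stand-in for a real stop word list (for example NLTK's English list).
STOP_WORDS = {"that", "was", "i", "they", "don't", "the", "a"}

combined = (
    "That customer was great That customer was stupid "
    "I hope they don't phone back Line number 3"
)

# Lowercase and split on whitespace; real tokenizers treat punctuation more carefully.
tokens = combined.lower().split()

# Keep only the words that are not in the stop word set.
filtered = [word for word in tokens if word not in STOP_WORDS]
print(filtered)
```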