I have the following data:
data = """
item: apple
store name: USA_1
store id: 1000
total: 200
item: apple
store name: USA_2
store id: 1001
total: 230
item: apple
store name: USA_3
store id: 1002
total: 330
item: apple
store name: UK1
store id: 2000
total: 20
item: apple
store name: UK_2
store id: 1021
total: 230
"""
I need to build a dictionary of stores in the following format:
{' USA_1': ' 1000', ' USA_2': ' 1001', ' USA_3': ' 1002', ' UK1': ' 2000', ' UK_2': ' 1021'}
I wrote the code below, which produces the output above:
STORE_NAME_GATHERED = []
STORE_IDS_GATHERED = []
STORE_info = {}
for line in data.split("\n"):
    line = line.strip()
    if line.startswith("store name:"):
        name = line.split(":")[1]
        if not name in STORE_NAME_GATHERED:
            STORE_NAME_GATHERED.append(name)
    elif line.startswith("store id:"):
        id = line.split(":")[1]
        if not id in STORE_IDS_GATHERED:
            STORE_IDS_GATHERED.append(id)
        STORE_info[name] = id
print(STORE_info)
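The code above gives me the expected result, but I don't think it is the best or most reliable way to produce this output. Can someone help me achieve the same result with cleaner, more reliable code?
Answer 0 (score: 5)
Use regex.
For example: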
import re
data = """
item: apple
store name: USA_1
store id: 1000
total: 200
item: apple
store name: USA_2
store id: 1001
total: 230
item: apple
store name: USA_3
store id: 1002
total: 330
item: apple
store name: UK1
store id: 2000
total: 20
item: apple
store name: UK_2
store id: 1021
total: 230
"""
name = re.findall(r"store name: (.*)", data) #Get Store Name
store = re.findall(r"store id: (.*)", data) #Get Store ID
print(dict(zip(name, store)))
Output:
{'UK1': '2000',
'UK_2': '1021',
'USA_1': '1000',
'USA_2': '1001',
'USA_3': '1002'}
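One caveat with zipping two independent findall results: if any record is missing its store id line, every later pair shifts by one. A minimal sketch that captures both fields in a single match is shown below. It assumes the store id line always directly follows the store name line, as in the sample data, and that the values contain no spaces; the names pattern and store_info are illustrative only.

```python
import re

data = """
item: apple
store name: USA_1
store id: 1000
total: 200
item: apple
store name: USA_2
store id: 1001
total: 230
item: apple
store name: USA_3
store id: 1002
total: 330
item: apple
store name: UK1
store id: 2000
total: 20
item: apple
store name: UK_2
store id: 1021
total: 230
"""

# Assumption: "store id" is on the line right after "store name".
# [ \t]* tolerates extra padding around the colon; \S+ captures the value
# without surrounding whitespace.
pattern = re.compile(
    r"^store name:[ \t]*(\S+)[ \t]*\n[ \t]*store id:[ \t]*(\S+)",
    re.MULTILINE,
)

# findall returns one (name, id) tuple per record, so the pairs cannot drift apart.
store_info = dict(pattern.findall(data))
print(store_info)
```

A store name that is not followed by a store id is simply skipped here instead of being paired with the next record's id. Note that the keys and values come out without the leading space shown in the question's expected dictionary.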