I have a long string object with the following format:
myString = "[name = john, family = candy, age = 72],[ name = jeff, family = Thomson, age = 24]"
The real string is of course much longer. I also have three related lists:
Names = []
Families = []
Ages = []
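I want to read the string character by character, extract the data, and append it to the appropriate list. Can anyone help me figure out how to split the string into these variables? What I need is something like this: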
Names = ["john", "jeff", ...]
Families = ["candy", "Thomson", ...]
Ages = [72, 24, ...]
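Answer 0 (score: 5)
This is easy to do with a regular expression. Basically, construct a regex that extracts the name, family, and age from the string, then pull the relevant data out of the returned tuples to build your lists.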
import re

if __name__ == '__main__':
    myString = "[name = john adams, family = candy, age = 72],[ name = jeff, family = Thomson, age = 24]"
    # findall returns one (name, family, age) tuple of strings per record.
    answers = re.findall(r"\[\s*name = ([^,]+), family = (\w+), age = (\d+)\]", myString)
    names = [x[0] for x in answers]
    families = [x[1] for x in answers]
    # A list comprehension keeps ages a real list on Python 3, where map() is lazy.
    ages = [int(x[2]) for x in answers]
    print("names:", names)
    print("families:", families)
    print("ages:", ages)
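As a variation on the tuple extraction above, the three lists can be built in one step by transposing the findall result with zip. This is only a sketch: the snake_case names are illustrative, and it assumes the regex finds at least one record (with no matches the unpacking would fail).

```python
import re

my_string = "[name = john adams, family = candy, age = 72],[ name = jeff, family = Thomson, age = 24]"
pattern = r"\[\s*name = ([^,]+), family = (\w+), age = (\d+)\]"
answers = re.findall(pattern, my_string)

# zip(*answers) transposes the list of (name, family, age) tuples
# into one tuple per field. Assumes at least one match was found.
names, families, ages = (list(column) for column in zip(*answers))
ages = [int(age) for age in ages]
```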
Answer 1 (score: 3)
import re

Names = []
Families = []
Ages = []
myString = "[name = john, family = candy, age = 72],[ name = jeff, family = Thomson, age = 24]"
myregex = re.compile(r"name = (?P<name>.*?), family = (?P<family>.*?), age = (?P<age>\d+)")
for list_ in myString.split(']'):
    match = myregex.search(list_)
    if match is None:
        # The text after the final "]" is empty and holds no record.
        continue
    found = match.groupdict()
    Names.append(found['name'])
    Families.append(found['family'])
    Ages.append(int(found['age']))
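The named-group idea can also be applied without splitting the string first, by letting re.finditer walk over every record. This is a sketch under the assumption that names and families never contain a comma or a closing bracket; the pattern and variable names are illustrative, not part of the answer above.

```python
import re

my_string = "[name = john, family = candy, age = 72],[ name = jeff, family = Thomson, age = 24]"

# Assumption: name and family values contain neither "," nor "]".
record_re = re.compile(
    r"name = (?P<name>[^,\]]+), family = (?P<family>[^,\]]+), age = (?P<age>\d+)"
)

names, families, ages = [], [], []
for match in record_re.finditer(my_string):
    # Each match exposes the three fields by group name.
    names.append(match.group("name").strip())
    families.append(match.group("family").strip())
    ages.append(int(match.group("age")))
```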
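Answer 2 (score: 1)
Break the problem down:
You will run into trouble because the entities between the commas are not well-formed dictionaries.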
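One way to break the problem down is records first, then fields, then key/value pairs. The sketch below uses a hypothetical helper, parse_records, and assumes that records are separated by exactly "],[" and that no value contains a comma or an equals sign.

```python
my_string = "[name = john, family = candy, age = 72],[ name = jeff, family = Thomson, age = 24]"

def parse_records(text):
    """Hypothetical helper: split text into records, fields, then key/value pairs."""
    records = []
    # Step 1: drop the outer brackets and split on the "],[" record separator.
    for chunk in text.strip().strip("[]").split("],["):
        record = {}
        # Step 2: split each record into "key = value" fields.
        for field in chunk.split(","):
            # Step 3: split each field on the first "=" only.
            key, _, value = field.partition("=")
            record[key.strip()] = value.strip()
        records.append(record)
    return records

records = parse_records(my_string)
names = [r["name"] for r in records]
families = [r["family"] for r in records]
ages = [int(r["age"]) for r in records]
```

Because the fields are plain "key = value" text rather than dictionary literals, each one has to be split and stripped by hand before it can be stored.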
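Answer 3 (score: 1)
You should parse this into a list of dictionaries rather than three separate lists that are related only by the order of their data.
Something like data = [ {"name": "John", "family": "Candy", "age": 72 }, ...]
If you cannot change the data source, one possibility is to do some naive parsing with string methods such as split: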
myString = "[name = john, family = candy, age = 72],[ name = jeff, family = Thomson, age = 24]"
data = []
for block in myString.split("]"):
    # The text after the final "]" is empty: nothing left to parse.
    if not block:
        break
    # Keep only what follows the opening "[".
    block = block.split("[")[1]
    entry_dict = {}
    for part in block.split(","):
        key, value = part.split("=")
        key = key.strip()
        value = value.strip()
        if key == "age":
            value = int(value)
        entry_dict[key] = value
    data.append(entry_dict)
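Or, if you are using Python 2.7 (or 3.1) and want shorter code, you can use a dict comprehension (in other versions you can get the same result with a generator expression: produce the (key, value) tuples and wrap them in a dict call):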
myString = "[name = john, family = candy, age = 72],[ name = jeff, family = Thomson, age = 24]"
data = []
for block in myString.split("]"):
    if not block:
        break
    block = block.split("[")[1]
    # Dict comprehension: one key: value entry per "key = value" part.
    data.append({part.split("=")[0].strip(): part.split("=")[1].strip() for part in block.split(",")})
(This version does not convert "age" to a number, though.)
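If the age conversion is wanted in the comprehension-based version as well, one option is to route every value through a small function. This is a sketch only: convert is a hypothetical helper, and it assumes the only numeric field is called "age".

```python
my_string = "[name = john, family = candy, age = 72],[ name = jeff, family = Thomson, age = 24]"

def convert(key, value):
    """Hypothetical helper: turn the age into an int, leave other values as strings."""
    return int(value) if key == "age" else value

data = []
for block in my_string.split("]"):
    if not block:
        break
    block = block.split("[")[1]
    # Split each "key = value" part once, then strip both halves.
    pairs = (tuple(s.strip() for s in part.split("=", 1)) for part in block.split(","))
    data.append({key: convert(key, value) for key, value in pairs})
```

The three lists from the question can then be read off the result, for example with [entry["name"] for entry in data].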