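Below is the code I'm working on. I can't extract the first and last names from a list of names. The code keeps failing with "too many values to unpack", probably because a name such as ELSWICK RICK Jr has three parts rather than two. RICK Jr should be the first name and ELSWICK should be the last name.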
names=[' HE XF, Wei W, Liu ZZ, Shen XL',' STARK LE, AARON FIN, LEO DE CAP, ADAM FORTH, KARAN SINGH',' ELSWICK RICK Jr, ASTO FON, SAM MARLON, KIM ZENG']
names1 = []
for l1 in names:
    names1.append(l1.split(',')) #To split the line based on commas
first_names=[]
last_names=[]
for line in names1:
    last,first= line[0][:].split()
    first_names.append(first)
    last_names.append(last)
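This results in the following error:
Traceback (most recent call last):
  File "", line 10, in
    last,first= line[0][:].split()
ValueError: too many values to unpack (expected 2)
The output I'm expecting looks like this: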
first_names=[ 'XF, W, ZZ, XL', 'LE, FIN, CAP, FORTH, SINGH', 'RICK Jr, FON, MARLON, ZENG' ]
last_names=[' HE, Wei, Liu, Shen',' STARK, AARON, LEO DE, ADAM, KARAN',' ELSWICK, ASTO, SAM, KIM']
Answer 0: (score: 2)
Edited to match the OP's formatting requirements:
names=[' HE XF, Wei W, Liu ZZ, Shen XL',' STARK LE, AARON FIN, LEO DE CAP, ADAM FORTH, KARAN SINGH',' ELSWICK RICK Jr, ASTO FON, SAM MARLON, KIM ZENG']
names1 = []
for l1 in names:
    names1.append(l1.split(','))
first_names=[]
last_names=[]
for sub_list in names1:
    temp_sub_firsts =""
    temp_sub_lasts =""
    for full_name in sub_list:
        full_name_split = full_name.split(' ')
        # Every name starts with a space, so the first item is an empty string
        full_name_split.pop(0)
        # The first real word is the last name
        temp_sub_lasts += full_name_split.pop(0)
        if full_name != sub_list[-1]:
            temp_sub_lasts += ', '
        # Everything that remains is the first name
        temp_first = ""
        for sub_first in full_name_split:
            temp_first += sub_first + ' '
        temp_sub_firsts += temp_first
        if full_name != sub_list[-1]:
            temp_sub_firsts += ', '
    first_names.append(temp_sub_firsts)
    last_names.append(temp_sub_lasts)
print(first_names)
print(last_names)
Output:
first_names[]=
['XF , W , ZZ , XL ', 'LE , FIN , DE CAP , FORTH , SINGH ', 'RICK Jr , FON , MARLON , ZENG ']
last_names[]=
['HE, Wei, Liu, Shen', 'STARK, AARON, LEO, ADAM, KARAN', 'ELSWICK, ASTO, SAM, KIM']
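The same rule (first word is the last name, everything after it is the first name) can be sketched more compactly with `str.split(maxsplit=1)`, which splits only at the first run of whitespace. This is a minimal sketch, not the answerer's code: the helper name `split_names` is hypothetical, it assumes every full name has at least two words, and unlike the loop above it strips the trailing spaces.

```python
names = [' HE XF, Wei W, Liu ZZ, Shen XL',
         ' STARK LE, AARON FIN, LEO DE CAP, ADAM FORTH, KARAN SINGH',
         ' ELSWICK RICK Jr, ASTO FON, SAM MARLON, KIM ZENG']


def split_names(line):
    """Return (first_names, last_names) strings for one comma-separated line.

    Hypothetical helper. Assumes each full name has at least two words
    and that the last name is always the first word.
    """
    firsts, lasts = [], []
    for full_name in line.split(','):
        # maxsplit=1 always yields two parts: the first word and the rest,
        # so "ELSWICK RICK Jr" becomes ["ELSWICK", "RICK Jr"]
        last, first = full_name.split(maxsplit=1)
        lasts.append(last)
        firsts.append(first.strip())
    return ', '.join(firsts), ', '.join(lasts)


first_names, last_names = [], []
for line in names:
    firsts, lasts = split_names(line)
    first_names.append(firsts)
    last_names.append(lasts)

print(first_names)
print(last_names)
```

Because the split happens only once per name, a three-word name no longer raises "too many values to unpack". As in the loop above, LEO DE CAP is split into LEO and DE CAP.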
Answer 1: (score: 2)
You can also try this:
import re

names=[' HE XF, Wei W, Liu ZZ, Shen XL',' STARK LE, AARON FIN, LEO DE CAP, ADAM FORTH, KARAN SINGH',' ELSWICK RICK Jr, ASTO FON, SAM MARLON, KIM ZENG']
reg1=re.compile(r"\w+(?<!,)\s(?=(?!Jr)[\w ]+,?)")
reg2=re.compile(r'(?<!,)\s(?:(?!Jr|DE)[\w ]+(?=,?))')
first_names=[reg1.sub("",m.strip()) for m in names]
last_names=[reg2.sub("",m.strip()) for m in names]
print("{}\n{}".format(first_names,last_names))
Output:
['XF, W, ZZ, XL', 'LE, FIN, CAP, FORTH, SINGH', 'RICK Jr, FON, MARLON, ZENG']
['HE, Wei, Liu, Shen', 'STARK, AARON, LEO DE, ADAM, KARAN', 'ELSWICK, ASTO, SAM, KIM']
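Both patterns hard-code `Jr` and `DE`, so they need editing whenever a new suffix or surname particle appears. A regex-free sketch of the same rule is given below. The names `SURNAME_PARTICLES` and `split_full_name` are hypothetical, and the rule itself (a last name is its first word plus any following particle such as `DE`) is an assumption inferred from the expected output in the question.

```python
names = [' HE XF, Wei W, Liu ZZ, Shen XL',
         ' STARK LE, AARON FIN, LEO DE CAP, ADAM FORTH, KARAN SINGH',
         ' ELSWICK RICK Jr, ASTO FON, SAM MARLON, KIM ZENG']

# Assumption: words listed here belong to the last name, not the first name.
SURNAME_PARTICLES = {"DE"}


def split_full_name(full_name):
    """Return (first, last) for one full name (hypothetical helper)."""
    words = full_name.split()
    end = 1  # the last name always includes the first word
    # Absorb following particles, but keep at least one word for the first name
    while end < len(words) - 1 and words[end] in SURNAME_PARTICLES:
        end += 1
    return ' '.join(words[end:]), ' '.join(words[:end])


first_names, last_names = [], []
for line in names:
    pairs = [split_full_name(n) for n in line.split(',')]
    first_names.append(', '.join(first for first, _ in pairs))
    last_names.append(', '.join(last for _, last in pairs))

print(first_names)
print(last_names)
```

Suffixes such as `Jr` need no special handling here, since everything after the last name is kept as the first name. Supporting another particle only means adding it to the set.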