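I want to compare two lists of strings, textSplitted and column1.
Currently I loop over both lists. Where the items differ, column2 and column3 should get a hyphen (-) at that position. Where they match, the values of column2 and column3 should stay in place.
Note 1: column1, column2 and column3 initially have the same length.
Note 2: column1 never contains an element that textSplitted does not have.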
textSplitted = ['wow','this','is','some','nice','text']
column1 = ['this','is','some','text']
column2 = ['A','B','C','D']
column3 = ['Q1','Q2','Q3','Q4',]
i = 0
j = 0
for item in textSplitted:
    if textSplitted[i] == column1[j]:
        i+=1
        j+=1
    elif textSplitted[i] != column1[j]:
        column2.insert(j,"-")
        column3.insert(j,"-")
        i+=1
print(textSplitted)
print(column2)
print(column3)
This produces the output:
['wow', 'this', 'is', 'some', 'nice', 'text']
['-', 'A', 'B', '-', 'C', 'D']
['-', 'Q1', 'Q2', '-', 'Q3', 'Q4']
But I want to get:
['wow', 'this', 'is', 'some', 'nice', 'text']
['-', 'A', 'B', 'C', '-', 'D']
['-', 'Q1', 'Q2', 'Q3', '-', 'Q4']
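Note: if I add extra elements to textSplitted, I get a "list index out of range" error. However, once column1 has run out of elements to compare, the remaining elements of textSplitted should get corresponding hyphens (-) in column2 and column3. E.g.: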
['wow', 'this', 'is', 'some', 'nice', 'text','yes','indeed']
['-', 'A', 'B', 'C', '-', 'D','-','-']
['-', 'Q1', 'Q2', 'Q3', '-', 'Q4','-','-']
Answer 0 (score: 1)
This should do it:
textSplitted = ['wow','this','is','some','nice','text','yes','indeed']
column1 = ['this','is','some','text']
column2 = ['A','B','C','D']
column3 = ['Q1','Q2','Q3','Q4',]
i = 0
j = 0
# Walk both lists; insert at i (the position in textSplitted), not j
while j < len(column1):
    if textSplitted[i] == column1[j]:
        i+=1
        j+=1
    elif textSplitted[i] != column1[j]:
        column2.insert(i,"-")
        column3.insert(i,"-")
        i+=1
# column1 is exhausted: pad for the remaining words
while i< len(textSplitted):
    column2.append("-")
    column3.append("-")
    i+=1
print(textSplitted)
print(column2)
print(column3)
Prints:
['wow', 'this', 'is', 'some', 'nice', 'text', 'yes', 'indeed']
['-', 'A', 'B', 'C', '-', 'D', '-', '-']
['-', 'Q1', 'Q2', 'Q3', '-', 'Q4', '-', '-']
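If you would rather not mutate column2 and column3 in place, the same two-pointer walk can build new lists instead. This is only a sketch: the function name align_columns and its fill parameter are made up for illustration, and it assumes, as in the question, that column1 is an in-order subsequence of textSplitted.

```python
def align_columns(words, keys, *columns, fill="-"):
    """Return padded copies of `columns`, one entry per item in `words`."""
    aligned = [[] for _ in columns]
    j = 0  # current position in keys
    for word in words:
        if j < len(keys) and word == keys[j]:
            # Match: copy the value at position j from every column
            for out, col in zip(aligned, columns):
                out.append(col[j])
            j += 1
        else:
            # No match, or keys is exhausted: pad every column
            for out in aligned:
                out.append(fill)
    return aligned

textSplitted = ['wow', 'this', 'is', 'some', 'nice', 'text', 'yes', 'indeed']
column1 = ['this', 'is', 'some', 'text']
column2 = ['A', 'B', 'C', 'D']
column3 = ['Q1', 'Q2', 'Q3', 'Q4']

new2, new3 = align_columns(textSplitted, column1, column2, column3)
print(new2)
print(new3)
```

The `j < len(keys)` check is what avoids the "list index out of range" error when textSplitted has trailing extra words.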
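Answer 1 (score: 1)
This may or may not be a requirement, but the posted solutions (as they were when I looked at them; they may have been updated since) fail if an element of column1 appears more than once in textSplitted, for example: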
textSplitted = ['wow','this','is','some','nice','text','yes','indeed','it','is']
column1 = ['this','is','some','text']
output will be:
['wow', 'this', 'is', 'some', 'nice', 'text', 'yes', 'indeed', 'it', 'is']
['-', 'A', 'B', 'C', '-', 'D', '-', '-', '-', '-']
['-', 'Q1', 'Q2', 'Q3', '-', 'Q4', '-', '-', '-', '-']
failing to pick up the repeated 'is'.
The following fixes that potential problem:
textSplitted = ['wow','this','is','some','nice','text','yes','indeed','it','is']
column1 = ['this','is','some','text']
column2 = ['A','B','C','D']
column3 = ['Q1','Q2','Q3','Q4',]
a = list(map(lambda w: w if w in column1 else '-', textSplitted))
column2 = list(map(lambda w: w if w=='-' else column2[column1.index(w)], a))
column3 = list(map(lambda w: w if w=='-' else column3[column1.index(w)], a))
print(textSplitted)
print(column2)
print(column3)
['wow', 'this', 'is', 'some', 'nice', 'text', 'yes', 'indeed', 'it', 'is']
['-', 'A', 'B', 'C', '-', 'D', '-', '-', '-', 'B']
['-', 'Q1', 'Q2', 'Q3', '-', 'Q4', '-', '-', '-', 'Q2']
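The same order-independent lookup can also be written with a dictionary, which avoids calling column1.index for every word. Treat this as a sketch rather than a drop-in replacement: the names lookup, pairs, new_column2 and new_column3 are mine, and if column1 contained duplicate keys the dictionary would keep the last one, whereas column1.index returns the first.

```python
textSplitted = ['wow', 'this', 'is', 'some', 'nice', 'text', 'yes', 'indeed', 'it', 'is']
column1 = ['this', 'is', 'some', 'text']
column2 = ['A', 'B', 'C', 'D']
column3 = ['Q1', 'Q2', 'Q3', 'Q4']

# Map each key in column1 to its (column2, column3) values
lookup = {key: (c2, c3) for key, c2, c3 in zip(column1, column2, column3)}

# Words missing from column1 fall back to a pair of hyphens
pairs = [lookup.get(word, ('-', '-')) for word in textSplitted]
new_column2 = [p[0] for p in pairs]
new_column3 = [p[1] for p in pairs]

print(textSplitted)
print(new_column2)
print(new_column3)
```

Like the map/lambda version, this ignores word order entirely, so the repeated 'is' at the end picks up 'B' and 'Q2'.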
Answer 2 (score: 0)
You can do it more simply:
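j = 0  # number of hyphens inserted so far
for i, word in enumerate(textSplitted):
    # i - j is the current position in column1; once it runs past the end,
    # every remaining word is unmatched and gets a hyphen
    if i - j >= len(column1) or word != column1[i-j]:
        column2.insert(i, '-')
        column3.insert(i, '-')
        j+= 1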
Answer 3 (score: 0)
You have to insert at index i (the position in textSplitted), not at j.
textSplitted = ['wow','this','is','some','nice','text']
column1 = ['this','is','some','text']
column2 = ['A','B','C','D']
column3 = ['Q1','Q2','Q3','Q4',]
i = 0
j = 0
for i in range(len(textSplitted)):
    # Debug output: current position and item in each list
    print(i, textSplitted[i], j, column1[j])
    if textSplitted[i] != column1[j]:
        column2.insert(i,"-")
        column3.insert(i,"-")
    else:
        j = j+1
print(textSplitted)
print(column2)
print(column3)
Answer 4 (score: 0)
In a case like this I prefer a mapping approach, so here is a different solution with the following advantages:
Code:
textSplitted = ['wow', 'this', 'is', 'some', 'nice', 'text','yes','indeed']
column1 = ['this','is','some','text']
column2 = ['A','B','C','D']
column3 = ['Q1','Q2','Q3','Q4',]
last_i = 0
mapper = []
for w in textSplitted:
    try:
        # Search column1 only from the position after the last match
        new_i = column1.index(w, last_i)
    except ValueError:
        mapper.append("-")
    else:
        mapper.append(new_i)
        last_i = new_i+1
# mapper = ["-", 0, 1, 2, "-", 3, "-", "-"]
print (textSplitted)
# Compare with != rather than "is not": identity checks against literals are unreliable
print ([column2[i] if i != "-" else "-" for i in mapper])
print ([column3[i] if i != "-" else "-" for i in mapper])
>>>
['wow', 'this', 'is', 'some', 'nice', 'text', 'yes', 'indeed']
['-', 'A', 'B', 'C', '-', 'D', '-', '-']
['-', 'Q1', 'Q2', 'Q3', '-', 'Q4', '-', '-']
You can try it with a repeated occurrence; the second 'text' is left unmatched:
textSplitted = ['wow', 'this', 'is', 'some', 'nice', 'text','yes','indeed', 'text']
column1 = ['this','is','some','text']
column2 = ['A','B','C','D']
column3 = ['Q1','Q2','Q3','Q4',]
...
>>>
['wow', 'this', 'is', 'some', 'nice', 'text', 'yes', 'indeed', 'text']
['-', 'A', 'B', 'C', '-', 'D', '-', '-', '-']
['-', 'Q1', 'Q2', 'Q3', '-', 'Q4', '-', '-', '-']
Or even map the second 'text' to the correct result:
textSplitted = ['wow', 'this', 'is', 'some', 'nice', 'text','yes','indeed', 'text']
column1 = ['this','is','some','text', 'text']
column2 = ['A','B','C','D', 'E']
column3 = ['Q1','Q2','Q3','Q4','Q5']
...
>>>
['wow', 'this', 'is', 'some', 'nice', 'text', 'yes', 'indeed', 'text']
['-', 'A', 'B', 'C', '-', 'D', '-', '-', 'E']
['-', 'Q1', 'Q2', 'Q3', '-', 'Q4', '-', '-', 'Q5']
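To rerun the examples above without retyping the loop each time, the mapper logic can be wrapped in two small functions. This is a sketch under my own naming: build_mapper and apply_mapper are hypothetical, and I use None rather than "-" as the "no match" marker so the mapper holds only indices or None.

```python
def build_mapper(words, keys):
    """For each word, the index of its next unused match in keys, or None."""
    mapper = []
    last_i = 0
    for w in words:
        try:
            # Only search keys from the position after the previous match
            new_i = keys.index(w, last_i)
        except ValueError:
            mapper.append(None)
        else:
            mapper.append(new_i)
            last_i = new_i + 1
    return mapper

def apply_mapper(mapper, column, fill="-"):
    """Pick column values by index, using fill where there was no match."""
    return [fill if i is None else column[i] for i in mapper]

textSplitted = ['wow', 'this', 'is', 'some', 'nice', 'text', 'yes', 'indeed', 'text']
column1 = ['this', 'is', 'some', 'text', 'text']
column2 = ['A', 'B', 'C', 'D', 'E']
column3 = ['Q1', 'Q2', 'Q3', 'Q4', 'Q5']

mapper = build_mapper(textSplitted, column1)
print(mapper)
print(apply_mapper(mapper, column2))
print(apply_mapper(mapper, column3))
```

Because the mapper is built once and applied per column, adding a fourth or fifth column only takes one more apply_mapper call.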