I have the two lists below:
Token_Sentence=[['This','is','a','book'],['This','is','a','cat'],['Those','are','two','books']]
Mapping=[['This',1],['is',2],['a',3],['book',4],['cat',5],['Those',6],['are',7],['two',8],['books',9]]
I want to map Token_Sentence (convert each token to its index number) so that the result looks like this:
[[1,2,3,4],[1,2,3,5],[6,7,8,9]]
Here is my code:
for a in range(len(Token_Sentence)):
    for b in range(len(Token_Sentence[a])):
        # Scan the whole Mapping list for every single token
        for c in range(len(Mapping)):
            if Token_Sentence[a][b]==Mapping[c][0]:
                Token_Sentence[a][b]=Mapping[c][1]
The problem is that it takes a very long time to run (my real lists are very large).
Is there a faster and simpler way to do this?
Answer 0 (score: 7)
You can build a dict from Mapping and look each token up in it:
Token_Sentence=[['This','is','a','book'],['This','is','a','cat'],['Those','are','two','books']]
Mapping=[['This',1],['is',2],['a',3],['book',4],['cat',5],['Those',6],['are',7],['two',8],['books',9]]
d = dict(Mapping)
new_sentence = [[d[b] for b in i] for i in Token_Sentence]
Output:
[[1, 2, 3, 4], [1, 2, 3, 5], [6, 7, 8, 9]]
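One caveat with d[b]: it raises KeyError for any token that is not in Mapping. If your real data can contain such tokens, something along these lines may help. This is only a sketch; UNKNOWN_INDEX and the token 'dog' are made up for illustration and are not part of the question.

```python
Token_Sentence = [['This', 'is', 'a', 'book'], ['This', 'is', 'a', 'dog']]
Mapping = [['This', 1], ['is', 2], ['a', 3], ['book', 4]]

# Hypothetical placeholder index for tokens missing from Mapping (an assumption)
UNKNOWN_INDEX = 0

# Build the lookup table once; each lookup is then a hash lookup
# instead of a scan over the whole Mapping list
d = dict(Mapping)

# dict.get returns UNKNOWN_INDEX instead of raising KeyError
new_sentence = [[d.get(token, UNKNOWN_INDEX) for token in sentence]
                for sentence in Token_Sentence]
print(new_sentence)  # [[1, 2, 3, 4], [1, 2, 3, 0]]
```

Whether 0 is a sensible placeholder depends on how the indices are used later, so pick a value that cannot collide with a real index.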
Answer 1 (score: 0)
The answer above is good; I just want to show how to do it if you don't want to convert to a dict:
Token_Sentence=[['This','is','a','book'],['This','is','a','cat'],['Those','are','two','books']]
Mapping=[['This',1],['is',2],['a',3],['book',4],['cat',5],['Those',6],['are',7],['two',8],['books',9]]
print([[k[1] for j in i for k in Mapping if j==k[0]] for i in Token_Sentence ])
Output:
[[1, 2, 3, 4], [1, 2, 3, 5], [6, 7, 8, 9]]
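Note that this version still scans Mapping once per token, so it is unlikely to fix the speed problem from the question, and it silently drops any token that has no entry in Mapping. A rough way to compare the two approaches is sketched below. The data is synthetic and the names with_dict and with_scan are made up for this example; actual timings will vary by machine.

```python
import timeit

# Synthetic data (an assumption): 1000 distinct words, 100 sentences of 10 tokens
vocab = [f"w{i}" for i in range(1000)]
Mapping = [[word, i + 1] for i, word in enumerate(vocab)]
Token_Sentence = [vocab[i:i + 10] for i in range(0, 1000, 10)]

def with_dict():
    # One pass to build the dict, then one hash lookup per token
    d = dict(Mapping)
    return [[d[token] for token in sentence] for sentence in Token_Sentence]

def with_scan():
    # Full scan of Mapping for every token
    return [[k[1] for j in i for k in Mapping if j == k[0]] for i in Token_Sentence]

# Both approaches produce the same result on this data
assert with_dict() == with_scan()

print("dict:", timeit.timeit(with_dict, number=3))
print("scan:", timeit.timeit(with_scan, number=3))
```

The gap should widen as Mapping grows, since the scan does work proportional to the number of tokens times the length of Mapping.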