I have two lists:
a = ["hi", "hello", "hey"]
b = ["Sam", "dean"]
and a DataFrame with a column ques:
df = pd.DataFrame({'ques':["<input1> This is <input2>", "<input1> Sir, Do you know <input2>?"]})
I want to replace <input1> with each element of list a and <input2> with each element of list b, producing a set of unique questions.
So my expected output is:
['hi This is Sam',
'hi This is dean',
'hello This is Sam',
'hello This is dean',
'hey This is Sam',
'hey This is dean',
'hi Sir, Do you know Sam?',
'hi Sir, Do you know dean?',
'hello Sir, Do you know Sam?',
'hello Sir, Do you know dean?',
'hey Sir, Do you know Sam?',
'hey Sir, Do you know dean?']
The result can be either a list or a pandas column.
What I have tried
from itertools import product

c = list(product(a, b))
ques = []
for q in df['ques']:
    for i in c:
        temp = q.replace("<input1>", i[0]).replace("<input2>", i[1])
        ques.append(temp)
This gives me the expected result, but my data is very large, so I am looking for a more efficient solution.
Answer 0 (score: 2)
You can combine product and replace:
import itertools

# Build one substituted copy of df per (x, y) combination, then stack them.
dfs = [
    df.replace({'ques': {'<input1>': x, '<input2>': y}}, regex=True)
    for x, y in itertools.product(a, b)
]
pd.concat(dfs, ignore_index=True)
                            ques
0                 hi This is Sam
1       hi Sir, Do you know Sam?
2                hi This is dean
3      hi Sir, Do you know dean?
4              hello This is Sam
5    hello Sir, Do you know Sam?
6             hello This is dean
7   hello Sir, Do you know dean?
8                hey This is Sam
9      hey Sir, Do you know Sam?
10              hey This is dean
11    hey Sir, Do you know dean?