I have two pandas DataFrames, one containing strings and one containing substrings:
import pandas as pd

strings = pd.DataFrame(
    [
        {"id": 0, "string": "abcdef"},
        {"id": 1, "string": "bcdef"},
        {"id": 2, "string": "cdef"},
    ]
)
substrings = pd.DataFrame(
    [
        {"id": 0, "string": "a"},
        {"id": 1, "string": "bc"},
        {"id": 2, "string": "def"},
    ]
)
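I want to find the indices of all occurrences of each substring in each string. Right now, I am doing something like this:
# For each substring row, collect its matches in every string
substrings.apply(
    lambda substring: strings["string"].str.findall(substring["string"]),
    axis=1
)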
Is there a better or more efficient way to do this?
Answer 0: (score: 0)
I believe you need:
s = strings["string"].str.findall('|'.join(substrings.string))
print(s)
0 [a, bc, def]
1 [bc, def]
2 [def]
Name: string, dtype: object
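Note that `str.findall` returns the matched text, not where it occurs. If you need the positions the question asks about, one possible approach is `re.finditer`, which exposes the start offset of each match. The following is a minimal sketch rather than a definitive solution: the helper name `find_positions` is made up for illustration, and `re.escape` is added on the assumption that the substrings should be matched literally. Because the substrings are combined into one alternation pattern, only non-overlapping matches are reported, so overlapping substrings would need a separate search per substring.

```python
import re

import pandas as pd

strings = pd.DataFrame({"id": [0, 1, 2], "string": ["abcdef", "bcdef", "cdef"]})
substrings = pd.DataFrame({"id": [0, 1, 2], "string": ["a", "bc", "def"]})

# Escape each substring so regex metacharacters are matched literally,
# then join them into a single alternation pattern.
pattern = re.compile("|".join(map(re.escape, substrings["string"])))


def find_positions(text):
    # Hypothetical helper: (matched substring, start index) for every
    # non-overlapping match found in text, scanning left to right.
    return [(m.group(0), m.start()) for m in pattern.finditer(text)]


positions = strings["string"].apply(find_positions)
print(positions)
```

Each element of `positions` is a list of `(substring, start)` tuples for the corresponding row of `strings`, for example `("def", 3)` for the first string.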