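I have two Series and want to check whether they are equal element-wise, with one extra condition: any combination of 'a' and 'b' should also be accepted as a match.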
first = pd.Series(['a', 'a', 'b', 'c', 'd'])
second = pd.Series(['A', 'B', 'C', 'C', 'K'])
Expected output:
0 True
1 True
2 False
3 True
4 False
So far I know that eq can compare two Series, but I am not sure how to include the extra condition:
def helper(s1, s2):
    return s1.str.lower().eq(s2.str.lower())
Answer 0 (score: 2)
You can combine boolean Series with the bitwise logical operators (& and |) to add the extra condition.
Like this:
condition_1 = first.str.casefold().eq(second.str.casefold())
condition_2 = first.str.casefold().isin(['a', 'b']) & second.str.casefold().isin(['a', 'b'])
result = condition_1 | condition_2
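For anyone who wants to run this against the data from the question, here is a minimal self-contained sketch of the same idea. The function name equal_or_both_in and its allowed parameter are made up for illustration and are not part of pandas.

```python
import pandas as pd

first = pd.Series(['a', 'a', 'b', 'c', 'd'])
second = pd.Series(['A', 'B', 'C', 'C', 'K'])


def equal_or_both_in(s1, s2, allowed=('a', 'b')):
    """Hypothetical helper: case-insensitive equality, or both values in `allowed`."""
    left = s1.str.casefold()
    right = s2.str.casefold()
    same = left.eq(right)  # plain case-insensitive match
    both_allowed = left.isin(list(allowed)) & right.isin(list(allowed))
    return same | both_allowed  # element-wise OR of the two boolean Series


result = equal_or_both_in(first, second)
print(result.tolist())  # [True, True, False, True, False]
```

The printed list matches the expected output in the question.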
Or with numpy:
import numpy
condition_1 = first.str.casefold().eq(second.str.casefold())
condition_2 = numpy.bitwise_and(
    first.str.casefold().isin(['a', 'b']),
    second.str.casefold().isin(['a', 'b'])
)
result = numpy.bitwise_or(condition_1, condition_2)
Answer 1 (score: 1)
You can use replace to map every 'a' to 'b' before comparing:
def transform(s):
    return s.str.lower().replace({'a': 'b'})
transform(first).eq(transform(second))
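If more than one group of interchangeable values is needed, the same trick can be generalized by mapping every member of a group to one representative. This is only a sketch under that assumption; canonicalize and groups are hypothetical names, not pandas API.

```python
import pandas as pd

first = pd.Series(['a', 'a', 'b', 'c', 'd'])
second = pd.Series(['A', 'B', 'C', 'C', 'K'])


def canonicalize(s, groups):
    """Hypothetical helper: lowercase, then map each group member to the group's first item."""
    lookup = {member: group[0] for group in groups for member in group}
    lowered = s.str.lower()
    # Values that belong to no group are left unchanged.
    return lowered.map(lambda value: lookup.get(value, value))


groups = [('a', 'b')]  # 'a' and 'b' are treated as the same value
result = canonicalize(first, groups).eq(canonicalize(second, groups))
print(result.tolist())  # [True, True, False, True, False]
```

With the single group ('a', 'b') this gives the same result as the replace version above.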
Answer 2 (score: 0)
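You can specify an "ascii_distance", the largest allowed difference between the character codes of the two lowercased values, as follows: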
import pandas as pd
s1 = pd.Series(['a', 'a', 'b', 'c', 'd'])
s2 = pd.Series(['A', 'A', 'b', 'C', 'F'])
def helper(s1, s2, ascii_distance):
    s1_processed = [ord(c1) for c1 in s1.str.lower()]
    s2_processed = [ord(c2) for c2 in s2.str.lower()]
    print(f'ascii_distance = {ascii_distance}')
    print(f's1_processed = {s1_processed}')
    print(f's2_processed = {s2_processed}')
    result = []
    for i in range(len(s1)):
        result.append((abs(s1_processed[i] - s2_processed[i]) <= ascii_distance))
    return result
ascii_distance = 2
print(helper(s1, s2, ascii_distance))
Output:
ascii_distance = 2
s1_processed = [97, 97, 98, 99, 100]
s2_processed = [97, 97, 98, 99, 102]
[True, True, True, True, True]
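Note that this answer uses different sample data than the question. As a rough check, here is a vectorized sketch of the same distance test applied to the question's data; within_ascii_distance is a hypothetical name, and the code assumes every value is a single character.

```python
import pandas as pd

first = pd.Series(['a', 'a', 'b', 'c', 'd'])
second = pd.Series(['A', 'B', 'C', 'C', 'K'])


def within_ascii_distance(s1, s2, ascii_distance):
    """Hypothetical helper: True where the lowercased character codes differ by at most ascii_distance."""
    codes1 = s1.str.lower().map(ord)  # assumes single-character strings
    codes2 = s2.str.lower().map(ord)
    return (codes1 - codes2).abs() <= ascii_distance


result = within_ascii_distance(first, second, 1)
print(result.tolist())  # [True, True, True, True, False]
```

With a distance of 1, index 2 ('b' vs 'C') also counts as a match, so a pure distance threshold does not reproduce the expected output from the question.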