Question

I'm getting the classic codec encoding error:

UnicodeEncodeError: 'ascii' codec can't encode characters in position 11-12: ordinal not in range(128)

from the following code:

def compare_handles(handle):
    new_df = df[df['Creation Specifications'].astype(str).str.contains(handle)]

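The initial dataframe comes from an Excel file, where the column contains strings, special characters, and hyperlinks. I had problems with this line from the start until I added astype(str). However, this error is now thrown further down in the code.

My question is: how do I encode with a better codec when I have to use the str.contains function? Based on the documentation covering the basics of this problem, the key is to drop the str part and add an encoding. However, because I'm using the pandas function contains, it isn't possible to "just drop the str".

I could create a mapping, but I'm wondering if there is a cleaner answer.

Note: I did try encoding with utf-8 and latin in every possible position in the line of code above.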

Answer 1

You can double up on the pandas .str accessor in that line: encode first, then call contains.

new_df = complete_df[complete_df['Creation Specifications'].str.encode('utf-8', errors='ignore').str.contains(handle)]
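
For context, here is a minimal plain-Python sketch of what the encode step does, independent of pandas. It assumes Python 3, and the sample string and `handle` value are made up for illustration. The error in the question looks like the Python 2 behaviour where converting Unicode text to str implicitly uses the ASCII codec, but that is an assumption based on the traceback, not something the question confirms.

```python
# Plain-Python sketch (Python 3) of the codec behaviour behind the error.
# The sample text and handle are hypothetical values for illustration.
text = "Spec \u2013 caf\u00e9"  # contains an en dash and an accented e

# Encoding to ASCII fails for any character above code point 127.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# UTF-8 can represent these characters, so encoding succeeds and is lossless.
utf8_bytes = text.encode("utf-8", errors="ignore")

# With the ASCII codec, errors="ignore" silently drops what it cannot encode.
ascii_lossy = text.encode("ascii", errors="ignore")

# After encoding, a substring check needs bytes on both sides.
handle = "caf\u00e9"
found = handle.encode("utf-8") in utf8_bytes

print(ascii_failed, utf8_bytes, ascii_lossy, found)
```

Note that once the column is encoded, the values are bytes rather than text, so the pattern passed to contains has to be compatible with that. On Python 3, where str is already Unicode, the encode step may not be needed at all.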