Question

Is it wrong to perform operations on strings when one of them is of type str and the other is of type unicode?

Example:

image_url = u"http://sample.com"

# image_url is a unicode string

if image_url.startswith("//"):
    image_url = "https://" + image_url    # combining a str with a unicode string

or

image_url = "http://sample.com"
if image_url.startswith(u"//"):
    image_url = "https://" + image_url

or

image_url = "http://sample.com"
if image_url.startswith("//"):
    image_url = u"https://" + image_url

or when replacing parts of a string with a regular expression:

import re

cleaned_breadcrumb = re.sub(r"[^A-Za-z0-9>|]+", u"", u"sample text")

or

cleaned_breadcrumb = re.sub(r"[^A-Za-z0-9>|]+", "", u"sample text")

or

cleaned_breadcrumb = re.sub(r"[^A-Za-z0-9>|]+", u"", "sample text")

or

d = {u"one":"two"}

if "one" in d:
    print("yes")

Answer 1

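Both are subclasses of basestring, so no. As you know, mixed-type expressions are coerced to unicode. That is not wrong, but it can lead to some surprises, especially when doing text I/O on files. These surprises are inherent in the ambiguous nature of the data held in Python 2 strings. The only complete solution is to move to Python 3.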
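
To illustrate the kind of surprise meant here, the sketch below spells out by hand the ASCII decode that Python 2 performs implicitly when a str meets a unicode (assuming the default encoding has not been changed). The helper name concat_like_py2 and the sample values are made up for illustration, and the code is written so that it also runs on Python 3.

```python
def concat_like_py2(byte_string, text):
    """Mimic Python 2's implicit str-to-unicode coercion explicitly.

    Python 2 decodes the byte string with the default encoding
    (ASCII unless reconfigured) before concatenating.
    """
    return byte_string.decode("ascii") + text


# Pure-ASCII bytes combine without trouble.
ok = concat_like_py2(b"https:", u"//sample.com")

# Non-ASCII bytes (here the UTF-8 encoding of an accented "e")
# fail at the point where the two types are mixed.
try:
    concat_like_py2(b"caf\xc3\xa9", u"/menu")
    failed = False
except UnicodeDecodeError:
    failed = True
```

The examples in the question all use ASCII-only literals, which is why none of them fails; the error only shows up once real-world non-ASCII data reaches the same code path.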

Answer 2

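Technically it is not illegal, but it is a sure way to make your code hard to maintain (with respect to readability and predictability). In Python 2, the safest choice is the "unicode sandwich" pattern: decode all text input (files, I/O, HTTP requests and responses, sys args, user input, etc.) to unicode as early as possible, have all of your program's code work only on unicode strings, and encode back to byte strings (with the desired encoding) only right before output.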
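
A minimal sketch of the sandwich, assuming the input arrives as UTF-8 bytes. The function name normalize_url, the encoding default and the sample URL are hypothetical and chosen only for illustration; the code runs on both Python 2 and Python 3.

```python
def normalize_url(raw_bytes, encoding="utf-8"):
    # Bottom slice: decode bytes to unicode as soon as they enter the program.
    text = raw_bytes.decode(encoding)

    # Filling: every operation in the middle uses unicode only,
    # including the literals (note the u"" prefixes).
    if text.startswith(u"//"):
        text = u"https:" + text  # protocol-relative URL gets a scheme

    # Top slice: encode back to bytes only right before output.
    return text.encode(encoding)


result = normalize_url(b"//sample.com/caf\xc3\xa9")
```

Because the decode and encode steps sit at the edges, the code in the middle never has to ask which of the two string types it is holding.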

How to operate on different strings of different types in Python 2?
