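For example, an object contains this_sub_string:
>>> repr(this_sub_string)
u'1 \u03bcM rosiglitazone / 0.1% DMSO for 1 hour'
in its description field. But when I search for such objects with objects.all().filter(Q(description__contains=this_sub_string)), no results are returned.
When I remove the part of the string up to and including \u03bcM and search for the rest, I get the correct results: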
>>> models.Samples.objects.filter(description__contains=u' rosiglitazone / 0.1% DMSO for 1 hour')
[<Samples: 40171_GSM1199141_CBP CHIP-SEQ, 3T3-L1 DAY7 1H ROSI_Mus musculus>, <Samples: 40172_GSM1199139_RNAPII CHIP-SEQ, 3T3-L1 DAY7 1H ROSI REP2_Mus musculus>, <Samples: [Bad Unicode data]>, <Samples: 40176_GSM\
1199143_INPUT, 3T3-L1 DAY7 1H ROSI_Mus musculus>, <Samples: [Bad Unicode data]>, <Samples: 40180_GSM1199133_MED1 CHIP-SEQ, 3T3-L1 DAY7 1H ROSI_Mus musculus>, <Samples: [Bad Unicode data]>, <Samples: 40185_GSM119\
9137_RNAPII CHIP-SEQ, 3T3-L1 DAY7 1H ROSI REP1_Mus musculus>]
>>> models.Samples.objects.filter(description__contains=u' rosiglitazone / 0.1% DMSO for 1 hour')[0].description
u'{"source name": "3T3-L1 adipocytes (Day 7)", "treatment": "1 \\u03bcM rosiglitazone / 0.1% DMSO for 1 hour", "cell type": "3T3-L1 adipocytes", "chip antibody": "anti-CBP (sc-369; Santa Cruz)"}'
Does anyone have any ideas about this? Thanks.
Answer 0 (score: 1)
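This part makes me think that you are not storing the string itself, but a representation of it, such as a JSON dump of a dict that contains this string: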
>>> models.Samples.objects.filter(description__contains=u' rosiglitazone / 0.1% DMSO for 1 hour')[0].description
u'{"source name": "3T3-L1 adipocytes (Day 7)", "treatment": "1 \\u03bcM rosiglitazone / 0.1% DMSO for 1 hour", "cell type": "3T3-L1 adipocytes", "chip antibody": "anti-CBP (sc-369; Santa Cruz)"}'
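In that case, the Unicode character sits in the database as the literal escape sequence \u03bc (a backslash followed by u03bc), not as the actual UTF-8 encoded character, so a search for the real character finds nothing.
To search with this string, we should process it the same way it was processed when stored (JSON dump or repr) before passing it to the query.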
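The mismatch can be reproduced without a database. The following is a minimal sketch, assuming the description column was filled with json.dumps of a dict (the question does not confirm how the data was written); the names needle and stored are illustrative only.

```python
import json

# The substring being searched for; it contains a real Greek mu (U+03BC).
needle = u'1 \u03bcM rosiglitazone / 0.1% DMSO for 1 hour'

# Assumption: the stored text came from json.dumps, which by default
# (ensure_ascii=True) writes non-ASCII characters as \uXXXX escapes.
stored = json.dumps({"treatment": needle})

# The raw needle is not a substring of the stored text ...
raw_match = needle in stored

# ... but its JSON-escaped form is. [1:-1] drops the quotes json.dumps adds.
escaped_needle = json.dumps(needle)[1:-1]
escaped_match = escaped_needle in stored

print(raw_match, escaped_match)
```

A `contains` lookup compares against the text as stored, so the same reasoning applies to the Django query in the question.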
Answer 1 (score: 0)
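I found a hack that solves this problem:
>>> from _json import encode_basestring_ascii as c_encode_basestring_ascii
>>> a = u'1 \u03bcM rosiglitazone / 0.1% DMSO for 1 hour'  # the substring to search for
>>> c_encode_basestring_ascii(a)
'"1 \\u03bcM rosiglitazone / 0.1% DMSO for 1 hour"'
Using this encoded string as the search term, the query returns the correct results.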
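The same encoded string can probably be obtained from the public json module instead of the private _json extension. This is a sketch rather than a drop-in fix: json_search_term and its keep_quotes parameter are hypothetical names, and it assumes the data was stored with the default ensure_ascii=True.

```python
import json

def json_search_term(text, keep_quotes=True):
    """Escape text the way json.dumps stores it with ensure_ascii=True."""
    encoded = json.dumps(text)
    # json.dumps wraps the string in double quotes; optionally strip them
    # so the term can also match in the middle of a longer JSON value.
    return encoded if keep_quotes else encoded[1:-1]

a = u'1 \u03bcM rosiglitazone / 0.1% DMSO for 1 hour'
term = json_search_term(a)

# With the model from the question, the lookup would then be:
# models.Samples.objects.filter(description__contains=term)
```

Keeping the quotes only matches when the substring is the complete JSON value; passing keep_quotes=False gives a term that also matches partial values.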