So far I am doing something like this:
def is_utf8(s):
    try:
        x=bytes(s,'utf-8').decode('utf-8', 'strict')
        print(x)
        return 1
    except:
        return 0
The only problem is that I don't want it to print anything; I want to delete the print(x) line.
But when I do that, the function stops working correctly.
For example, if I call print(is_utf8("H�tst")),
it returns 0 while the print is in the function; otherwise it prints 1. Am I approaching the problem the wrong way?
Answer 0 (score: 2)
You could use the chardet module to detect an unknown encoding. For example, if a
is a byte array, then you could determine its encoding like this:
import chardet
b = chardet.detect(a)
print(b["encoding"])
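If all you need is a yes/no answer about UTF-8 rather than a guess at the encoding, a strict decode of the raw bytes may be enough. The following is only a minimal sketch: the name is_valid_utf8 is hypothetical, and it assumes the input is a bytes object as it arrived from a file or socket. A Python 3 str that has just been encoded to UTF-8 will normally decode again without error, so running the check on a str tells you very little.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the raw bytes decode cleanly as UTF-8 (hypothetical helper)."""
    try:
        # 'strict' makes decode() raise on any malformed byte sequence
        data.decode("utf-8", "strict")
        return True
    except UnicodeDecodeError:
        # Catch only the decode failure, so unrelated errors are not hidden
        return False


# b"H\xc3\xa4tst" is a UTF-8 encoded string; b"H\xe4tst" is the same text in Latin-1
print(is_valid_utf8(b"H\xc3\xa4tst"))
print(is_valid_utf8(b"H\xe4tst"))
```

Catching UnicodeDecodeError specifically, instead of using a bare except, keeps other exceptions (for example one raised by a print call inside the try block) from being reported as "not UTF-8".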