I'm trying to figure out how to print a random Unicode character in Python 3 using the \uXXXX format, where each X is a character in [0-F]. Here is what I have so far:
import random
chars = '0123456789ABCDEF'
L = len(chars)
fourRandInts = [random.randint(0,L-1) for i in range(4)]
fourRandChars = [chars[i] for i in fourRandInts]
s = r'\u{}{}{}{}'.format(*fourRandChars)
string = "print(u'{}')".format(s)
exec(string)
This seems to work, but I'd rather avoid using exec. Is there a more Pythonic way to do this?
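Edit: Judging by the title, this question looks identical to #1477294 "Generate random UTF-8 string in Python", but that question was reworded in an edit, so the answers there generally don't answer its original question, and they don't answer this one either.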
Answer 0: (score: 2)
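# print random unicode character from the Basic Multilingual Plane (BMP)
import random
# randint() includes both endpoints, so the upper bound is 0xFFFF (65535)
# note: 0xD800-0xDFFF are surrogates and may fail to print
print(chr(random.randint(0, 0xFFFF)))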
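From the Python 3 chr() documentation:
chr(i)
Return the string representing a character whose Unicode code point is the integer i. For example, chr(97) returns the string 'a', while chr(8364) returns the string '€'. This is the inverse of ord().
The valid range for the argument is from 0 through 1,114,111 (0x10FFFF in base 16). ValueError will be raised if i is outside that range.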
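As a quick illustration of the documented range, here is a minimal sketch. The helper name safe_chr is hypothetical (it is not part of the standard library); it only wraps chr() to show where ValueError appears.

```python
# chr() and ord() are inverses over the valid code point range.
assert ord(chr(97)) == 97
assert ord(chr(0x10FFFF)) == 0x10FFFF


def safe_chr(i):
    """Hypothetical helper: return chr(i), or None if i is out of range."""
    try:
        return chr(i)
    except ValueError:
        # Raised for negative values and for anything above 0x10FFFF.
        return None


print(safe_chr(0x10FFFF) is not None)  # highest valid code point
print(safe_chr(0x110000))              # one past the end
```

This is why an upper bound passed to random.randint() has to be 0x10FFFF rather than 0x110000: randint() can return its upper endpoint.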
# print unicode character using select hex chars
import random
chars = '0123456789ABCDEF'
# create random 4 character string from the characters in chars
hexvalue = ''.join(random.choice(chars) for _ in range(4))
# convert string representation of hex value to int,
# then convert to unicode character for printing
print(chr(int(hexvalue, 16)))
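If you specifically want to keep the literal \uXXXX text from the question and still avoid exec, one option is to decode the escape sequence with the built-in unicode_escape codec. This is only a sketch; the function names random_escape and decode_escape are made up for illustration.

```python
import random

HEX_DIGITS = "0123456789ABCDEF"


def random_escape(rng=random):
    """Build a literal backslash-u escape such as '\\u20AC' (6 characters)."""
    digits = "".join(rng.choice(HEX_DIGITS) for _ in range(4))
    return "\\u" + digits


def decode_escape(escape):
    """Turn the 6-character escape text into the single character it names."""
    # The escape text is pure ASCII, so encoding it to bytes is safe;
    # the unicode_escape codec then interprets the backslash sequence.
    return escape.encode("ascii").decode("unicode_escape")


print(repr(decode_escape("\\u0041")))
```

For this use case chr(int(hexvalue, 16)) is simpler and gives the same character; the codec route mainly helps when the escape text arrives from somewhere else.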
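This function uses the str.isprintable() method to return only printable characters. This is useful if you want to generate a string of characters. It also includes an option for a character range.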
import random

def randomPrintableUnicode(charRange=None):
    if charRange is None:
        # randint() includes both endpoints, and chr() accepts at most 0x10FFFF
        charRange = (0, 0x10FFFF)
    while True:
        i = random.randint(*charRange)
        c = chr(i)
        if c.isprintable():
            return c
        # should add another conditional break
        # to avoid infinite loop

# Print random unicode character
print(randomPrintableUnicode())
# Print random unicode character from the BMP
print(randomPrintableUnicode(charRange=(0, 0xFFFF)))
# Print random string of 20 characters
# from the Cyrillic alphabet
cyrillicRange = (0x0410, 0x0450)
print(''.join(randomPrintableUnicode(charRange=cyrillicRange) for _ in range(20)))
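The function above loops forever if the requested range contains no printable characters (for example, a range of control characters). One way to add the missing exit condition is to cap the number of attempts. This is a sketch only; the name random_printable_char and the parameters max_attempts and rng are assumptions, not part of the original function.

```python
import random


def random_printable_char(lo=0, hi=0x10FFFF, max_attempts=1000, rng=random):
    """Return a random printable character with lo <= code point <= hi.

    Raises ValueError if no printable character turns up within
    max_attempts draws, instead of looping forever.
    """
    for _ in range(max_attempts):
        c = chr(rng.randint(lo, hi))  # randint() includes both endpoints
        if c.isprintable():
            return c
    raise ValueError(
        "no printable character found in range "
        "{:#06x}-{:#06x} after {} attempts".format(lo, hi, max_attempts)
    )


# 20 random Cyrillic letters, joined into one string
word = "".join(random_printable_char(0x0410, 0x044F) for _ in range(20))
print(len(word))
```

A fixed cap is a blunt choice: it can give up on a sparse range that does contain a few printable characters. Raising max_attempts, or precomputing the printable code points for a small range, avoids that.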
Answer 1: (score: -1)
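You can make a forever loop that generates random unicode characters along with their ID and number. Also, it never crashes. (Unless you do something crazy.) Remove "while True:" to stop the forever loop, and remove "sleep(1)" to get rid of the wait time.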
from random import randint
from time import sleep

while True:
    try:
        sleep(1)
        a = randint(1, 65663)
        print('Character:')
        print(chr(a))
        print('ID:' + str(hex(a)))
        print('Number:' + str(a) + '\n\n\n\n\n\n\n\n\n\n\n\n\n')
    except UnicodeEncodeError:
        print('Character is not possible to print. Moving on.')
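For context on when that UnicodeEncodeError shows up: on a UTF-8 terminal it comes from lone surrogates (code points 0xD800 through 0xDFFF), which chr() happily returns but which cannot be encoded. With other output encodings many more characters can fail. The sketch below checks this without printing anything unencodable; is_encodable is a hypothetical helper name.

```python
def is_encodable(c, encoding="utf-8"):
    """Hypothetical helper: True if c can be encoded with the given codec."""
    try:
        c.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(is_encodable("A"))           # plain ASCII letter
print(is_encodable(chr(0xD800)))   # lone surrogate
print(is_encodable(chr(0x20AC), "ascii"))  # euro sign on an ASCII-only output
```

Checking against sys.stdout.encoding before printing would be one way to skip such characters up front instead of catching the exception afterwards.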