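I need to find all palindromes that occur in a given text. I will be extracting the data from an external file. I need to handle the data in a memory-efficient way, so I use a memoryview object. However, I need to perform some string operations on the memoryview object, so I used the tobytes() method. Is this the correct way to handle these objects without copying the data?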
from collections import Counter


def is_palindrome(word):
    # Helper not shown in the original post; assumed to be a plain reversal check
    return word == word[::-1]


palindrome = []
# Read the file as binary data
with open('some_text.txt', 'rb') as fr:
    # Create a memoryview object over the bytes that were read
    data = memoryview(fr.read())
    # Apply the tobytes() method to get an object with string-style methods
    text = data.tobytes()
    # Split the text into words
    for word in text.split():
        # Append to the palindrome list if the check passes
        if is_palindrome(word):
            palindrome.append(word)
# Build a Counter with the palindromes and their number of occurrences
palindrome = Counter(palindrome)
print(palindrome)
Answer 0 (score: 1)
You can just use the bytes object that f.read() already returns. Compare the two objects:

with open('some_text.txt', 'rb') as f:
    b = f.read()
    print(b.__class__, id(b), len(b))
    data = memoryview(b)
    text = data.tobytes()
    print(text.__class__, id(text), len(text))

Possible output:

<class 'bytes'> 47642448 173227
<class 'bytes'> 47815728 173227

In CPython, id() returns the address of the object in memory. The two addresses differ, so data.tobytes() returns a copy in this case.

Consider using text mode instead:

with open('some_text.txt', 'r') as f:
    text = f.read()  # a str, so split() and slicing work directly
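If memory is the main concern, one option is to avoid loading the whole file at all and iterate over it line by line in text mode. The following is only a minimal sketch under a few assumptions: words are separated by whitespace, a palindrome is any word equal to its own reverse, and the names count_palindromes and is_palindrome are hypothetical helpers that do not appear in the original post.

```python
from collections import Counter


def is_palindrome(word):
    # Assumed definition: the word reads the same forwards and backwards
    return word == word[::-1]


def count_palindromes(lines):
    # `lines` can be any iterable of strings, including an open text file,
    # which yields one line at a time instead of the whole content
    counts = Counter()
    for line in lines:
        for word in line.split():
            if is_palindrome(word):
                counts[word] += 1
    return counts


# In-memory sample standing in for the lines of a file
sample = ["level up at noon", "noon came"]
result = count_palindromes(sample)
print(result)
```

With a real file, the same function could be called as count_palindromes(f) inside a with open('some_text.txt', 'r') as f: block, so only one line is held in memory at a time in addition to the Counter.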