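I'm trying to answer a question about streaming audio from an HTTP server and then playing it with PyGame. I have the code mostly complete, but I hit an error where the PyGame music functions try to seek() on the urllib.HTTPResponse object.
According to the urllib docs, the urllib.HTTPResponse object (since v3.5) is an io.BufferedIOBase. I expected this would make the stream seek()able, but it does not.
Is there a way to wrap the io.BufferedIOBase such that it is smart enough to buffer enough data to handle the seek operation?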
import pygame
import urllib.request
import io

# Window size
WINDOW_WIDTH = 400
WINDOW_HEIGHT = 400
# background colour
SKY_BLUE = (161, 255, 254)

### Begin the streaming of a file
### Return the urlib.HTTPResponse, a file-like-object
def openURL( url ):
    result = None

    try:
        http_response = urllib.request.urlopen( url )
        print( "streamHTTP() - Fetching URL [%s]" % ( http_response.geturl() ) )
        print( "streamHTTP() - Response Status [%d] / [%s]" % ( http_response.status, http_response.reason ) )
        result = http_response
    except:
        print( "streamHTTP() - Error Fetching URL [%s]" % ( url ) )

    return result

### MAIN
pygame.init()
window = pygame.display.set_mode( ( WINDOW_WIDTH, WINDOW_HEIGHT ) )
pygame.display.set_caption("Music Streamer")

clock = pygame.time.Clock()
done = False
while not done:

    # Handle user-input
    for event in pygame.event.get():
        if ( event.type == pygame.QUIT ):
            done = True

    # Keys
    keys = pygame.key.get_pressed()
    if ( keys[pygame.K_UP] ):
        if ( pygame.mixer.music.get_busy() ):
            print("busy")
        else:
            print("play")
            remote_music = openURL( 'http://127.0.0.1/example.wav' )
            if ( remote_music != None and remote_music.status == 200 ):
                pygame.mixer.music.load( io.BufferedReader( remote_music ) )
                pygame.mixer.music.play()

    # Re-draw the screen
    window.fill( SKY_BLUE )

    # Update the window, but not more than 60fps
    pygame.display.flip()
    clock.tick_busy_loop( 60 )

pygame.quit()
When this code runs and Up is pressed, it fails with the following error:
streamHTTP() - Fetching URL [http://127.0.0.1/example.wav]
streamHTTP() - Response Status [200] / [OK]
io.UnsupportedOperation: seek
io.UnsupportedOperation: File or stream is not seekable.
io.UnsupportedOperation: seek
io.UnsupportedOperation: File or stream is not seekable.
Traceback (most recent call last):
  File "./sound_stream.py", line 57, in <module>
    pygame.mixer.music.load( io.BufferedReader( remote_music ) )
pygame.error: Unknown WAVE format
I have also tried re-opening the io streams, and various other re-implementations of the same sort of thing.
Answer 0 (score: 5)
"According to the urllib docs, the urllib.HTTPResponse object (since v3.5) is an io.BufferedIOBase. I expected this would make the stream seek()able, but it does not."
That's correct. The io.BufferedIOBase interface doesn't guarantee that the I/O object is seekable. For HTTPResponse objects, IOBase.seekable() returns False:
>>> import urllib.request
>>> response = urllib.request.urlopen("http://httpbin.org/get")
>>> response
<http.client.HTTPResponse object at 0x110870ca0>
>>> response.seekable()
False
That's because the BufferedIOBase implementation offered by HTTPResponse wraps a socket object, and sockets are not seekable either.
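You can't wrap a BufferedIOBase object in a BufferedReader object and add seeking support. The Buffered* wrapper objects can only wrap RawIOBase types, and they rely on the wrapped object to provide seeking support. You would have to emulate seeking at the raw I/O level, see below.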
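A minimal sketch of that limitation, assuming a hypothetical NonSeekableRaw class that stands in for the socket-backed stream (it is not part of urllib or PyGame). Wrapping it in io.BufferedReader adds buffering but not seeking, so seek() raises the same kind of io.UnsupportedOperation that appears in the question:

```python
import io


class NonSeekableRaw(io.RawIOBase):
    """Hypothetical stand-in for a socket-like raw stream: readable, not seekable."""

    def __init__(self, data: bytes) -> None:
        self._source = io.BytesIO(data)

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return False

    def readinto(self, buffer) -> int:
        # Copy as many bytes as fit into the caller's buffer
        chunk = self._source.read(len(buffer))
        buffer[: len(chunk)] = chunk
        return len(chunk)


reader = io.BufferedReader(NonSeekableRaw(b"RIFF....WAVEfmt "))
first_four = reader.read(4)   # buffered reads work fine
can_seek = reader.seekable()  # the wrapper asks the raw object

try:
    reader.seek(0)
    seek_failed = False
except io.UnsupportedOperation:
    seek_failed = True

print(first_four, can_seek, seek_failed)
```

The wrapper only caches reads; whether seeking is possible is decided entirely by the raw object underneath.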
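You can still provide the same functionality at a higher level, but take into account that seeking on remote data is a lot more involved; it is not a simple matter of changing an OS variable that represents a file position on disk. For larger remote files, seeking without backing the whole file on disk locally could be as sophisticated as using HTTP range requests and local (in-memory or on-disk) buffers to balance sound playback performance against local data storage. Doing this correctly for a wide range of use cases can be a lot of effort, so it is certainly not part of the Python standard library.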
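For reference, an HTTP range request is an ordinary request with a Range header. The sketch below only builds such a request with urllib and does not send it; build_range_request is a hypothetical helper name, and the URL is the placeholder from the question. A server that supports ranges would normally answer with status 206 and a Content-Range header.

```python
from typing import Optional
from urllib.request import Request


def build_range_request(url: str, start: int, end: Optional[int] = None) -> Request:
    """Hypothetical helper: build (but do not send) a request for a byte range."""
    last = "" if end is None else str(end)
    # HTTP byte ranges are inclusive: "bytes=0-1023" asks for the first 1024 bytes,
    # and "bytes=100-" asks for everything from offset 100 to the end.
    return Request(url, headers={"Range": f"bytes={start}-{last}"})


request = build_range_request("http://127.0.0.1/example.wav", 44, 1067)
range_header = request.get_header("Range")
print(range_header)
```

Emulating seek() on a remote file then comes down to issuing a new request like this one whenever the requested position is outside what has been buffered locally.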
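If your HTTP-sourced sound files are small enough (a few MB at most), just read the whole response into an in-memory io.BytesIO() file object. I really don't think it is worth making this more complicated than that, because the moment you have enough data to make that worth pursuing, your files are large enough to take up too much memory!
So this would be more than enough if your sound files are small (no more than a few MB):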
from io import BytesIO
import urllib.error
import urllib.request

def open_url(url):
    try:
        http_response = urllib.request.urlopen(url)
        print(f"streamHTTP() - Fetching URL [{http_response.geturl()}]")
        print(f"streamHTTP() - Response Status [{http_response.status}] / [{http_response.reason}]")
    except urllib.error.URLError:
        print(f"streamHTTP() - Error Fetching URL [{url}]")
        return

    if http_response.status != 200:
        print(f"streamHTTP() - Error Fetching URL [{url}]")
        return

    # copy the whole response body into a seekable in-memory file object
    return BytesIO(http_response.read())
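This doesn't require writing a wrapper object, and because BytesIO is a native implementation, once the data is fully copied, access to it is faster than any Python-code wrapper could ever give you.
Note that this returns a BytesIO file object, so you no longer need to test for the response status: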
remote_music = open_url('http://127.0.0.1/example.wav')
if remote_music is not None:
    pygame.mixer.music.load(remote_music)
    pygame.mixer.music.play()
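As a quick check of why the in-memory copy avoids the original error, the sketch below exercises the kind of reads and seeks a decoder might perform while probing a file. The payload is placeholder bytes standing in for http_response.read(), not a valid WAVE file, and the exact access pattern of the SDL mixer is an assumption here.

```python
from io import BytesIO, SEEK_END

# Placeholder bytes standing in for http_response.read(); not a valid WAVE file
payload = b"RIFF" + bytes(4) + b"WAVE" + bytes(20)
buffer = BytesIO(payload)

is_seekable = buffer.seekable()
magic = buffer.read(4)           # sniff the header
size = buffer.seek(0, SEEK_END)  # jump to the end to learn the total size
buffer.seek(0)                   # rewind before decoding
rewound = buffer.read(4)

print(is_seekable, magic, size, rewound)
```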
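Once you go beyond a few megabytes, you could try pre-loading the data into a local file object. You can make this more sophisticated by using a thread to have shutil.copyfileobj() copy most of the data into that file in the background, and give the file to PyGame after loading just an initial amount of data.
By using an actual file object, you can actually help performance here, as PyGame will try to minimise interjecting itself between the SDL mixer and the file data. If there is an actual file on disk with a file number (the OS-level identifier for a stream, something that the SDL mixer library can make use of), then PyGame will operate directly on that and so minimise blocking the GIL (which in turn will help the Python portions of your game perform better!). And if you pass in a filename (just a string), then PyGame gets out of the way entirely and leaves all file operations to the SDL library.
Here's such an implementation; on normal Python interpreter exit it should clean up the downloaded files automatically. It returns a filename for PyGame to work on, and the rest of the download is completed in a thread after the initial few KB have been buffered. It avoids loading the same URL more than once, and I've made it thread-safe: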
import shutil
import urllib.error
import urllib.request
from tempfile import NamedTemporaryFile
from threading import Lock, Thread

INITIAL_BUFFER = 1024 * 8  # 8kb initial file read to start URL-backed files
_url_files_lock = Lock()
# stores open NamedTemporaryFile objects, keeping them 'alive'
# removing entries from here causes the file data to be deleted.
_url_files = {}


def open_url(url):
    with _url_files_lock:
        if url in _url_files:
            return _url_files[url].name

    try:
        http_response = urllib.request.urlopen(url)
        print(f"streamHTTP() - Fetching URL [{http_response.geturl()}]")
        print(f"streamHTTP() - Response Status [{http_response.status}] / [{http_response.reason}]")
    except urllib.error.URLError:
        print(f"streamHTTP() - Error Fetching URL [{url}]")
        return

    if http_response.status != 200:
        print(f"streamHTTP() - Error Fetching URL [{url}]")
        return

    fileobj = NamedTemporaryFile()

    content_length = http_response.getheader("Content-Length")
    if content_length is not None:
        try:
            content_length = int(content_length)
        except ValueError:
            content_length = None
        if content_length:
            # create sparse file of full length
            fileobj.seek(content_length - 1)
            fileobj.write(b"\0")
            fileobj.seek(0)

    fileobj.write(http_response.read(INITIAL_BUFFER))
    with _url_files_lock:
        if url in _url_files:
            # another thread raced us to this point, we lost, return their
            # result after cleaning up here
            fileobj.close()
            http_response.close()
            return _url_files[url].name

        # store the file object for this URL; this keeps the file
        # open and so readable if you have the filename.
        _url_files[url] = fileobj

    def copy_response_remainder():
        # copies file data from response to disk, for all data past INITIAL_BUFFER
        with http_response:
            shutil.copyfileobj(http_response, fileobj)

    t = Thread(daemon=True, target=copy_response_remainder)
    t.start()

    return fileobj.name
As with the BytesIO() solution, the above returns either None or a value ready for passing to pygame.mixer.music.load().
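The above will probably not work if you try to immediately set an advanced playing position in the sound file, as later data may not yet have been copied into the file. It's a trade-off.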
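If you need full seeking support on remote URLs, don't want to use on-disk space for them, and don't want to have to worry about their size, you don't need to re-invent the HTTP-as-seekable-file wheel. You could use an existing project that offers the same functionality. I found two that offer io.BufferedIOBase-based implementations: smart_open and httpio.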
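Both use HTTP Range requests to implement seeking support. Just use httpio.open(URL) or smart_open.open(URL) and pass the result directly to pygame.mixer.music.load(); if the URL can't be opened, you can catch that by handling the IOError exception: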
from smart_open import open as url_open  # or from httpio import open

try:
    remote_music = url_open('http://127.0.0.1/example.wav')
except IOError:
    pass
else:
    pygame.mixer.music.load(remote_music)
    pygame.mixer.music.play()
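smart_open uses an in-memory buffer to satisfy reads of a fixed size, but creates a new HTTP Range request for every call to seek that changes the current file position, so performance may vary. Since the SDL mixer executes a few seeks on audio files to determine their type, I expect this to be a little slower.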
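httpio can buffer blocks of data and so might handle seeks better, but from a brief glance at the source code, when you actually set a buffer size the cached blocks are never evicted from memory again, so you'd eventually end up with the whole file in memory.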
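And finally, because I couldn't find an efficient HTTP-Range-backed I/O implementation, I wrote my own. The following implements the io.RawIOBase interface, specifically so you can wrap the object in an io.BufferedReader() and so delegate caching to a buffer that will be managed correctly when seeking: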
import io
from copy import deepcopy
from functools import wraps
from typing import cast, overload, Callable, Optional, Tuple, TypeVar, Union
from urllib.request import urlopen, Request

T = TypeVar("T")


@overload
def _check_closed(_f: T) -> T: ...
@overload
def _check_closed(*, connect: bool, default: Union[bytes, int]) -> Callable[[T], T]: ...


def _check_closed(
    _f: Optional[T] = None,
    *,
    connect: bool = False,
    default: Optional[Union[bytes, int]] = None,
) -> Union[T, Callable[[T], T]]:
    def decorator(f: T) -> T:
        @wraps(cast(Callable, f))
        def wrapper(self, *args, **kwargs):
            if self.closed:
                raise ValueError("I/O operation on closed file.")
            # parenthesised so that methods with connect=False never touch
            # self._fp.closed while self._fp is None
            if connect and (self._fp is None or self._fp.closed):
                self._connect()
                if self._fp is None:
                    # outside the seekable range, exit early
                    return default
            try:
                return f(self, *args, **kwargs)
            except Exception:
                self.close()
                raise
            finally:
                if self._range_end and self._pos >= self._range_end:
                    self._fp.close()
                    del self._fp

        return cast(T, wrapper)

    if _f is not None:
        return decorator(_f)

    return decorator


def _parse_content_range(
    content_range: str
) -> Tuple[Optional[int], Optional[int], Optional[int]]:
    """Parse a Content-Range header into a (start, end, length) tuple"""
    units, *range_spec = content_range.split(None, 1)
    if units != "bytes" or not range_spec:
        return (None, None, None)
    start_end, _, size = range_spec[0].partition("/")
    try:
        length: Optional[int] = int(size)
    except ValueError:
        length = None
    start_val, has_start_end, end_val = start_end.partition("-")
    start = end = None
    if has_start_end:
        try:
            start, end = int(start_val), int(end_val)
        except ValueError:
            pass
    return (start, end, length)


class HTTPRawIO(io.RawIOBase):
    """Wrap a HTTP socket to handle seeking via HTTP Range"""

    url: str
    closed: bool = False
    _pos: int = 0
    _size: Optional[int] = None
    _range_end: Optional[int] = None
    _fp: Optional[io.RawIOBase] = None

    def __init__(self, url_or_request: Union[Request, str]) -> None:
        if isinstance(url_or_request, str):
            self._request = Request(url_or_request)
        else:
            # copy request objects to avoid sharing state
            self._request = deepcopy(url_or_request)
        self.url = self._request.full_url
        self._connect(initial=True)

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return True

    def close(self) -> None:
        if self.closed:
            return
        if self._fp:
            self._fp.close()
            del self._fp
        self.closed = True

    @_check_closed
    def tell(self) -> int:
        return self._pos

    def _connect(self, initial: bool = False) -> None:
        if self._fp is not None:
            self._fp.close()
        if self._size is not None and self._pos >= self._size:
            # can't read past the end
            return
        request = self._request
        request.add_unredirected_header("Range", f"bytes={self._pos}-")
        response = urlopen(request)

        self.url = response.geturl()  # could have been redirected
        if response.status not in (200, 206):
            raise OSError(
                f"Failed to open {self.url}: "
                f"{response.status} ({response.reason})"
            )

        if initial:
            # verify that the server supports range requests. Capture the
            # content length if available
            if response.getheader("Accept-Ranges") != "bytes":
                raise OSError(
                    f"Resource doesn't support range requests: {self.url}"
                )
            try:
                length = int(response.getheader("Content-Length", ""))
                if length >= 0:
                    self._size = length
            except ValueError:
                pass

        # validate the range we are being served
        start, end, length = _parse_content_range(
            response.getheader("Content-Range", "")
        )
        if self._size is None:
            self._size = length
        if (start is not None and start != self._pos) or (
            length is not None and length != self._size
        ):
            # non-sensical range response
            raise OSError(
                f"Resource at {self.url} served invalid range: pos is "
                f"{self._pos}, range {start}-{end}/{length}"
            )
        if self._size and end is not None and end + 1 < self._size:
            # incomplete range, not reaching all the way to the end
            self._range_end = end
        else:
            self._range_end = None

        fp = cast(io.BufferedIOBase, response.fp)  # typeshed doesn't name fp
        self._fp = fp.detach()  # assume responsibility for the raw socket IO

    @_check_closed
    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        relative_to = {
            io.SEEK_SET: 0,
            io.SEEK_CUR: self._pos,
            io.SEEK_END: self._size,
        }.get(whence)
        if relative_to is None:
            if whence == io.SEEK_END:
                raise IOError(
                    f"Can't seek from end on unsized resource {self.url}"
                )
            raise ValueError(f"whence value {whence} unsupported")
        if -offset > relative_to:  # can't seek to a point before the start
            raise OSError(22, "Invalid argument")

        self._pos = relative_to + offset
        # there is no point in optimising an existing connection
        # by reading from it if seeking forward below some threshold.
        # Use a BufferedReader to avoid seeking by small amounts or by 0
        if self._fp:
            self._fp.close()
            del self._fp
        return self._pos

    # all read* methods delegate to the SocketIO object (itself a RawIO
    # implementation).

    @_check_closed(connect=True, default=b"")
    def read(self, size: int = -1) -> Optional[bytes]:
        assert self._fp is not None  # show type checkers we already checked
        res = self._fp.read(size)
        if res is not None:
            self._pos += len(res)
        return res

    @_check_closed(connect=True, default=b"")
    def readall(self) -> bytes:
        assert self._fp is not None  # show type checkers we already checked
        res = self._fp.readall()
        self._pos += len(res)
        return res

    @_check_closed(connect=True, default=0)
    def readinto(self, buffer: bytearray) -> Optional[int]:
        assert self._fp is not None  # show type checkers we already checked
        n = self._fp.readinto(buffer)
        self._pos += n or 0
        return n
Remember that this is a RawIOBase object, which you'd really want to wrap in a BufferedReader(). Doing so in open_url() looks like this:
def open_url(url, *args, **kwargs):
    return io.BufferedReader(HTTPRawIO(url), *args, **kwargs)
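This gives you fully buffered I/O with full seeking support over a remote URL, and the BufferedReader implementation will minimise resetting the HTTP connection when seeking. I've found that when using this with the PyGame mixer, only a single HTTP connection is made, as all the test seeks fall within the default 8KB buffer.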
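A rough way to observe the buffering effect without a network connection, assuming a hypothetical CountingRaw class: an in-memory stand-in for HTTPRawIO in which every raw seek() would correspond to a reset connection. On CPython, seeks that land inside data already held by the BufferedReader are usually served from the buffer, so the printed counter tends to stay low; the exact count is an implementation detail and is not guaranteed.

```python
import io


class CountingRaw(io.RawIOBase):
    """Hypothetical in-memory raw stream that counts seek() calls."""

    def __init__(self, data: bytes) -> None:
        self._source = io.BytesIO(data)
        self.seek_calls = 0

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return True

    def readinto(self, buffer) -> int:
        return self._source.readinto(buffer)

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        self.seek_calls += 1  # for HTTPRawIO this would mean a new connection
        return self._source.seek(offset, whence)

    def tell(self) -> int:
        return self._source.tell()


data = bytes(range(64))
raw = CountingRaw(data)
reader = io.BufferedReader(raw)  # default buffer is far larger than 64 bytes

head = reader.read(4)        # first read fills the buffer from the raw stream
reader.seek(0)               # rewind, as a format probe might
head_again = reader.read(4)
reader.seek(16)              # jump to another offset inside the buffered data
middle = reader.read(4)

print("raw seek() calls:", raw.seek_calls)
```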
Answer 1 (score: 4)
If you are fine with using the requests module (which supports streaming) instead of urllib, you could use a wrapper like this:
from io import BytesIO, SEEK_SET, SEEK_END

class ResponseStream(object):
    def __init__(self, request_iterator):
        self._bytes = BytesIO()
        self._iterator = request_iterator

    def _load_all(self):
        self._bytes.seek(0, SEEK_END)
        for chunk in self._iterator:
            self._bytes.write(chunk)

    def _load_until(self, goal_position):
        current_position = self._bytes.seek(0, SEEK_END)
        while current_position < goal_position:
            try:
                # write() returns the number of bytes written, so accumulate it
                current_position += self._bytes.write(next(self._iterator))
            except StopIteration:
                break

    def tell(self):
        return self._bytes.tell()

    def read(self, size=None):
        left_off_at = self._bytes.tell()
        if size is None:
            self._load_all()
        else:
            goal_position = left_off_at + size
            self._load_until(goal_position)

        self._bytes.seek(left_off_at)
        return self._bytes.read(size)

    def seek(self, position, whence=SEEK_SET):
        if whence == SEEK_END:
            # the end is only known once everything has been downloaded
            self._load_all()
        return self._bytes.seek(position, whence)
Then I guess you can do something like this:
import pygame
import requests
from threading import Thread

WINDOW_WIDTH = 400
WINDOW_HEIGHT = 400
SKY_BLUE = (161, 255, 254)
URL = 'http://localhost:8000/example.wav'

pygame.init()
window = pygame.display.set_mode( ( WINDOW_WIDTH, WINDOW_HEIGHT ) )
pygame.display.set_caption("Music Streamer")
clock = pygame.time.Clock()
done = False
font = pygame.font.SysFont(None, 32)
state = 0

def play_music():
    global state  # needed so a failed request resets the module-level state
    response = requests.get(URL, stream=True)

    if (response.status_code == 200):
        stream = ResponseStream(response.iter_content(64))
        pygame.mixer.music.load(stream)
        pygame.mixer.music.play()
    else:
        state = 0

while not done:

    for event in pygame.event.get():
        if ( event.type == pygame.QUIT ):
            done = True

        if event.type == pygame.KEYDOWN and state == 0:
            Thread(target=play_music).start()
            state = 1

    window.fill( SKY_BLUE )
    window.blit(font.render(str(pygame.time.get_ticks()), True, (0,0,0)), (32, 32))
    pygame.display.flip()
    clock.tick_busy_loop( 60 )

pygame.quit()
This uses a Thread to start streaming.
I'm not sure this works 100%, but give it a try.