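I need help with the file_get_contents() function.
When I try to fetch data from a URL that contains Hebrew characters, the host returns an error (invalid link).
For example: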
file_get_contents('http://domain.com/page/עברית');
does not work for me.
Answer 0 (score: 0)
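URLs cannot contain raw non-ASCII characters such as Hebrew letters. They must be percent-encoded (URL-encoded) first. Your browser may display them as readable characters, but that is only the browser making the address look nicer. RFC 3986, section 2.5 describes the encoding: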
When a new URI scheme defines a component that represents textual
data consisting of characters from the Universal Character Set [UCS],
the data should first be encoded as octets according to the UTF-8
character encoding [STD63]; then only those octets that do not
correspond to characters in the unreserved set should be percent-
encoded. For example, the character A would be represented as "A",
the character LATIN CAPITAL LETTER A WITH GRAVE would be represented
as "%C3%80", and the character KATAKANA LETTER A would be represented
as "%E3%82%A2".
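
As a minimal sketch of how that applies to the URL in the question, assuming the PHP source file is saved as UTF-8: encode only the path segment with rawurlencode() and then pass the result to file_get_contents(). The helper name build_page_url is made up for illustration, and domain.com is just the placeholder host from the question.

```php
<?php
// Hypothetical helper: percent-encodes one path segment and appends it
// to a base URL. The segment must already be a UTF-8 string.
function build_page_url(string $base, string $segment): string
{
    // rawurlencode() follows RFC 3986: every byte outside the unreserved
    // set (letters, digits, "-", ".", "_", "~") becomes %XX.
    return rtrim($base, '/') . '/' . rawurlencode($segment);
}

$url = build_page_url('http://domain.com/page', 'עברית');
echo $url, PHP_EOL; // http://domain.com/page/%D7%A2%D7%91%D7%A8%D7%99%D7%AA

// The three examples from the RFC text quoted above:
echo rawurlencode('A'), PHP_EOL;  // A
echo rawurlencode('À'), PHP_EOL;  // %C3%80
echo rawurlencode('ア'), PHP_EOL; // %E3%82%A2

// The encoded URL can then be passed to file_get_contents():
// $html = file_get_contents($url);
```

Only the segment is encoded, not the whole URL, because encoding the full string would also escape the ":" and "/" that give the URL its structure. rawurlencode() is used rather than urlencode() because the latter turns spaces into "+", which is meant for query strings rather than paths.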