Question

I'm trying to parse a Chinese website with Simple HTML DOM (http://simplehtmldom.sourceforge.net), but every Chinese character in the parsed output comes out as unrecognizable symbols.

Example: “星洲网” becomes “æ˜Ÿæ´²ç¶²”

How can I parse UTF-8 characters with Simple HTML DOM? Or am I doing something wrong in my code?

Here is my PHP code:

<?php
require_once ("simple_html_dom.php");

$html = file_get_html("http://www.sinchew-i.com");
print $html->plaintext;
?>

Answer 1

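The garbled text is what UTF-8 bytes look like when the browser decodes them as a single-byte charset, so the parsed data itself is fine. Declare the encoding before printing anything:

header('Content-Type: text/html; charset=utf-8');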
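
For context, here is a minimal sketch of where that call might sit in the script from the question. It assumes simple_html_dom.php is in the same directory and that the target page is itself served as UTF-8; the failure check is my addition and is not part of the original code.

```php
<?php
// Assumption: simple_html_dom.php sits next to this script, as in the question.
require_once "simple_html_dom.php";

// header() must run before any output is sent, so call it first.
header('Content-Type: text/html; charset=utf-8');

$html = file_get_html("http://www.sinchew-i.com");

// Added check (not in the original): file_get_html() returns false on failure.
if ($html === false) {
    exit("Could not fetch or parse the page.");
}

print $html->plaintext;
```

If the page were served in some other encoding (for example GBK or Big5), the header alone would not be enough and the text would need converting to UTF-8 first.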