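I have successfully generated an MS Word document in PHP. Into this document I insert HTML code that I get from a rich-text editor (TinyMCE). However, some extra, unexpected characters show up in the result, so I suspect an encoding problem.
Here is the PHP code: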
$content = $_GET['content'];
$filename = './cases/mydocument.htm';
$output = "<html xmlns:o='urn:schemas-microsoft-com:office:office' xmlns:w='urn:schemas-microsoft-com:office:word' xmlns='http://www.w3.org/TR/REC-html40'>";
$output .= "<head><title>Mon document</title>";
$output .= "<!--[if gte mso 9]>";
$output .= "<xml><w:WordDocument><w:View>Print</w:View><w:Zoom>100</w:Zoom><w:DoNotOptimizeForBrowser/></w:WordDocument></xml>";
$output .= "<![endif]-->";
$output .= "<link rel=File-List href=\"mydocument_files/filelist.xml\">";
$output .= "<style><!-- ";
$output .= "@page";
$output .= "{";
$output .= " size:21cm 29.7cm; /* A4 */";
$output .= " margin:1cm 1cm 1cm 1cm; /* Margins: 1 cm on each side */";
$output .= " mso-page-orientation: portrait; ";
$output .= " mso-header: url(\"mydocument_files/headerfooter.htm\") h1;";
$output .= " mso-footer: url(\"mydocument_files/headerfooter.htm\") f1; ";
$output .= "}";
$output .= "@page Section1 { }";
$output .= "div.Section1 { page:Section1; }";
$output .= "p.MsoHeader, p.MsoFooter { border: none; }";
$output .= "--></style>";
$output .= "</head>";
$output .= "<body>";
$output .= "<div class=Section1>";
$output .= $content;
$output .= "</div>";
$output .= "</body>";
$output .= "</html>";
file_put_contents($filename, $output);
class mime10class
{
    private $data;
    const boundary='----=_NextPart_ERTUP.EFETZ.FTYIIBVZR.EYUUREZ';
    function __construct() { $this->data="MIME-Version: 1.0\nContent-Type: multipart/related; boundary=\"".self::boundary."\"\n\n"; }
    public function addFile($filepath,$contenttype,$data)
    {
        $this->data = $this->data.'--'.self::boundary."\nContent-Location: http://www.monsite.com/dev1/cases/".preg_replace('!\\\!', '/', $filepath)."\nContent-Transfer-Encoding: base64\nContent-Type: ".$contenttype."\n\n";
        $this->data = $this->data.base64_encode($data)."\n\n";
    }
    public function getFile() { return $this->data.'--'.self::boundary.'--'; }
}
$doc = New mime10class();
$doc->addFile('mydocument.htm','text/html; charset="utf-8"',$output);
$output_encoded = $doc->getFile();
$filename = './cases/mydocument.doc';
file_put_contents($filename, $output_encoded);
For example, my editor returns:
<p>test <em>italique</em></p>
I have checked that $content contains exactly this HTML fragment. But after the first file_put_contents call, I get this:
test italique
I tried applying utf8_encode to $output before file_put_contents, and the result got even worse:
test Â,Ãitalique
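One possible reading of the output above, offered as an assumption rather than a confirmed diagnosis: if the editor's HTML already contains UTF-8 text (for instance a non-breaking space), running it through utf8_encode encodes it a second time. The sketch below reproduces that effect with mb_convert_encoding from ISO-8859-1 to UTF-8, which is the same conversion utf8_encode performs. The sample string and the variable names are hypothetical.

```php
<?php
// Assumption: the editor output contains a non-breaking space encoded as UTF-8.
$nbsp = "\u{00A0}";                      // bytes C2 A0 in UTF-8
$html = "<p>test{$nbsp}<em>italique</em></p>";

// Same conversion utf8_encode() performs: read each byte as ISO-8859-1, write it as UTF-8.
$twice = mb_convert_encoding($html, 'UTF-8', 'ISO-8859-1');

echo bin2hex($nbsp), "\n";                                             // c2a0
echo bin2hex(mb_convert_encoding($nbsp, 'UTF-8', 'ISO-8859-1')), "\n"; // c382c2a0

// Checking first avoids the second pass when the text is already valid UTF-8.
$safe = mb_check_encoding($html, 'UTF-8')
    ? $html
    : mb_convert_encoding($html, 'UTF-8', 'ISO-8859-1');
var_dump($safe === $html); // bool(true)
```

The two bytes of the non-breaking space become four after the extra conversion, which would explain why the output gains characters instead of losing them.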
My Notepad++ is configured for UTF-8 without BOM.
Any ideas?
Answer 0 (score: 0)
Whenever you use TinyMCE, try applying:
stripslashes($_POST['content']);
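A minimal sketch of what that call does, assuming the submitted HTML arrives with backslash-escaped quotes (as it would on an old server with magic_quotes_gpc enabled, a setting removed in PHP 5.4). The sample string is hypothetical. Note that stripslashes only removes backslashes and leaves the bytes of every other character untouched, so it may not be enough on its own if the stray characters come from the encoding.

```php
<?php
// Hypothetical sample of escaped editor output; single quotes keep the backslashes literal.
$posted = '<p class=\"intro\">test <em>italique</em></p>';

// stripslashes() removes the escaping backslashes and nothing else.
$content = stripslashes($posted);

echo $content, "\n"; // <p class="intro">test <em>italique</em></p>
```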