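Converting text to UTF-8 in a PHP script

Time: 2012-04-12 21:02:02

Tags: php html

I am exporting data from a dynamic HTML table to CSV.

However, this causes some problems, because the data sometimes contains control characters and the like.

Do I need to strip all of these out, or, if possible, make them "friendly"?

I don't know how to do this, so could someone help?

Here is my script: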

<textarea name="siteurl" rows="10" cols="50">
<?php //If the form has already been submitted, display the submitted content. If not, display 'http://'.
echo isset($_GET['siteurl']) ? htmlspecialchars($_GET['siteurl']) : "http://"; ?>
</textarea><br>
<input type="submit" value="Submit">
</form>
</div>
<div id="nofloat"></div>
<table class="metadata" id="metatable_1">
<?php
error_reporting(E_ALL);
//ini_set( "display_errors", 0);
function parseUrl($url){
    //Trim whitespace of the url to ensure proper checking.
    $url = trim($url);
    //Check if a protocol is specified at the beginning of the url. If it's not, prepend 'http://'.
    if (!preg_match("~^(?:f|ht)tps?://~i", $url)) {
        $url = "http://" . $url;
    }
    //Check if '/' is present at the end of the url. If not, append '/'.
    if (substr($url, -1) !== "/") {
        $url .= "/";
    }
    //Return the processed url.
    return $url;
}
//If the form was submitted
if(isset($_GET['siteurl'])){
    //Put every new line as a new entry in the array
    $urls = explode("\n",trim($_GET["siteurl"]));
    //Iterate through urls
    foreach ($urls as $url) {
        //Parse the url to add 'http://' at the beginning or '/' at the end if not already there, to avoid errors with the get_meta_tags function
        $url = parseUrl($url);
        //Get the meta data for each url (get_meta_tags returns false on failure)
        $tags = get_meta_tags($url);
        //Check to see if the description tag was present and adjust output accordingly
        if ($tags && isset($tags['description'])) {
            echo "<tr><td>Description($url)</td><td>" . $tags['description'] . "</td></tr>";
        } else {
            echo "<tr><td>Description($url)</td><td>No Meta Description</td></tr>";
        }
    }
}
?>
</table>
<script type="text/javascript">
    var exportTable1 = new ExportHTMLTable('metatable_1');
</script>
<div>
    <input type="button" onclick="exportTable1.exportToCSV()" value="Export to CSV"/>
    <input type="button" onclick="exportTable1.exportToXML()" value="Export to XML"/>
</div>

</body>

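2 Answers:

Answer 0 (score: 1)

I'm not sure I understood the question correctly, but if all you want is a UTF-8 encoded CSV, you can call utf8_encode() on the data you write to the file.

Alternatively, if you want to omit control characters, you can check each line for control characters with ctype_cntrl() before writing it to the file, and then either strip them out with a regular expression or refuse to write the line altogether.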
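To make the control-character suggestion concrete, here is a minimal sketch. The helper names stripControlChars() and isOnlyControlChars() are hypothetical and not part of the original script. One thing to keep in mind: ctype_cntrl() returns true only when every character in the string is a control character, so on its own it will not flag a line that merely contains one. For that case a regular expression is probably the simpler tool.

```php
<?php
// Hypothetical helper (not from the original post): removes ASCII control
// characters from a string before it goes into a table cell or CSV field.
function stripControlChars(string $text): string
{
    // [\x00-\x1F\x7F] covers the C0 control range plus DEL. Tabs and
    // newlines are in that range, so they are removed too. The pattern is
    // byte-based, which is safe for UTF-8 input because multibyte UTF-8
    // sequences only use bytes >= 0x80.
    $clean = preg_replace('/[\x00-\x1F\x7F]/', '', $text);

    // preg_replace() returns null on failure; fall back to an empty string.
    return $clean ?? '';
}

// Hypothetical helper: true if the string is non-empty and consists of
// nothing but control characters (this is what ctype_cntrl() checks).
function isOnlyControlChars(string $text): bool
{
    return $text !== '' && ctype_cntrl($text);
}
```

In the question's script this could be applied as stripControlChars($tags['description']) just before the value is echoed into the table. If tabs or line breaks should survive, the character class would need to be narrowed accordingly.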

Answer 1 (score: 1)

I guess you want something like the following:
echo "<tr><td>Description($url)</td><td>" . utf8_encode($tags['description']) . "</td></tr>";

Please specify which text is displayed incorrectly. Is it $tags['description']?

Here are the manual pages for the functions you may need: mb_convert_encoding, utf8_encode
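As a rough sketch of how the two functions differ: utf8_encode() always treats its input as ISO-8859-1, so running it on text that is already UTF-8 double-encodes it, and it has been deprecated since PHP 8.2. mb_convert_encoding() lets you name the source encoding explicitly. The helper toUtf8() below is hypothetical. It assumes the mbstring extension is loaded, and that anything which is not already valid UTF-8 is ISO-8859-1, which may not hold for the pages you are fetching.

```php
<?php
// Hypothetical helper (not from the original post): returns $text as UTF-8.
// Assumption: input that is not valid UTF-8 is in $assumedSource
// (ISO-8859-1 by default). Adjust this if the fetched pages use another
// encoding such as Windows-1252.
function toUtf8(string $text, string $assumedSource = 'ISO-8859-1'): string
{
    // Leave valid UTF-8 (including plain ASCII) untouched, so that it is
    // not converted a second time.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // Convert from the assumed source encoding to UTF-8.
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}
```

In the line above it would replace the utf8_encode() call, i.e. toUtf8($tags['description']). A more reliable approach would be to read the charset the remote page declares, but that is beyond this sketch.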