Question

For example, my website is mysite.com. Here is the source of that site:

<html>
<head>
<title>site</title>
//here is many javascript and css codes
</head>
<body>
<div>
<table border="1">
<tr>
  <td><a href="somthing.html">Here is a text</td>
  <td><img src="image.gif" alt="this is image"/></td>
</tr>
</table>
</div>
</body>
</html>

How can I use PHP to get only the text and the image, without all the tags (JavaScript code, links, tables, etc.)? I only want to get "Here is a text" and "image.gif".

Answer 1

If the file is on the Internet, use PHP cURL; if it is on the local machine, you can use the file_get_contents() function.
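The two cases above can be combined in one small helper. This is a minimal sketch, not part of the original answer: the function name fetch_html, the URL check, and the 10-second timeout are assumptions chosen for illustration, and the remote branch requires the cURL extension.

```php
<?php
// Hypothetical helper: load HTML from a URL (via cURL) or from a local path.
function fetch_html(string $source): string
{
    // Local file: file_get_contents() is enough.
    if (!preg_match('#^https?://#i', $source)) {
        $contents = file_get_contents($source);
        if ($contents === false) {
            throw new RuntimeException("Cannot read file: $source");
        }
        return $contents;
    }

    // Remote page: fetch it with the cURL extension.
    $ch = curl_init($source);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow HTTP redirects
        CURLOPT_TIMEOUT        => 10,   // seconds; an arbitrary choice
    ]);
    $contents = curl_exec($ch);
    if ($contents === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    return $contents;
}
```

Calling fetch_html('file.html') reads a local file, while fetch_html('http://mysite.com/') would go through cURL.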

To remove the extra tags, you can use the following code:

$contents = file_get_contents('file.html');
$contents = strip_tags($contents, '<img>'); // keeps <img>; list more tags to keep in the same string, e.g. '<img><p>'

Alternatively, you can use the DOM approach.
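One way the DOM approach might look is sketched below, using PHP's built-in DOMDocument and DOMXPath classes. The function name extract_text_and_images and the shape of the returned array are assumptions for illustration. Unlike strip_tags(), which leaves the text inside <script> and <style> elements in place, the XPath query here skips those elements.

```php
<?php
// Hypothetical helper: collect visible text and image sources from an HTML string.
function extract_text_and_images(string $html): array
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them,
    // since real-world markup is often malformed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Non-empty text nodes inside <body>, excluding <script> and <style> content.
    $texts = [];
    $query = '//body//text()[not(ancestor::script) and not(ancestor::style)]';
    foreach ($xpath->query($query) as $node) {
        $text = trim($node->nodeValue);
        if ($text !== '') {
            $texts[] = $text;
        }
    }

    // The src attribute of every <img> element.
    $images = [];
    foreach ($xpath->query('//img[@src]') as $img) {
        $images[] = $img->getAttribute('src');
    }

    return ['texts' => $texts, 'images' => $images];
}
```

The result can be read as $result['texts'] and $result['images']. Note that stray text placed directly inside <head>, like the comment line in the question's sample, may be treated as body text by the parser and would then appear among the texts.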
