Question

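I have a txt file containing a lot of different URLs. I want to parse the list and skip some of the URLs to end up with a final list. Here is part of the list: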

http://www.example.com/example1/
http://www.example.com/example2/
http://www.example.com/example3/
http://www.example.com/example4/
http://www.example.com/example.js
http://www.example.com/example.css
http://www.example.com/example1.js?v=123
http://www.example.com/{path}
http://www.example.com/feed/
http://www.example.com/?p=66

I want to skip every URL that contains something like .js, .css, {path}, /feed/ or ?p=66, and write everything else back out to a txt file. I'd like to do this in PHP. Any suggestions?

Answer 1

<?php 

  $list = "http://www.example.com/example1/
http://www.example.com/example2/
http://www.example.com/example3/
http://www.example.com/example4/
http://www.example.com/example.js
http://www.example.com/example.css
http://www.example.com/example1.js?v=123
http://www.example.com/{path}
http://www.example.com/feed/
http://www.example.com/?p=66";

  $arr = preg_split("/[\r\n]+/",$list);

  // check our input array
  print_r($arr);

  // keep only the URLs that do not end with one of the unwanted patterns
  $map = array();
  foreach($arr as $v){
    if(!preg_match("/({path}|\.(js|css)|\?p=\d+|\/feed\/)$/",$v)){
      $map[] = $v;
    }
  }

  // check our output array
  print_r($map);

?>

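This assumes you want to keep the URLs that do not end with {path}, .css, .js, ?p=## (where # is a digit) or /feed/. That is why /example1.js?v=123 is still kept: it ends with ?v=123, not with .js. To make the pattern match anywhere in the string instead of only at the end, remove the $ from the end of the regular expression (right after \/feed\/ and its closing parenthesis).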
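The question also asks for a txt file as input and output, which the snippet above does not cover. Below is a minimal sketch of one way to do that, assuming PHP 7.1 or later. The names filter_urls and filter_url_file and the $anywhere flag are hypothetical, introduced only for illustration. The flag switches between the end-anchored pattern and the same pattern without the trailing $.

```php
<?php
// Hypothetical helper: return the URLs that do NOT match the skip pattern.
// $anywhere = false keeps the trailing $ (match only at the end of the URL);
// $anywhere = true drops it, so the pattern may match anywhere in the URL.
function filter_urls(array $urls, bool $anywhere = false): array
{
    $pattern = '/(\{path\}|\.(js|css)|\?p=\d+|\/feed\/)' . ($anywhere ? '' : '$') . '/';
    // PREG_GREP_INVERT returns the entries that do not match;
    // array_values() reindexes the result from 0.
    return array_values(preg_grep($pattern, $urls, PREG_GREP_INVERT));
}

// Hypothetical helper: read URLs from one text file, write the kept ones to another.
function filter_url_file(string $inFile, string $outFile, bool $anywhere = false): void
{
    // file() returns one array entry per line; the flags drop newlines and blank lines.
    $urls = file($inFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($urls === false) {
        throw new RuntimeException("Cannot read $inFile");
    }
    // trim() removes stray whitespace such as a leftover carriage return.
    $urls = array_map('trim', $urls);
    file_put_contents($outFile, implode(PHP_EOL, filter_urls($urls, $anywhere)) . PHP_EOL);
}
```

For example, filter_url_file('urls.txt', 'filtered.txt') would apply the end-anchored pattern, and passing true as the third argument would also drop URLs such as /example1.js?v=123. Both file names are placeholders.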
