使用PHP删除以©(版权)符号开头的行

时间:2015-07-27 23:42:08

标签: php regex preg-replace

背景:我们正在合并一些文档的页面,因此它们看起来像一个不错的长页面,而不是分成几百个。为此,我们需要从每页底部删除页码,HR标签,版权声明,然后手动将版权声明添加到最终页面。我们找到了一个识别页脚的简单模式,并在下面进行了概述。

要清理页脚,我正在尝试删除br和版权符号之间的所有文本以及关闭标记

In the beginning the universe was created.
<br/>© 2010 Some message here<br/>
<hr/>
<a name=3></a>
This has made a lot of people very angry and been widely regarded as a bad move.

预期结果:

In the beginning the universe was created.
This has made a lot of people very angry and been widely regarded as a bad move.

我发现的最有希望的代码是:PHP function to delete all between certain character(s) in string

但是当我尝试使用它时,我没有得到匹配。

    $contents = delete_all_between('<br/>©', '</a>', $contents);
    $contents = delete_all_between('<br/>&#169;', '</a>', $contents);

我尝试过使用©符号以及&amp; #169;以及其他一些变化,但我没有想法。

我怀疑这很简单,希望有人可以让我摆脱困境。

2 个答案:

答案 0 :(得分:1)

这可以使用PHP中的正则表达式完成。这是一个例子:

@protocol DidSomething
  -(void)userDidSomething:(NSString*)something
@end


ClassA <DidSomething>
  -(void)userDidSomething:(NSString*)something
  {
    NSLog(@"The user did something %@",something);
  }


ClassB <DidSomething>
  -(void)userDidSomething:(NSString*)something
  {
    [self.delegate userDidSomething:something];
  }


ClassC <DidSomething>
  -(void)thatWasInteresting
  {
    [self.delegate userDidSomething:@"Cool"];
  }

简单地说,上面的代码将替换以$text = "All of your stuff. @This will be deleted"; echo preg_replace("/(@.+)(<)/", "", $text); 开头的所有代码和结束标记。

答案 1 :(得分:1)

检查您使用的文档的编码。创建版权符号的更常见方法是&copy;  Copyright encodings