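How do I delete strings from a text file if they don't match a user-input string?

Asked: 2011-04-04 14:14:01

Tags: java file-io

Let's say I have the user input "placeofjo.blogspot.com".

My code extracts the links from this website and puts them in a text file.

The text file now contains the following: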

http://www.twitter.com/jozefinfin/
http://www.facebook.com/jozefinfin/
http://placeofjo.blogspot.com/2008_08_01_archive.html
http://placeofjo.blogspot.com/2008_09_01_archive.html
http://placeofjo.blogspot.com/2008_10_01_archive.html
http://placeofjo.blogspot.com/2008_11_01_archive.html
http://placeofjo.blogspot.com/2008_12_01_archive.html
http://placeofjo.blogspot.com/2009_01_01_archive.html
http://placeofjo.blogspot.com/2009_02_01_archive.html
http://placeofjo.blogspot.com/2009_03_01_archive.html
http://placeofjo.blogspot.com/2009_04_01_archive.html
http://placeofjo.blogspot.com/2009_05_01_archive.html
http://placeofjo.blogspot.com/2009_06_01_archive.html
http://placeofjo.blogspot.com/2009_07_01_archive.html
http://placeofjo.blogspot.com/2009_08_01_archive.html
http://placeofjo.blogspot.com/2009_09_01_archive.html
http://placeofjo.blogspot.com/2009_10_01_archive.html
http://placeofjo.blogspot.com/2009_11_01_archive.html
http://placeofjo.blogspot.com/2010_01_01_archive.html
http://placeofjo.blogspot.com/2010_02_01_archive.html
http://placeofjo.blogspot.com/2010_04_01_archive.html
http://placeofjo.blogspot.com/2010_06_01_archive.html
http://placeofjo.blogspot.com/2010_07_01_archive.html
http://placeofjo.blogspot.com/2010_08_01_archive.html
http://placeofjo.blogspot.com/2010_10_01_archive.html
http://placeofjo.blogspot.com/2010_11_01_archive.html
http://placeofjo.blogspot.com/2011_01_01_archive.html
http://placeofjo.blogspot.com/2011_02_01_archive.html
http://placeofjo.blogspot.com/2011_03_01_archive.html
http://endlessdance.blogspot.com
http://blogskins.com/me/aaaaaa
http://weheartit.com

I want to remove

http://www.twitter.com/jozefinfin/
http://www.facebook.com/jozefinfin/
http://endlessdance.blogspot.com
http://blogskins.com/me/aaaaaa
http://weheartit.com

and keep only the strings that match the user input. How can I do this?

Desired content of the text file:

    http://placeofjo.blogspot.com/2008_08_01_archive.html
    http://placeofjo.blogspot.com/2008_09_01_archive.html
    http://placeofjo.blogspot.com/2008_10_01_archive.html
    "                    "
    "                    "

4 Answers:

Answer 0: (score: 1)

  1. Read the file line by line.
  2. Check whether the line contains the user input.
  3. If it does, write it to a new file.
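
The three steps above could be sketched roughly like this. This is only a minimal sketch: the class name `LineByLineFilter`, the method name `copyMatchingLines`, and the choice of UTF-8 are my own assumptions, not something from the question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper class; the names are illustrative only.
final class LineByLineFilter {

    // Copies only the lines that contain userInput from 'in' to 'out',
    // one line at a time, and returns how many lines were kept.
    static int copyMatchingLines(Reader in, Writer out, String userInput) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        int kept = 0;
        String line;
        while ((line = reader.readLine()) != null) {      // step 1: read line by line
            if (line.contains(userInput)) {               // step 2: does it contain the input?
                out.write(line);                          // step 3: write it to the new file
                out.write(System.lineSeparator());
                kept++;
            }
        }
        out.flush();
        return kept;
    }

    // Usage: java LineByLineFilter <inputFile> <outputFile> <userInput>
    public static void main(String[] args) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            int kept = copyMatchingLines(in, out, args[2]);
            System.out.println("Kept " + kept + " matching line(s)");
        }
    }
}
```

Because only one line is held in memory at a time, this variant should also cope with files that are too large to load whole.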

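Answer 1: (score: 0)

Assuming you can hold the entire list of links in memory at once, which is likely since it is just the links from one website...

  1. Read in the file, split it on newlines, and build a list of links.
  2. Filter the list to remove any links that don't match.
  3. Write the filtered list back to the file, replacing the file's old contents.
  4. For the match in the filter, my idea is to use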

    string.indexOf(inputToMatch) >= 0 // it matches (indexOf returns -1 when the substring is absent)
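
Put together, the in-memory approach might look something like the sketch below. The class and method names (`InMemoryLinkFilter`, `keepMatching`, `filterFileInPlace`) are made up for illustration, and UTF-8 is an assumed encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class; the names are illustrative only.
final class InMemoryLinkFilter {

    // Returns only the links that contain inputToMatch, preserving their order.
    static List<String> keepMatching(List<String> links, String inputToMatch) {
        return links.stream()
                .filter(link -> link.indexOf(inputToMatch) >= 0)
                .collect(Collectors.toList());
    }

    // Reads the whole file, filters it, and overwrites the file with the result.
    static void filterFileInPlace(Path file, String inputToMatch) throws IOException {
        List<String> links = Files.readAllLines(file, StandardCharsets.UTF_8);
        Files.write(file, keepMatching(links, inputToMatch), StandardCharsets.UTF_8);
    }

    // Usage: java InMemoryLinkFilter <file> <userInput>
    public static void main(String[] args) throws IOException {
        filterFileInPlace(Paths.get(args[0]), args[1]);
    }
}
```

Note that this overwrites the original file, so it may be worth keeping a copy while testing.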
    

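Answer 2: (score: 0)

Rather than building the text file and then filtering it, apply the filter while parsing the web page. Just look for the links that meet your criteria and write only those good links to the file.

Answer 3: (score: 0)

Here is a regex way to solve this. However, you should not use this solution with large files, since it reads the whole file into memory.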
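
Since the question doesn't show the extraction code, here is only a rough sketch of what filtering at extraction time could look like. `FilteringLinkCollector` and its methods are hypothetical names, and comparing the host with `java.net.URI` instead of doing a plain substring check is my own assumption about what "meets your criteria" should mean.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical collector; call offer() from whatever loop extracts the links.
final class FilteringLinkCollector {
    private final String wantedHost;
    private final List<String> accepted = new ArrayList<>();

    FilteringLinkCollector(String wantedHost) {
        this.wantedHost = wantedHost;
    }

    // Keeps the link only if its host equals the user input.
    void offer(String link) {
        if (hostMatches(link)) {
            accepted.add(link);
        }
    }

    // True when the link parses as a URI whose host equals wantedHost (case-insensitive).
    boolean hostMatches(String link) {
        try {
            String host = new URI(link.trim()).getHost();
            return host != null && host.equalsIgnoreCase(wantedHost);
        } catch (URISyntaxException e) {
            return false; // not a parsable link, so it is skipped
        }
    }

    // The links accepted so far, in the order they were offered.
    List<String> accepted() {
        return accepted;
    }
}
```

With this in place, the extraction loop would call `collector.offer(link)` for each link it finds, and only `collector.accepted()` would ever be written to the file, so no second pass over the file is needed.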

    import java.io.File;
    import java.io.IOException;
    import java.util.regex.Pattern;
    import org.apache.commons.io.FileUtils;

    public class FileReplacer {

        public static void main(String[] args) {
            replaceFileContent();
        }

        public static void replaceFileContent() {
            try {
                // Read the whole file into memory (this is why large files are a problem).
                String allStr = FileUtils.readFileToString(new File("c:/temp/data.txt"));
                // Match every non-empty line that does NOT start with the wanted prefix,
                // together with its line ending (\r\n or \n) so no blank lines are left behind.
                Pattern pattern = Pattern.compile("^(?!http://placeofjo\\.blogspot\\.com/.*$).+$(\\r?\\n)?", Pattern.MULTILINE);
                // Delete the matched lines and write the remainder to a new file.
                String newAllStr = pattern.matcher(allStr).replaceAll("");
                FileUtils.writeStringToFile(new File("c:/temp/newdata.txt"), newAllStr);
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    }
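
The regex above hard-codes the host. If the host comes from user input, one possible way to build the same kind of pattern is sketched below. `UserInputLineRemover` and its method names are hypothetical, and the `http://` prefix and trailing `/` are assumptions carried over from the hard-coded regex. `Pattern.quote` keeps characters such as `.` in the input from being treated as regex syntax.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class UserInputLineRemover {

    // Builds a pattern matching every non-empty line that does NOT start with
    // "http://<userInput>/", including the line ending that follows it.
    static Pattern nonMatchingLines(String userInput) {
        String prefix = Pattern.quote("http://" + userInput + "/");
        return Pattern.compile("^(?!" + prefix + ").+$(\\r?\\n)?", Pattern.MULTILINE);
    }

    // Returns the text with all non-matching lines removed.
    static String removeNonMatching(String allStr, String userInput) {
        return nonMatchingLines(userInput).matcher(allStr).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "http://www.twitter.com/jozefinfin/\n"
                + "http://placeofjo.blogspot.com/2008_08_01_archive.html\n"
                + "http://weheartit.com\n";
        System.out.print(removeNonMatching(text, "placeofjo.blogspot.com"));
    }
}
```

As with the original, a link without the trailing slash (just `http://placeofjo.blogspot.com`) would be removed, so adjust the prefix if that matters.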