Question

I have extracted some text from a text file, but now I only want a few specific words from that text.

What I have tried so far is reading the text file and searching it for a keyword:

    FileReader fr = new FileReader("D:\\PDFTOEXCEL\\Extractionfrompdf.txt");
    BufferedReader br = new BufferedReader(fr);
    String s;

    String keyword = "dba COPIEFacture ";

    // Print every line that contains the keyword.
    while ((s = br.readLine()) != null) {
        if (s.contains(keyword)) {
            System.out.println(s);
        }
    }

I get output like this: dba COPIEFacture du 28/05/2018 n°10077586115Récapitulatifde表决权

But I only want 28/05/2018, so please help me.

Answer 1

You need to use String manipulation methods.

Without seeing more of the output it is hard to know the best way to do this, but you can use split() and indexOf() to retrieve the date.
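As a minimal sketch of the split() and indexOf() idea, assuming the date is always the first whitespace-separated token after the keyword: the class name SplitDateExample and the helper dateAfterKeyword are hypothetical names made up for illustration, and the sample line is a shortened copy of the output shown in the question.

```java
public class SplitDateExample {

    // Hypothetical helper: returns the first token after the keyword,
    // or null if the keyword is missing or nothing follows it.
    static String dateAfterKeyword(String line, String keyword) {
        int keywordIndex = line.indexOf(keyword);
        if (keywordIndex == -1) {
            return null;
        }
        // Everything after the keyword, with surrounding whitespace removed.
        String rest = line.substring(keywordIndex + keyword.length()).trim();
        if (rest.isEmpty()) {
            return null;
        }
        // Split on runs of whitespace; the first token is assumed to be the date.
        return rest.split("\\s+")[0];
    }

    public static void main(String[] args) {
        // Shortened sample line; the escape is the degree sign from the original output.
        String line = "dba COPIEFacture du 28/05/2018 n\u00B010077586115";
        System.out.println(dateAfterKeyword(line, "dba COPIEFacture du "));
    }
}
```

If other lines in the file place the date somewhere else, this assumption breaks and the helper would return whatever token happens to follow the keyword.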

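There are other, possibly more complex, approaches. For example, there is a StackOverflow answer about using a regular-expression pattern to retrieve a date from a string.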
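A regular-expression version might look like the sketch below. It is not taken from the linked answer; it assumes the date always has the dd/MM/yyyy shape seen in the question, and the names RegexDateExample and firstDate are hypothetical. The pattern only checks the shape of the text, not whether the date actually exists in the calendar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexDateExample {

    // Two digits, slash, two digits, slash, four digits, delimited by word boundaries.
    private static final Pattern DATE = Pattern.compile("\\b\\d{2}/\\d{2}/\\d{4}\\b");

    // Hypothetical helper: returns the first date-shaped substring, or null if there is none.
    static String firstDate(String line) {
        Matcher matcher = DATE.matcher(line);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        // Shortened sample line; the escape is the degree sign from the original output.
        String line = "dba COPIEFacture du 28/05/2018 n\u00B010077586115";
        System.out.println(firstDate(line));
    }
}
```

Unlike the keyword-based approach, this does not depend on what text precedes the date, which may help if the surrounding wording varies between files.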

Answer 2

This should solve the problem.

    import java.io.FileReader;
    import java.io.IOException;

    public class Main {

        public static void main(String[] args) {
            String keyword = "dba COPIEFacture du ";
            // Only the length of this sample is used. The length usually will
            // not change, so you can use the value 10 instead.
            String textToFind = "28/05/2018";
            StringBuilder sb = new StringBuilder();

            // try-with-resources closes the reader even if reading fails.
            try (FileReader fr = new FileReader("D:\\PDFTOEXCEL\\Extractionfrompdf.txt")) {
                // Read the whole file into memory, one character at a time.
                int i;
                while ((i = fr.read()) != -1) {
                    sb.append((char) i);
                }

                int keywordIndex = sb.indexOf(keyword);
                if (keywordIndex == -1) {
                    System.out.println("Keyword not found");
                    return;
                }

                // The date starts right after the keyword.
                int start = keywordIndex + keyword.length();
                int end = Math.min(start + textToFind.length(), sb.length());

                System.out.print(sb.substring(start, end)); // output: 28/05/2018
            } catch (IOException e) {
                // FileNotFoundException is a subclass of IOException.
                e.printStackTrace();
            }
        }
    }
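Because the fixed-length approach takes whatever 10 characters follow the keyword, it may be worth checking that the result really is a date. The sketch below is one possible check using java.time; it is an addition for illustration, not part of the answer above, and the names DateCheckExample and parseOrNull are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

public class DateCheckExample {

    // Strict resolving rejects impossible dates such as 31/02/2018.
    // "uuuu" is used for the year because strict mode needs an era with "yyyy".
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("dd/MM/uuuu").withResolverStyle(ResolverStyle.STRICT);

    // Hypothetical helper: returns the parsed date, or null if the text is not a valid date.
    static LocalDate parseOrNull(String text) {
        try {
            return LocalDate.parse(text, FORMAT);
        } catch (DateTimeParseException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("28/05/2018"));
        System.out.println(parseOrNull("31/02/2018"));
    }
}
```

A null result would indicate that the keyword was followed by something other than a dd/MM/yyyy date, so the caller can skip or report that line instead of printing unrelated text.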

Is there a built-in function in Java to remove unwanted data from extracted data?
