How to select the text between strings saved in an ArrayList

Time: 2016-04-15 18:08:05

Tags: java apache-poi apache-tika

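I can't figure this out. I am using Apache POI to select the bold text from a .doc file, and I have added that text to an ArrayList. I want to select the text in the .doc that lies between consecutive strings in the ArrayList, and then store each selected section separately.

In other words, I have this: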

MyBold title
Bla
bla
fsfs
bn
whtrh

More bold title

gfgdgdfs
dsgfd
gfdg

Another title of some kind

The resulting ArrayList gives me this:

MyBold title
More bold title
Another title of some kind

I would like to have this as separate String objects. First object:

MyBold title
    Bla
    bla
    fsfs
    bn
    whtrh

Second object:

More bold title

    gfgdgdfs
    dsgfd
    gfdg

Third object:

Another title of some kind

My code so far:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.CharacterRun;
import org.apache.poi.hwpf.usermodel.Range;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;

public class Impedance {

    public static void Update() throws IOException {
        String fileName = "/Users/IMPEDANCE.doc";
        // try-with-resources closes the stream even if reading the document fails
        try (InputStream fis = new FileInputStream(fileName)) {
            POIFSFileSystem fs = new POIFSFileSystem(fis);
            HWPFDocument doc = new HWPFDocument(fs);
            Range range = doc.getRange();

            // Collect the text of every bold character run (the titles)
            ArrayList<String> bold1 = new ArrayList<String>();
            for (int i = 0; i < range.numCharacterRuns(); i++) {
                CharacterRun cr = range.getCharacterRun(i);
                if (cr.isBold()) {
                    System.out.println(cr.text());
                    bold1.add(cr.text());
                }
            }

            for (String bold : bold1) {
                // How do I iterate through the ArrayList and return the section of
                // text between consecutive entries of the list?
            }
        }
    }
}

1 Answer:

Answer 0: (score: 0)

You can use a Map in which the key is the title and the value is a list of strings.

public boolean add(E e) {
    ensureCapacity(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}
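
The Map suggestion can be sketched roughly as follows. This is only a minimal illustration in plain Java, not the answerer's code: it assumes the document text has already been split into lines (for example, one string per paragraph) and that the bold titles are available as a list. The class name `TitleSections` and the helpers `groupByTitle` and `toStrings` are hypothetical names introduced for this example. A `LinkedHashMap` is used so the sections keep the order in which the titles appear.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TitleSections {

    // Hypothetical helper: groups each non-empty line under the most recent
    // title seen before it. Lines before the first title are skipped.
    public static Map<String, List<String>> groupByTitle(List<String> lines, List<String> titles) {
        Map<String, List<String>> sections = new LinkedHashMap<>();
        List<String> current = null;
        for (String line : lines) {
            String text = line.trim(); // assumption: surrounding whitespace is not significant
            if (text.isEmpty()) {
                continue;
            }
            if (titles.contains(text)) {
                // Start a new section (or reuse it if the same title occurs twice)
                current = sections.computeIfAbsent(text, k -> new ArrayList<>());
            } else if (current != null) {
                current.add(text);
            }
        }
        return sections;
    }

    // Hypothetical helper: turns every section into one String, title first,
    // followed by its body lines separated by newlines.
    public static List<String> toStrings(Map<String, List<String>> sections) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : sections.entrySet()) {
            StringBuilder sb = new StringBuilder(entry.getKey());
            for (String bodyLine : entry.getValue()) {
                sb.append('\n').append(bodyLine);
            }
            result.add(sb.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data taken from the question
        List<String> lines = Arrays.asList(
                "MyBold title", "Bla", "bla", "fsfs", "bn", "whtrh", "",
                "More bold title", "", "gfgdgdfs", "dsgfd", "gfdg", "",
                "Another title of some kind");
        List<String> titles = Arrays.asList(
                "MyBold title", "More bold title", "Another title of some kind");

        for (String section : toStrings(groupByTitle(lines, titles))) {
            System.out.println(section);
            System.out.println("-----");
        }
    }
}
```

With the POI code from the question, the `titles` list would be the bold strings already collected, and `lines` would have to be built from the document's paragraphs. Matching titles by exact (trimmed) text is a simplification: it breaks if a body line happens to equal a title, in which case checking the bold flag while iterating over the paragraphs would be more robust.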