My program doesn't write the correct output. How can I fix this? I want something like this:
<text>
<sentence>
<word>a</word>
<word>had</word>
<word>lamb</word>
<word>little</word>
<word>Mary</word>
</sentence>
<sentence>
<word>Aesop</word>
<word>and</word>
<word>called</word>
<word>came</word>
<word>for</word>
<word>Peter</word>
<word>the</word>
<word>wolf</word>
</sentence>
<sentence>
<word>Cinderella</word>
<word>likes</word>
<word>shoes</word>
</sentence>
</text>
But what I get is this:
<text>
<sentence>
<word>a</word>
<word>had</word>
<word>lamb</word>
<word>little</word>
<word>Mary</word>
</sentence>
</text>
Sample text:
"Mary had a little lamb. Peter called for the wolf, and Aesop came. Cinderella likes shoes.."
My class:
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.util.Arrays;

import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.Characters;
import javax.xml.stream.events.EndDocument;
import javax.xml.stream.events.EndElement;
import javax.xml.stream.events.StartDocument;
import javax.xml.stream.events.StartElement;
import javax.xml.stream.events.XMLEvent;

public class StaxWriteXmlTest {
    /**
     * @param args
     * @throws FileNotFoundException
     * @throws XMLStreamException
     */
    public static void main(String[] args) throws FileNotFoundException,
            XMLStreamException {
        String[] word = initItems();
        // xml event writer with output stream
        // XMLOutputFactory xmlOutFactory = XMLOutputFactory.newInstance();
        // OutputStream outputStream = new FileOutputStream("D:\\word.xml");
        // XMLEventWriter eventWriter = xmlOutFactory
        // .createXMLEventWriter(outputStream);
        XMLEventWriter eventWriter = XMLOutputFactory.newInstance()
                .createXMLEventWriter(System.out);
        XMLEventFactory eventFactory = XMLEventFactory.newInstance();
        XMLEvent end = createNewLine(eventFactory);
        XMLEvent tab = createTab(eventFactory);
        // Create start tag
        StartDocument startDocument = eventFactory.createStartDocument();
        EndDocument endDocument = eventFactory.createEndDocument();
        eventWriter.add(startDocument);
        // create config open tag
        eventWriter.add(end);
        StartElement configStartElement = eventFactory.createStartElement("",
                "", "text");
        eventWriter.add(configStartElement);
        eventWriter.add(end);
        eventWriter.add(tab);
        StartElement itemStartElement = eventFactory.createStartElement("", "",
                "sentence");
        eventWriter.add(itemStartElement);
        eventWriter.add(end);
        eventWriter.add(tab);
        // add words
        for (String words : word) {
            eventWriter.add(tab);
            createItemNode(eventFactory, eventWriter, "word", words);
            eventWriter.add(tab);
        }
        // eventWriter.add(tab);
        EndElement itemEndElement = eventFactory.createEndElement("", "",
                "sentence");
        eventWriter.add(itemEndElement);
        eventWriter.add(end);
        EndElement configEndElement = eventFactory.createEndElement("", "",
                "text");
        eventWriter.add(configEndElement);
        eventWriter.add(end);
        eventWriter.add(endDocument);
        eventWriter.flush();
        eventWriter.close();
    }
    public static void createItemNode(XMLEventFactory eventFactory,
            XMLEventWriter eventWriter, String elementName, String value)
            throws XMLStreamException {
        XMLEvent end = eventFactory.createDTD("\n");
        StartElement startElement = eventFactory.createStartElement("", "",
                elementName);
        eventWriter.add(startElement);
        Characters characters = eventFactory.createCharacters(value);
        eventWriter.add(characters);
        EndElement endElement = eventFactory.createEndElement("", "",
                elementName);
        eventWriter.add(endElement);
        eventWriter.add(end);
    }

    public static XMLEvent createTab(XMLEventFactory eventFactory) {
        return eventFactory.createDTD("\t");
    }

    public static XMLEvent createNewLine(XMLEventFactory eventFactory) {
        return eventFactory.createDTD("\n");
    }
    public static String[] initItems() {
        FileReader fr = null;
        try {
            fr = new FileReader("text.txt");
        } catch (FileNotFoundException e1) {
            e1.printStackTrace();
        }
        BufferedReader inputText = new BufferedReader(fr);
        String text = "", newText = "";
        String allTogether = "";
        String[] nexSplit = {};
        try {
            while ((text = inputText.readLine()) != null) {
                newText += text.replaceAll("\\s+", " ").replaceAll(" ,", ",")
                        .replaceAll(" \\.", ".").replaceAll("\\..", ".");
                allTogether = newText.replaceAll("\\s+", " ");
            }
            String[] splitText = allTogether.split("[.]");
            for (int i = 0; i < splitText.length; i++) {
                nexSplit = splitText[i].split("[ \t]");
                Arrays.sort(nexSplit, String.CASE_INSENSITIVE_ORDER);
                return nexSplit;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return nexSplit;
    }
}
Answer 0 (score: 0)
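The problem is in initItems: because of the statement return nexSplit; inside the for loop, the method returns right after the first sentence, so the remaining sentences are never processed. You have to collect the sorted words of each sentence in a List and return that list once the loop has finished. Here are the lines of initItems that need to change: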
public static List<String[]> initItems() { // RETURN A LIST
    List<String[]> sents = new ArrayList<>(); // declare new List
    // ...
    for (int i = 0; i < splitText.length; i++) {
        nexSplit = splitText[i].split("[ \t]");
        Arrays.sort(nexSplit, String.CASE_INSENSITIVE_ORDER);
        sents.add(nexSplit); // APPEND WORDS OF ANOTHER SENTENCE
    }
    return sents; // RETURN THE LIST OF WORDS-OF-A-SENTENCE
}
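To try the idea without a text.txt file, here is a minimal, self-contained sketch of the same collect-then-return approach working on an in-memory string. The class name SentenceSplitSketch and the method sortedSentences are made up for this illustration and are not part of the original class. It also assumes that commas should be dropped so that "wolf," becomes "wolf", as in the desired output; the original code does not do that.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, for illustration only.
class SentenceSplitSketch {

    /** Splits the text on '.', then sorts the words of each sentence case-insensitively. */
    static List<String[]> sortedSentences(String text) {
        List<String[]> sentences = new ArrayList<>();
        for (String sentence : text.split("[.]")) {
            // Assumption: commas are dropped so they do not stick to words.
            String trimmed = sentence.replace(",", "").trim();
            if (trimmed.isEmpty()) {
                continue; // skip empty pieces, e.g. between two consecutive dots
            }
            String[] words = trimmed.split("\\s+");
            Arrays.sort(words, String.CASE_INSENSITIVE_ORDER);
            sentences.add(words); // collect instead of returning inside the loop
        }
        return sentences;
    }

    public static void main(String[] args) {
        String sample = "Mary had a little lamb. Peter called for the wolf, "
                + "and Aesop came. Cinderella likes shoes..";
        for (String[] words : sortedSentences(sample)) {
            System.out.println(String.join(" ", words));
        }
    }
}
```

The important difference from the question's code is that the only return statement sits after the loop, so every sentence is added to the list before the method finishes.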
Of course, the main program now has to handle a List<String[]>.
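One possible way to do that is to wrap the word loop in an outer loop over the sentences, so that each String[] gets its own sentence element. The following is only a sketch under a few assumptions: the class name SentenceXmlSketch and the method toXml are invented for this example, the output goes to a StringWriter instead of System.out, and line breaks are written with createCharacters("\n") rather than the createDTD("\n") calls used in the question.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;

import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;

// Hypothetical helper class, for illustration only.
class SentenceXmlSketch {

    /** Writes one sentence element per array and one word element per entry. */
    static String toXml(List<String[]> sentences) throws XMLStreamException {
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        XMLEventFactory events = XMLEventFactory.newInstance();

        writer.add(events.createStartDocument());
        writer.add(events.createStartElement("", "", "text"));
        writer.add(events.createCharacters("\n"));
        for (String[] words : sentences) {          // outer loop: one pass per sentence
            writer.add(events.createStartElement("", "", "sentence"));
            writer.add(events.createCharacters("\n"));
            for (String word : words) {             // inner loop: the words of that sentence
                writer.add(events.createStartElement("", "", "word"));
                writer.add(events.createCharacters(word));
                writer.add(events.createEndElement("", "", "word"));
                writer.add(events.createCharacters("\n"));
            }
            writer.add(events.createEndElement("", "", "sentence"));
            writer.add(events.createCharacters("\n"));
        }
        writer.add(events.createEndElement("", "", "text"));
        writer.add(events.createEndDocument());
        writer.close();
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        List<String[]> sentences = Arrays.asList(
                new String[] {"a", "had", "lamb", "little", "Mary"},
                new String[] {"Cinderella", "likes", "shoes"});
        System.out.println(toXml(sentences));
    }
}
```

In the original main method the same change would amount to moving the code that opens and closes the sentence element inside a for loop over the list returned by initItems.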