I have some text parsed from XML that looks like this:
06:00 Vesti<br>07:15 Something Else<br>09:10 Movie<a href="..."> ... <br>15:45 Something..
...and there is a lot more of it.
Well, I did this:
String mim =ses.replaceAll("(?s)\\<.*?\\>", " \n");
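There was no other way to display the text nicely. Now, after it has been displayed for a while, I need to split the same text into separate strings, like this: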
06:00 Vesti
...or
07:15 Something Else
I tried something like this, but it doesn't work:
char[] rast = description.toCharArray();
int brojac = 0;
for(int q=0; q<description.length(); q++){
if(rast[q]=='\\' && rast[q+1]=='n' ) brojac++;
}
String[] niz = new String[brojac];
int bf1=0;
int bf2=0;
int bf3=0;
int oo=0;
for(int q=0; q<description.length(); q++){
if(rast[q]=='\\'&& rast[q+1]=='n'){
bf3=bf1;
bf1=q;
String lol = description.substring(bf3, bf1);
niz[oo]=lol;
oo++;
}
}
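I know that the indices in description.substring(bf3, bf1) are not set the way they should be, but I think that this:
if(rast[q]=='\\' && rast[q+1]=='n')
can't work that way. Is there any other solution?
Note: there is no other way to get this resource; it has to go through this.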
Answer 0 (score: 1)
Calling Html.fromHtml(String) will correctly translate <br> into \n.
String html = "06:00 Vesti<br>07:15 Something Else<br>09:10 Movie<a href=\"...\"> ... <br>15:45 Something..";
String str = Html.fromHtml(html).toString();
String[] arr = str.split("\n");
Then just split it into lines. No regexp is needed (and you shouldn't be using one to parse HTML in the first place).
Edit: turning everything into a bunch of Dates:
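The loop in the question never matches because "\n" inside a Java string literal is a single newline character, not a backslash followed by the letter n, so comparing rast[q] against '\\' finds nothing. Here is a minimal sketch of splitting on the real newline instead, using plain Java and the same tag-stripping call as the question. The class and method names (LineSplitter, splitLines) are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class LineSplitter {
    // Splits text on real newline characters and drops blank lines.
    public static List<String> splitLines(String text) {
        List<String> lines = new ArrayList<>();
        for (String line : text.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String raw = "06:00 Vesti<br>07:15 Something Else<br>09:10 Movie";
        // Same replacement as in the question: every tag becomes " \n".
        String stripped = raw.replaceAll("(?s)\\<.*?\\>", " \n");
        for (String line : splitLines(stripped)) {
            System.out.println(line);
        }
    }
}
```

Trimming each piece removes the space that the replacement leaves before every newline, and skipping empty pieces guards against two tags in a row.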
// Used to find the HH:mm, in case the input is wonky
Pattern p = Pattern.compile("([0-2][0-9]:[0-5][0-9])");
SimpleDateFormat fmt = new SimpleDateFormat("HH:mm");
SortedMap<Date, String> programs = new TreeMap<Date, String>();
for (String row : arr) {
Matcher m = p.matcher(row);
if (m.find()) {
// We found a time in this row
ParsePosition pp = new ParsePosition(m.start(0));
Date when = fmt.parse(row, pp);
String title = row.substring(pp.getIndex()).trim();
programs.put(when, title);
}
}
// Now programs contain the sorted list of programs. Unfortunately, since
// SimpleDateFormat is stupid, they're all placed back in 1970 :-D.
// This would give you an ordered printout of all programs *AFTER* 08:00
Date filter = fmt.parse("08:00");
SortedMap<Date, String> after0800 = programs.tailMap(filter);
// Since this is a SortedMap, after0800.values() will return the program names in order.
// You can also iterate over each entry like so:
for (Map.Entry<Date,String> program : after0800.entrySet()) {
// You can use the SimpleDateFormat to pretty-print the HH:mm again.
System.out.println("When:" + fmt.format(program.getKey()));
System.out.println("Title:" + program.getValue());
}
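If the 1970 dates are a nuisance, one possible alternative is to key the map by java.time.LocalTime, which has no date part at all. This is a swapped-in technique rather than what the answer above uses, and it assumes a runtime that ships java.time (Java 8 or later). The class and method names (ProgramGuide, parse) are hypothetical, and the time pattern is tightened to 00:00 through 23:59 so that LocalTime.parse does not throw on a value such as 29:00.

```java
import java.time.LocalTime;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ProgramGuide {
    // group(1) is a valid HH:mm time, group(2) is the rest of the row.
    private static final Pattern ROW =
            Pattern.compile("((?:[01][0-9]|2[0-3]):[0-5][0-9])\\s*(.*)");

    // Builds a map sorted by start time; rows without a time are skipped.
    public static SortedMap<LocalTime, String> parse(String[] rows) {
        SortedMap<LocalTime, String> programs = new TreeMap<>();
        for (String row : rows) {
            Matcher m = ROW.matcher(row);
            if (m.find()) {
                programs.put(LocalTime.parse(m.group(1)), m.group(2).trim());
            }
        }
        return programs;
    }

    public static void main(String[] args) {
        String[] rows = {"06:00 Vesti", "07:15 Something Else", "09:10 Movie", "15:45 Something.."};
        SortedMap<LocalTime, String> programs = parse(rows);
        // tailMap is inclusive: it returns entries at 08:00 or later.
        for (var entry : programs.tailMap(LocalTime.of(8, 0)).entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}
```

As with the Date version, two programs that start at the same time would overwrite each other in the map.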
Answer 1 (score: 0)
Use a regular expression:
List<String> results = new ArrayList<String>();
// Match "HH:mm", a space, then everything up to the next tag
Pattern pattern = Pattern.compile("\\d+:\\d+ [^<]+");
Matcher matcher = pattern.matcher("06:00 Vesti<br>07:15 Something Else<br>09:10 Movie<a href=\"...\"> ... <br>15:45 Something..");
while(matcher.find()) {
results.add(matcher.group(0));
}
results will end up as a list of strings:
results = List[
"06:00 Vesti",
"07:15 Something Else",
"09:10 Movie",
"15:45 Something.."]
See the Regex Java Tutorial to learn how Java's regular expression library works.
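If the time and the title are needed separately, the same idea can be sketched with two capture groups. This is only an illustration under the assumption that every entry is "HH:mm", whitespace, then a title that runs up to the next tag. The class and method names (ScheduleRegex, extract) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScheduleRegex {
    // group(1) is the HH:mm time, group(2) is the title up to the next '<'.
    private static final Pattern ENTRY = Pattern.compile("(\\d{2}:\\d{2})\\s+([^<]+)");

    // Returns one {time, title} pair per schedule entry found in the input.
    public static List<String[]> extract(String html) {
        List<String[]> entries = new ArrayList<>();
        Matcher m = ENTRY.matcher(html);
        while (m.find()) {
            entries.add(new String[] {m.group(1), m.group(2).trim()});
        }
        return entries;
    }

    public static void main(String[] args) {
        String html = "06:00 Vesti<br>07:15 Something Else<br>09:10 Movie<a href=\"...\"> ... <br>15:45 Something..";
        for (String[] entry : extract(html)) {
            System.out.println(entry[0] + " | " + entry[1]);
        }
    }
}
```

Text inside or between tags that does not start with a time, such as the link in the sample, is simply skipped because the pattern cannot match there.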