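Today I ran into this problem while using the JSoup library. The site contains the data I need, but it sits in identical divs that have no class, only a 'style' attribute. I need to get the numeric width value.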
<div style="height: 12px; width: 196px; background-color: #5C1; float: left; border-right: 1px solid #111;"></div>
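There are two outcomes: 1. the text is printed, but only for one div; 2. the text is printed for all divs, but with a lot of duplicates, each of which is treated as a new line.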
try {
    Document doc = Jsoup.connect("http://www.lolking.net/summoner/euw/34201718").get(); // random player
    Elements elem = doc.getElementsByTag("div");
    Scanner scn = new Scanner(elem.toString());
    while (scn.hasNext()) {
        String res = scn.nextLine();
        if (res.contains("<div style=\"height: 12px; width: ") && res.contains("px; background-color: #5C1; float: left; border-right: 1px solid #111;\"></div>")) {
            // sd (declared outside this snippet) is there to stop the flood of output.
            // I understand this check can be a problem, but if I remove it I get a flood
            // of other numbers (duplicates) that I cannot remove, because the program
            // treats each line as a separate one.
            if (sd == 0) {
                String t1 = res.replace(" <div style=\"height: 12px; width: ", "");
                String t2 = t1.replace("px; background-color: #5C1; float: left; border-right: 1px solid #111;\"></div> ", "");
                System.out.println(t2); // prints 192... and the rest
                sd += 1;
            }
        }
    }
}
catch (IOException e) {
    e.printStackTrace();
}
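I spent the whole day trying different solutions, but I always end up with one of these two outcomes. I only recently started learning Java. Thanks.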
Answer 0 (score: 1)
If I understand correctly, you need the width from the style attribute of every matching div.
You can certainly do this with Jsoup:
Edited code:
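// Imports needed at the top of the file:
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Style declarations that identify the divs we are interested in.
Set<String> known = new HashSet<String>();
known.add("height: 12px");
known.add("background-color: #5C1");
known.add("float: left");
known.add("border-right: 1px solid #111");
// connect(...).get() throws IOException, so call this from a method that handles it.
Document doc = Jsoup.connect("http://www.lolking.net/summoner/euw/34201718").get();
Elements elements = doc.select("div");
for (Element e : elements) {
    if (e.hasAttr("style")) {
        // Split the style attribute into trimmed declarations.
        Set<String> splitted = new HashSet<String>();
        for (String s : Arrays.asList(e.attr("style").split(";"))) {
            splitted.add(s.trim());
        }
        // Keep only divs that carry all the known declarations.
        if (splitted.containsAll(known)) {
            splitted.removeAll(known);
            for (String s : splitted) {
                if (s.startsWith("width:")) {
                    System.out.println(s); // the whole declaration, e.g. "width: 196px"
                }
            }
        }
    }
}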
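The loop above prints the whole declaration (for example "width: 196px"), while the question asks for the number itself. A minimal sketch of that last step follows, using only the standard library. The class name StyleWidthDemo and the helpers parseStyle and widthPx are hypothetical names introduced for illustration, not part of Jsoup, and the sketch assumes the width is always a whole number of pixels.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.OptionalInt;

// Hypothetical helper class, not part of Jsoup.
class StyleWidthDemo {

    // Split "a: b; c: d" into {a=b, c=d}; declarations without a colon are skipped.
    static Map<String, String> parseStyle(String style) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String decl : style.split(";")) {
            int colon = decl.indexOf(':');
            if (colon < 0) {
                continue;
            }
            String name = decl.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = decl.substring(colon + 1).trim();
            if (!name.isEmpty()) {
                props.put(name, value);
            }
        }
        return props;
    }

    // Return the width in pixels, or empty when there is no "width: <integer>px".
    static OptionalInt widthPx(String style) {
        String width = parseStyle(style).get("width");
        if (width == null || !width.endsWith("px")) {
            return OptionalInt.empty();
        }
        try {
            String digits = width.substring(0, width.length() - 2).trim();
            return OptionalInt.of(Integer.parseInt(digits));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        String style = "height: 12px; width: 196px; background-color: #5C1; "
                + "float: left; border-right: 1px solid #111;";
        System.out.println(widthPx(style).getAsInt()); // prints 196
    }
}
```

Inside the Jsoup loop, the same helper could be applied as widthPx(e.attr("style")), which yields an int instead of a string that still has to be trimmed by hand.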