I'm having a problem splitting a string.
Here is my code:
Document doc = null;
String name = "MasterEjzz";
try {
    doc = Jsoup.connect("http://oc.tc/forums").userAgent("Mozilla").get();
} catch (IOException e) {
    //e.printStackTrace();
    System.out.print("Seems like something went wrong! Are you connected to the internet?");
}
Elements content = doc.getElementsByClass("topic");
Elements post = content.select("div");
for (Element a : content.select("div")) {
    Elements href = a.select("a");
    for (Element link : href) {
        // abs:href resolves the link against the page's base URL
        String links = link.attr("abs:href");
        String[] b = links.split("https");
        System.out.print(links);
        //System.out.print(b.toString());
    }
}
I want to split the string links on the word https, but when I do, b[0] returns nothing and b[1] throws an OutOfBounds exception. This is what the string links returns:
https://oc.tc/forums/topics/523b038faf7fb046f700255dhttps://oc.tc/tjandralalahttps://oc.tc/forums/posts/523b2131af7fb0557a002882https://oc.tc/forums/posts/523b2131af7fb0557a002882https://oc.tc/forums/topics/51d1cb3cba6087dd20003a35https://oc.tc/ENSIONMANhttps://oc.tc/forums/posts/523b2117af7fb030690027f7https://oc.tc/forums/posts/523b2117af7fb030690027f7https://oc.tc/forums/topics/519c1971a87858d604004c3ahttps://oc.tc/MadCreeper77https://oc.tc/forums/posts/523b20bdaf7fb010920026a2https://oc.tc/forums/posts/523b20bdaf7fb010920026a2https://oc.tc/forums/topics/51ff8a7daf7fb0053a001fe0https://oc.tc/MadCreeper77https://oc.tc/forums/posts/523b1f8aaf7fb001bf002756https://oc.tc/forums/posts/523b1f8aaf7fb001bf002756https://oc.tc/forums/topics/5237d369af7fb0b81600038dhttps://oc.tc/zacharycraft777https://oc.tc/forums/posts/523b1f72af7fb033ab00259fhttps://oc.tc/forums/posts/523b1f72af7fb033ab00259fhttps://oc.tc/forums/topics/523a5de9af7fb062c7001cf1https://oc.tc/lonelyhornethttps://oc.tc/forums/posts/523b1de7af7fb074e0002416https://oc.tc/forums/posts/523b1de7af7fb074e0002416https://oc.tc/forums/topics/5238ff9aaf7fb001bf0000cahttps://oc.tc/lonelyhornethttps://oc.tc/forums/posts/523b1d2baf7fb02dbc0025echttps://oc.tc/forums/posts/523b1d2baf7fb02dbc0025echttps://oc.tc/forums/topics/5235f53baf7fb04c5100170ehttps://oc.tc/Kevinthedude2000https://oc.tc/forums/posts/523b1b69af7fb01783002714https://oc.tc/forums/posts/523b1b69af7fb01783002714https://oc.tc/forums/topics/522bcb94af7fb05fdc000ec8https://oc.tc/skippy369https://oc.tc/forums/posts/523b19cfaf7fb06378002384https://oc.tc/forums/posts/523b19cfaf7fb06378002384https://oc.tc/forums/topics/523aebe7af7fb0dafa0024d7https://oc.tc/MrAmazing1337https://oc.tc/forums/posts/523b1867af7fb01dde0028e6https://oc.tc/forums/posts/523b1867af7fb01dde0028e6https://oc.tc/forums/topics/523b0f8caf7fb0a8240022e2https://oc.tc/Eulenspielerhttps://oc.tc/forums/posts/523b185daf7fb0dafa002822https://oc.tc/forums/posts/523b185daf7fb0dafa002822https://oc.tc/forum
s/topics/5239058daf7fb06708000191https://oc.tc/ENSIONMANhttps://oc.tc/forums/posts/523b1787af7fb01092002585https://oc.tc/forums/posts/523b1787af7fb01092002585https://oc.tc/forums/topics/52388f49af7fb0413f000e7dhttps://oc.tc/zacharycraft777https://oc.tc/forums/posts/523b1701af7fb02bf300283chttps://oc.tc/forums/posts/523b1701af7fb02bf300283chttps://oc.tc/forums/topics/5237b7d7af7fb0440f0001b9https://oc.tc/ENSIONMANhttps://oc.tc/forums/posts/523b14a8af7fb0ccc5002285https://oc.tc/forums/posts/523b14a8af7fb0ccc5002285https://oc.tc/forums/topics/5237f69daf7fb040dc0006d1https://oc.tc/ENSIONMANhttps://oc.tc/forums/posts/523b141eaf7fb0c73b002647https://oc.tc/forums/posts/523b141eaf7fb0c73b002647https://oc.tc/forums/topics/51bd5e6eba6087d4e60020efhttps://oc.tc/iLiftinghttps://oc.tc/forums/posts/523b1413af7fb0ccc5002270https://oc.tc/forums/posts/523b1413af7fb0ccc5002270https://oc.tc/forums/topics/51de6b74af7fb0a091004b40https://oc.tc/Haxasauroushttps://oc.tc/forums/posts/523b138eaf7fb0fbf9002313https://oc.tc/forums/posts/523b138eaf7fb0fbf9002313https://oc.tc/forums/topics/5196b74ca87858886a003a43https://oc.tc/Shadowbladzhttps://oc.tc/forums/posts/523b1201af7fb056ab002297https://oc.tc/forums/posts/523b1201af7fb056ab002297https://oc.tc/forums/topics/523b0f4daf7fb01dde002803https://oc.tc/1234notty1234https://oc.tc/forums/posts/523b0f4daf7fb01dde002802https://oc.tc/forums/posts/523b0f4daf7fb01dde002802https://oc.tc/forums/topics/52281f11af7fb0e4ed00423dhttps://oc.tc/Eldnickhttps://oc.tc/forums/posts/523b0f1caf7fb046f7002681https://oc.tc/forums/posts/523b0f1caf7fb046f7002681
Answer 0 (score: 0)
Use the StringUtils.splitByWholeSeparatorPreserveAllTokens method (StringUtils comes from Apache Commons Lang):
String https = "https://oc.tc/Haxasauroushttps://oc.tc/Haxasauroushttps://oc.tc/Haxasauroushttps://oc.tc/Haxasauroushttps://oc.tc/Haxasauroushttps://oc.tc/Haxasauroushttps://oc.tc/Haxasauroushttps://oc.tc/Haxasauroushttps://oc.tc/Haxasauroushttps://oc.tc/Haxasauroushttps://oc.tc/Haxasaurous";
String[] strArr = StringUtils.splitByWholeSeparatorPreserveAllTokens(
        https, "https");
for (String str : strArr) {
    System.out.println(str);
}
Answer 1 (score: 0)
Have you checked whether the call to .attr() returns null or ""?
String links = link.attr("abs:href");
if (links != null && !links.equals("")) {
    String[] b = links.split("https");
    for (String path : b) {
        // skip the empty piece that precedes a leading "https"
        if (!path.equals(""))
            System.out.println(path);
    }
}
After splitting "abcdabceabcf" on "abc", you get the array ["", "d", "e", "f"].
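The two cases mentioned in this answer (a separator at the very start of the string, and an empty string) can be checked with a minimal sketch like the one below. The class name SplitDemo and the helper splitOn are hypothetical names used only for illustration. Note that String.split treats its argument as a regular expression, which makes no difference for plain separators such as "abc" or "https".

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original question.
public class SplitDemo {

    // Thin wrapper around String.split so the result can be inspected.
    // The separator is interpreted as a regular expression.
    static String[] splitOn(String input, String separator) {
        return input.split(separator);
    }

    public static void main(String[] args) {
        // Separator at the start: the leading empty string is kept.
        System.out.println(Arrays.toString(splitOn("abcdabceabcf", "abc")));

        // Empty input: the result is a one-element array holding "",
        // so index 0 is empty and index 1 is out of bounds.
        System.out.println(splitOn("", "https").length);
    }
}
```

If .attr("abs:href") yields an empty string for some link, this second case is the one that leads to an empty b[0] and an exception on b[1].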
Answer 2 (score: 0)
I tested your problem with the following code:
import java.util.Arrays;
public class Test {
public static void main(String[] args) {
String str = "https://oc.tc/forums/topics/523b038faf7fb046f700255dhttps://oc.tc/tjandralala";
String[] split = str.split("https");
System.out.println(Arrays.toString(split));
}
}
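I got the following output: [, ://oc.tc/forums/topics/523b038faf7fb046f700255d, ://oc.tc/tjandralala]
The first element is not nothing. It is the empty string (λ). Basically, Java treats your string as λhttps://oc.tc/forums/topics/523b038faf7fb046f700255d, so the first piece of the split is the empty string, followed by ://...
If you need https to be included in the pieces, try StringUtils.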
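As a possible alternative that needs only the standard library, the string can be split with a zero-width lookahead so that each piece keeps its https prefix. This is a sketch under assumptions, not something the answer itself proposes: the class name KeepPrefixSplit and the helper splitKeepingHttps are hypothetical, and the behavior shown assumes Java 8 or later, where a zero-width match at the start of the input does not produce a leading empty string.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original answer.
public class KeepPrefixSplit {

    // Splits just before every occurrence of "https". The lookahead
    // (?=https) matches a position, not characters, so nothing is removed
    // and each piece still starts with "https".
    static String[] splitKeepingHttps(String input) {
        return input.split("(?=https)");
    }

    public static void main(String[] args) {
        String str = "https://oc.tc/forums/topics/523b038faf7fb046f700255dhttps://oc.tc/tjandralala";
        for (String url : splitKeepingHttps(str)) {
            System.out.println(url);
        }
    }
}
```

With the test string from this answer, each printed line is a complete URL rather than a fragment starting with ://.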
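Answer 3 (score: 0)
When run against the host address shown in the question, JSoup returns links as
https://oc.tc/forums/topics/523b25afaf7fb08f4b002479
i.e. a single URL per iteration, so splitting it on "https" yields only one non-empty element.
Solution: check the array's length before trying to access a specific element or iterating over the returned elements.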
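The length check suggested here could look roughly like the sketch below. The class name SafeSplit, the helper pieceOrDefault and the fallback value are hypothetical and chosen only for illustration.

```java
// Hypothetical demo class, not part of the original answer.
public class SafeSplit {

    // Returns the piece at the given index, or the fallback when the split
    // produced too few pieces for that index to exist.
    static String pieceOrDefault(String input, String separator, int index, String fallback) {
        String[] pieces = input.split(separator);
        if (index >= 0 && index < pieces.length) {
            return pieces[index];
        }
        return fallback;
    }

    public static void main(String[] args) {
        // A normal link: index 1 exists and holds the text after "https".
        System.out.println(pieceOrDefault("https://oc.tc/tjandralala", "https", 1, "(none)"));

        // An empty link: only index 0 exists, so the fallback is returned
        // instead of an exception being thrown.
        System.out.println(pieceOrDefault("", "https", 1, "(none)"));
    }
}
```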