I have strings like the one below, and they contain a lot of junk data that I don't want:
"http://v20.lscache8.c.youtube.com/videoplayback?id=271de9756065677e&itag=17&ip=0.0.0.0&ipbits=0&expire=999999999999999999"&sparams=ip,ipbits,expireip,ipbits,expire,id,itag&signature=3DCD3F79E045F95B6AF661765F046FB0440FF01606A42661B3AF6BAF046F012549CC9BA34EBC80A9"
Basically, I want to search the string for videoplayback?id= and copy only what sits between videoplayback?id= and the next &:
271de9756065677e
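Then I want to continue through the string and grab the signature the same way.
Can anyone help me understand the logic, with an example of how to do this?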
Answer 0 (score: 1)
Since your "string containing junk data" is actually a URL, you should use the URL class. Take a look at the tutorial Parsing a URL.
import java.net.URL;

public class ParseURL {
    public static void main(String[] args) throws Exception {
        // The stray double quote after the expire value is part of the original string, so it is escaped here.
        String url = "http://v20.lscache8.c.youtube.com/videoplayback?id=271de9756065677e&itag=17&ip=0.0.0.0&ipbits=0&expire=999999999999999999\"&sparams=ip,ipbits,expireip,ipbits,expire,id,itag&signature=3DCD3F79E045F95B6AF661765F046FB0440FF01606A42661B3AF6BAF046F012549CC9BA34EBC80A9";
        URL aURL = new URL(url);
        System.out.println("protocol = " + aURL.getProtocol());
        System.out.println("authority = " + aURL.getAuthority());
        System.out.println("host = " + aURL.getHost());
        System.out.println("port = " + aURL.getPort());
        System.out.println("path = " + aURL.getPath());
        System.out.println("query = " + aURL.getQuery());
    }
}
The output should be (getPort() returns -1 because the URL does not specify a port):
protocol = http
authority = v20.lscache8.c.youtube.com
host = v20.lscache8.c.youtube.com
port = -1
path = /videoplayback
query = id=271de9756065677e&itag=17&ip=0.0.0.0&ipbits=0&expire=999999999999999999"&sparams=ip,ipbits,expireip,ipbits,expire,id,itag&signature=3DCD3F79E045F95B6AF661765F046FB0440FF01606A42661B3AF6BAF046F012549CC9BA34EBC80A9
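The query string is just a series of `name=value` pairs separated by `&`, so it can also be split by hand. The following is a minimal sketch using only the standard library, assuming Java 10 or later (for `URLDecoder.decode(String, Charset)`); the class name `QuerySplit` and the helper `parseQuery` are hypothetical names chosen for illustration, and repeated parameter names simply keep the last value.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

class QuerySplit {
    // Hypothetical helper: splits "a=1&b=2" into an insertion-ordered map.
    // If a name appears more than once, the last value wins.
    static Map<String, String> parseQuery(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        if (query == null || query.isEmpty()) {
            return params;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            // A pair without '=' is treated as a name with an empty value.
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                       URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return params;
    }

    public static void main(String[] args) {
        // The query string from the question, stray double quote included.
        String query = "id=271de9756065677e&itag=17&ip=0.0.0.0&ipbits=0&expire=999999999999999999\"&sparams=ip,ipbits,expireip,ipbits,expire,id,itag&signature=3DCD3F79E045F95B6AF661765F046FB0440FF01606A42661B3AF6BAF046F012549CC9BA34EBC80A9";
        Map<String, String> params = parseQuery(query);
        System.out.println("id = " + params.get("id"));
        System.out.println("signature = " + params.get("signature"));
    }
}
```

Looking values up by name this way avoids depending on the order in which the parameters appear in the URL.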
To parse the query into name/value pairs, use URLEncodedUtils from Apache HttpClient:
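// Requires Apache HttpClient 4.2+ (org.apache.http.client.utils.URLEncodedUtils, org.apache.http.NameValuePair).
String url = "http://v20.lscache8.c.youtube.com/videoplayback?id=271de9756065677e&itag=17&ip=0.0.0.0&ipbits=0&expire=999999999999999999\"&sparams=ip,ipbits,expireip,ipbits,expire,id,itag&signature=3DCD3F79E045F95B6AF661765F046FB0440FF01606A42661B3AF6BAF046F012549CC9BA34EBC80A9";
// new URI(url) rejects the stray double quote with a URISyntaxException, so pass the raw query string instead.
String query = new URL(url).getQuery();
List<NameValuePair> params = URLEncodedUtils.parse(query, StandardCharsets.UTF_8);
for (NameValuePair param : params) {
    System.out.println(param.getName() + "=" + param.getValue());
}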
The output should be:
id=271de9756065677e
itag=17
ip=0.0.0.0
ipbits=0
expire=999999999999999999"
sparams=ip,ipbits,expireip,ipbits,expire,id,itag
signature=3DCD3F79E045F95B6AF661765F046FB0440FF01606A42661B3AF6BAF046F012549CC9BA34EBC80A9
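If only one or two values are needed, the logic described in the question (find `id=` and copy everything up to the next `&`) can be expressed directly as a regular expression. This is a rough sketch rather than a replacement for proper URL parsing: the class name `ParamGrab` and the helper `grab` are hypothetical, and the value is returned raw, without percent-decoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParamGrab {
    // Hypothetical helper: returns the raw (undecoded) value of one query
    // parameter, or null if the parameter is not present.
    static String grab(String url, String name) {
        // The name must directly follow '?' or '&'; the value runs up to
        // the next '&' or '#', or to the end of the string.
        Pattern p = Pattern.compile("[?&]" + Pattern.quote(name) + "=([^&#]*)");
        Matcher m = p.matcher(url);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The URL from the question, stray double quote included.
        String url = "http://v20.lscache8.c.youtube.com/videoplayback?id=271de9756065677e&itag=17&ip=0.0.0.0&ipbits=0&expire=999999999999999999\"&sparams=ip,ipbits,expireip,ipbits,expire,id,itag&signature=3DCD3F79E045F95B6AF661765F046FB0440FF01606A42661B3AF6BAF046F012549CC9BA34EBC80A9";
        System.out.println(grab(url, "id"));
        System.out.println(grab(url, "signature"));
    }
}
```

Anchoring the name on `?` or `&` keeps `ip` from matching inside `ipbits`, and keeps `id` from matching the `id` listed inside the `sparams` value.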