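I have a URL that I want to parse in order to extract its parameters. My implementation is based on the following Stack Overflow post.
However, my URL is more complex than the one used in that post. It looks like this: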
https://example.com/cdscontent/login?initialURI=https%3A%2F%2Fexample.com%2Fdashboard%2F%3Fportal%3Dmyportal%26LO%3D4%26contentid%3D10007.786471%26viewmode%3Dcontent%26variant%3D%2Fmyportal%2F
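As you can see, it has a parameter initialURI, which is itself an (encoded) URL, and the order of the parameters inside it cannot be changed.
When I run org.apache.http.client.utils.URLEncodedUtils#parse, it returns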
[initialURI=https://example.com/dashboard/?portal=myportal, LO=4, contentid=10007.786471, viewmode=content, variant=/myportal/]
As you can see, it parses every parameter except portal, which is still attached to https://example.com/dashboard/. In other words, I expected this:
[initialURI=https://example.com/dashboard/, portal=myportal, LO=4, contentid=10007.786471, viewmode=content, variant=/myportal/]
Am I doing something wrong here, or can URLEncodedUtils#parse simply not handle this case?
Do you have any other suggestions?
Thanks a lot!
The unit test I tried:
public class UrlParserTest {
    @Test
    public void testParseUrl() throws UnsupportedEncodingException, URISyntaxException {
        String url =
            "https://www.example.com/cdscontent/login?initialURI=https%3A%2F%2Fwww.example.com%2Fdashboard%2F%3Fportal%3Dmyportal%26LO%3D4%26contentid%3D10007.786471%26viewmode%3Dcontent%26variant%3D%2Fmyportal%2F";
        String decoded = URLDecoder.decode(url, "UTF-8");
        List<NameValuePair> params = URLEncodedUtils.parse(new URI(decoded), "UTF-8");
        System.out.println(params);
    }
}
Answer 0 (score: 0)
You have the following URL (shown decoded):
https://www.example.com/cdscontent/login?initialURI=https://www.example.com/dashboard/?portal=myportal&LO=4&contentid=10007.786471&viewmode=content&variant=/myportal/
This URL consists of the main URL:
https://www.example.com/cdscontent/login
with one query parameter, initialURI:
https://www.example.com/dashboard/?portal=myportal&LO=4&contentid=10007.786471&viewmode=content&variant=/myportal/
That URL in turn has several query parameters (the ones you are looking for):
portal=myportal&LO=4&contentid=10007.786471&viewmode=content&variant=/myportal/
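To make the nesting concrete, here is a minimal sketch that counts the top-level pairs in the query string before and after decoding the whole URL. It uses only java.net from the JDK rather than URLEncodedUtils, and the class and method names (RawQueryDemo, countTopLevelPairs, decodeAll) are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;

final class RawQueryDemo {

    // Counts the '&'-separated pairs in the raw (not yet decoded) query string.
    static int countTopLevelPairs(String url) {
        String rawQuery = URI.create(url).getRawQuery();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return 0;
        }
        return rawQuery.split("&").length;
    }

    // Decodes every percent escape in the string, as the test in the question does.
    static String decodeAll(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String url = "https://www.example.com/cdscontent/login?initialURI="
            + "https%3A%2F%2Fwww.example.com%2Fdashboard%2F%3Fportal%3Dmyportal"
            + "%26LO%3D4%26contentid%3D10007.786471%26viewmode%3Dcontent"
            + "%26variant%3D%2Fmyportal%2F";

        // Encoded: the only top-level pair is initialURI=...
        System.out.println(countTopLevelPairs(url));            // prints 1
        // Decoded first: %26 has become '&', so the inner pairs look top-level.
        System.out.println(countTopLevelPairs(decodeAll(url))); // prints 5
    }
}
```

The second count is the situation in the question: once the whole string has been decoded, a parser can no longer tell which separators belonged to the inner URL.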
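Step 1:
First we have to get the URL held in the query parameter initialURI. Note that url here is the original, still-encoded string. Decoding the whole string before parsing, as the test in the question does, turns the encoded %26 and %3D into real separators, which is why the inner parameters ended up in the outer list: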
List<NameValuePair> params = URLEncodedUtils.parse(new URI(url), Charset.forName("UTF-8"));
// Find first NameValuePair where the name equals initialURI
Optional<NameValuePair> initialURI = params.stream()
    .filter(e -> e.getName().equals("initialURI"))
    .findFirst();
System.out.println(initialURI);
This prints:
Optional[initialURI=https://www.example.com/dashboard/?portal=myportal&LO=4&contentid=10007.786471&viewmode=content&variant=/myportal/]
Step 2:
Now we can get the query parameters of that URL and print them:
List<NameValuePair> initialParams = URLEncodedUtils
.parse(new URI(initialURI.get().getValue()), Charset.forName("UTF-8"));
System.out.println(initialParams);
The result is:
[portal=myportal, LO=4, contentid=10007.786471, viewmode=content, variant=/myportal/]
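This is not exactly what you expected, since you also wanted initialURI=https://example.com/dashboard/ in the list. But as you can see, that is not a query parameter on its own: the entire URL in initialURI (including its query parameters) is the value of the query parameter.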
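If you still want the inner base URL next to the inner parameters, the two steps can be combined roughly as follows. This is only a sketch under a few assumptions: it uses java.net.URI and URLDecoder from the JDK instead of URLEncodedUtils so that it runs without extra dependencies, the names NestedQueryParser, parseQuery and withoutQuery are hypothetical, and a repeated parameter name would overwrite the earlier value because the result is a Map.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

final class NestedQueryParser {

    // Splits the raw query on '&' and '=' first, then decodes each name and value.
    static Map<String, String> parseQuery(String url) {
        Map<String, String> params = new LinkedHashMap<>();
        String rawQuery = URI.create(url).getRawQuery();
        if (rawQuery == null || rawQuery.isEmpty()) {
            return params;
        }
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(decode(name), decode(value));
        }
        return params;
    }

    // Returns the part of the URL before the first '?'.
    static String withoutQuery(String url) {
        int q = url.indexOf('?');
        return q < 0 ? url : url.substring(0, q);
    }

    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        String url = "https://www.example.com/cdscontent/login?initialURI="
            + "https%3A%2F%2Fwww.example.com%2Fdashboard%2F%3Fportal%3Dmyportal"
            + "%26LO%3D4%26contentid%3D10007.786471%26viewmode%3Dcontent"
            + "%26variant%3D%2Fmyportal%2F";

        // Step 1: the outer URL has exactly one parameter, initialURI.
        String inner = parseQuery(url).get("initialURI");
        // Step 2: parse the inner URL on its own.
        System.out.println(withoutQuery(inner)); // https://www.example.com/dashboard/
        System.out.println(parseQuery(inner));
    }
}
```

The key design point is the same as in the steps above: split the still-encoded query first and decode each piece afterwards, once per nesting level.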