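I have an EditText for entering a URL, a button to launch the HTML parsing, and another EditText for displaying the result. How can I use jsoup to extract from the page source only the links that end with .mp4?
This is my profile link: https://www.instagram.com/p/BEZcgC8/
There are two identical mp4 links...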
<meta property="og:video" content="http://igcdn-videos-h-8-a.akamaihd.net/hphotos-ak-xal1/t50.2886-16/13053343_16890256565548_842608422_n.mp4" />
<meta property="og:video:secure_url" content="https://igcdn-videos-h-8-a.akamaihd.net/hphotos-ak-xal1/t50.2886-16/13053343_16890911255689848_842608422_n.mp4" />
<meta property="og:video:type" content="video/mp4" />
I want a result like this in the EditText: https://example.com/ringuser.mp4
XML layout: activity_main.xml
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
xmlns:tools="http://schemas.android.com/tools"
android:layout_width="match_parent"
android:layout_height="match_parent"
android:paddingLeft="@dimen/activity_horizontal_margin"
android:paddingRight="@dimen/activity_horizontal_margin"
android:paddingTop="@dimen/activity_vertical_margin"
android:paddingBottom="@dimen/activity_vertical_margin"
tools:context="com.survivingwithandroid.jsoup.MainActivity">
<TextView android:id="@+id/txt1"
android:text="@string/app_name"
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:layout_centerHorizontal="true"
style="@android:style/TextAppearance.Large"/>
<TextView
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:layout_below="@id/txt1"
android:layout_marginTop="20dp"
android:text="Website URL"
android:id="@+id/txt2"/>
<EditText
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:layout_below="@id/txt2"
android:ems="15"
android:id="@+id/edtURL"/>
<Button
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:layout_below="@id/edtURL"
android:layout_centerHorizontal="true"
android:text="Get data!"
android:layout_marginTop="15dp"
android:id="@+id/btnGo"/>
<TextView
android:layout_width="wrap_content"
android:layout_height="wrap_content"
android:layout_below="@id/btnGo"
android:layout_marginTop="10dp"
android:text="Result data"
android:id="@+id/txt3"/>
<EditText
android:id="@+id/edtResp"
android:layout_width="match_parent"
android:layout_height="wrap_content"
android:layout_below="@+id/txt3"
android:inputType="textMultiLine"
android:lines="6"
android:editable="false"
android:layout_marginTop="10dp"/>
</RelativeLayout>
logcat:
09-12 19:38:35.148: D/NativeCrypto(22237): ssl=0x52f3ceb0 sslRead
buf=0x41837fd0 len=174,timeo=3000
09-12 19:38:35.149: D/NativeCrypto(22237): Doing SSL_Read()
ssl=0x52f3ceb0, appData=0x52f0eec0
09-12 19:38:35.149: D/NativeCrypto(22237): Returned from SSL_Read()
with result 174, error code 0
ssl=0x52f3ceb0, appData=0x52f0eec0
09-12 19:38:35.233: D/dalvikvm(22237): GC_FOR_ALLOC freed 849K, 25% free
3278K/4324K, paused 12ms,
total 12ms
09-12 19:38:35.309: D/NativeCrypto(22237): NativeCrypto_EVP_VerifyInit
ctx=0x52f2ec48
09-12 19:38:35.309: D/NativeCrypto(22237): NativeCrypto_EVP_VerifyInit
algorithmChars=RSA-SHA1
09-12 19:38:35.393: D/dalvikvm(22237): GC_FOR_ALLOC freed 856K, 21% free
3960K/5012K, paused 15ms,
total 15ms
09-12 19:38:35.551: D/MyTag(22237): Final links
09-12 19:38:36.477: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:2.68,
dur:1491.40, max:498.01, min:102.51
09-12 19:38:37.487: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.98,
dur:1009.32, max:512.27, min:497.04
09-12 19:38:38.978: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:2.01,
dur:1491.22, max:497.44, min:496.74
09-12 19:38:39.987: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.98,
dur:1009.04, max:512.26, min:496.77
09-12 19:38:41.494: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.99,
dur:1507.38, max:513.62, min:495.68
09-12 19:38:42.984: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:2.01,
dur:1489.89, max:497.03, min:496.14
09-12 19:38:43.994: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.98,
dur:1009.40, max:513.22, min:496.18
09-12 19:38:45.500: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.99,
dur:1506.13, max:512.60, min:496.73
09-12 19:38:46.990: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:2.01,
dur:1490.53, max:497.36, min:496.37
09-12 19:38:48.000: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.98,
dur:1009.87, max:513.21, min:496.66
09-12 19:38:49.009: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.98,
dur:1008.96, max:512.33, min:496.63
09-12 19:38:50.500: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:2.01,
dur:1491.21, max:497.98, min:495.79
09-12 19:38:51.510: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.98,
dur:1009.38, max:512.70, min:496.68
09-12 19:38:53.016: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.99,
dur:1505.97, max:512.45, min:496.70
09-12 19:38:55.516: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.98,
dur:1009.17, max:512.00, min:497.17
09-12 19:38:34.789: D/NativeCrypto(22237): Doing SSL_Read()
ssl=0x52f3ceb0, appData=0x52f0eec0
09-12 19:38:34.789: D/NativeCrypto(22237): Returned from SSL_Read() with
result 1, error code 0
ssl=0x52f3ceb0, appData=0x52f0eec0
09-12 19:38:57.022: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.99,
dur:1505.99, max:509.86, min:497.09
09-12 19:38:58.512: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:2.01,
dur:1490.55, max:498.20, min:495.64
09-12 19:38:59.523: I/SurfaceTextureClient(22237): [STC::queueBuffer]
(this:0x504d5528) fps:1.98,
dur:1010.26, max:513.02, min:497.24
09-12 19:39:00.440: D/OpenGLRenderer(22237): Flushing caches (mode 0)
09-12 19:39:00.471: D/InputMethodManager(22237): deactivate the
inputconnection in
ControlledInputConnectionWrapper.
09-12 19:39:00.497: D/OpenGLRenderer(22237): Flushing caches (mode 0)
09-12 19:39:00.682: D/dalvikvm(22237): GC_FOR_ALLOC freed 1317K, 27%
free 4178K/5692K, paused 21ms,
total 21ms
09-12 19:39:00.716: V/PhoneWindow(22237): DecorView setVisiblity:
visibility = 4
09-12 19:39:00.720: V/PhoneWindow(22237): DecorView setVisiblity:
visibility = 0
09-12 19:39:00.721: W/IInputConnectionWrapper(22237): showStatusIcon on
inactive InputConnection
09-12 19:39:00.724: V/InputMethodManager(22237): Not IME target window,
ignoring
09-12 19:39:00.783: V/InputMethodManager(22237): onWindowFocus:
android.widget.EditText{41a17428
VFED..CL .F....ID 24,127-504,218 #7f09003e app:id/edtURL}
softInputMode=288 first=true flags=#1810100
New Java code from Davide Pastore, but it shows no result when the button is pressed...
public class MainActivity extends ActionBarActivity {
private EditText respText;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
final EditText edtUrl = (EditText) findViewById(R.id.edtURL);
Button btnGo = (Button) findViewById(R.id.btnGo);
respText = (EditText) findViewById(R.id.edtResp);
btnGo.setOnClickListener(new View.OnClickListener() {
@Override
public void onClick(View view) {
String siteUrl = edtUrl.getText().toString();
( new ParseURL() ).execute(new String[]{siteUrl});
}
});
}
@Override
public boolean onCreateOptionsMenu(Menu menu) {
// Inflate the menu; this adds items to the action bar if it is present.
//getMenuInflater().inflate(R.menu.main, menu);
return true;
}
@Override
public boolean onOptionsItemSelected(MenuItem item) {
// Handle action bar item clicks here. The action bar will
// automatically handle clicks on the Home/Up button, so long
// as you specify a parent activity in AndroidManifest.xml.
int id = item.getItemId();
if (id == R.id.action_settings) {
return true;
}
return super.onOptionsItemSelected(item);
}
private class ParseURL extends AsyncTask<String, Void, String> {
private String finalLinks;
@Override
protected String doInBackground(String... strings) {
StringBuffer buffer = new StringBuffer();
try {
Document doc = Jsoup.connect(strings[0]).get();
Elements mp4Links = doc.select("a[href$=.mp4]");
List<String> links = new ArrayList<String>();
for (Element mp4Link : mp4Links) {
String absHref = mp4Link.attr("abs:href");
links.add(absHref);
}
finalLinks = "";
for (String link : links) {
finalLinks += link + "\n";
}
Log.d("MyTag", "Final links " + finalLinks);
}
catch(Throwable t) {
t.printStackTrace();
}
return buffer.toString();
}
@Override
protected void onPreExecute() {
super.onPreExecute();
}
@Override
protected void onPostExecute(String s) {
super.onPostExecute(s);
respText.setText(finalLinks);
}
}
}
Old Java code:
public class MainActivity extends Activity {
// URL Address
String url = "http://www.androidbegin.com";
ProgressDialog mProgressDialog;
@Override
public void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
// Locate the Buttons in activity_main.xml
Button titlebutton = (Button) findViewById(R.id.titlebutton);
Button descbutton = (Button) findViewById(R.id.descbutton);
Button logobutton = (Button) findViewById(R.id.logobutton);
// Capture button click
titlebutton.setOnClickListener(new OnClickListener() {
public void onClick(View arg0) {
// Execute Title AsyncTask
new Title().execute();
}
});
// Capture button click
descbutton.setOnClickListener(new OnClickListener() {
public void onClick(View arg0) {
// Execute Description AsyncTask
new Description().execute();
}
});
// Capture button click
logobutton.setOnClickListener(new OnClickListener() {
public void onClick(View arg0) {
// Execute Logo AsyncTask
new Logo().execute();
}
});
}
// Title AsyncTask
private class Title extends AsyncTask<Void, Void, Void> {
String title;
@Override
protected void onPreExecute() {
super.onPreExecute();
mProgressDialog = new ProgressDialog(MainActivity.this);
mProgressDialog.setTitle("Android Basic JSoup Tutorial");
mProgressDialog.setMessage("Loading...");
mProgressDialog.setIndeterminate(false);
mProgressDialog.show();
}
@Override
protected Void doInBackground(Void... params) {
try {
// Connect to the web site
Document document = Jsoup.connect(url).get();
// Get the html document title
title = document.title();
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
@Override
protected void onPostExecute(Void result) {
// Set title into TextView
TextView txttitle = (TextView) findViewById(R.id.titletxt);
txttitle.setText(title);
mProgressDialog.dismiss();
}
}
// Description AsyncTask
private class Description extends AsyncTask<Void, Void, Void> {
String desc;
@Override
protected void onPreExecute() {
super.onPreExecute();
mProgressDialog = new ProgressDialog(MainActivity.this);
mProgressDialog.setTitle("Android Basic JSoup Tutorial");
mProgressDialog.setMessage("Loading...");
mProgressDialog.setIndeterminate(false);
mProgressDialog.show();
}
@Override
protected Void doInBackground(Void... params) {
try {
// Connect to the web site
Document document = Jsoup.connect(url).get();
// Using Elements to get the Meta data
Elements description = document
.select("meta[name=description]");
// Locate the content attribute
desc = description.attr("content");
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
@Override
protected void onPostExecute(Void result) {
// Set description into TextView
TextView txtdesc = (TextView) findViewById(R.id.desctxt);
txtdesc.setText(desc);
mProgressDialog.dismiss();
}
}
// Logo AsyncTask
private class Logo extends AsyncTask<Void, Void, Void> {
Bitmap bitmap;
@Override
protected void onPreExecute() {
super.onPreExecute();
mProgressDialog = new ProgressDialog(MainActivity.this);
mProgressDialog.setTitle("Android Basic JSoup Tutorial");
mProgressDialog.setMessage("Loading...");
mProgressDialog.setIndeterminate(false);
mProgressDialog.show();
}
@Override
protected Void doInBackground(Void... params) {
try {
// Connect to the web site
Document document = Jsoup.connect(url).get();
// Using Elements to get the class data
Elements img = document.select("a[class=brand brand-image] img[src]");
// Locate the src attribute
String imgSrc = img.attr("src");
// Download image from URL
InputStream input = new java.net.URL(imgSrc).openStream();
// Decode Bitmap
bitmap = BitmapFactory.decodeStream(input);
} catch (IOException e) {
e.printStackTrace();
}
return null;
}
@Override
protected void onPostExecute(Void result) {
// Set downloaded image into ImageView
ImageView logoimg = (ImageView) findViewById(R.id.logo);
logoimg.setImageBitmap(bitmap);
mProgressDialog.dismiss();
}
}
}
Answer 0 (score: 1)
Just edit the CSS query to get what you want.
public class MainActivity extends AppCompatActivity {
private EditText respText;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
final EditText edtUrl = (EditText) findViewById(R.id.edtURL);
Button btnGo = (Button) findViewById(R.id.btnGo);
respText = (EditText) findViewById(R.id.edtResp);
btnGo.setOnClickListener(new View.OnClickListener() {
@Override
public void onClick(View view) {
String siteUrl = edtUrl.getText().toString();
( new ParseURL() ).execute(new String[]{siteUrl});
}
});
}
@Override
public boolean onCreateOptionsMenu(Menu menu) {
// Inflate the menu; this adds items to the action bar if it is present.
//getMenuInflater().inflate(R.menu.main, menu);
return true;
}
@Override
public boolean onOptionsItemSelected(MenuItem item) {
// Handle action bar item clicks here. The action bar will
// automatically handle clicks on the Home/Up button, so long
// as you specify a parent activity in AndroidManifest.xml.
int id = item.getItemId();
if (id == R.id.action_settings) {
return true;
}
return super.onOptionsItemSelected(item);
}
private class ParseURL extends AsyncTask<String, Void, String> {
private String finalLinks;
@Override
protected String doInBackground(String... strings) {
StringBuffer buffer = new StringBuffer();
try {
Document doc = Jsoup.connect(strings[0]).get();
Elements mp4Links = doc.select("meta[content$=.mp4]");
List<String> links = new ArrayList<String>();
for (Element mp4Link : mp4Links) {
String absHref = mp4Link.attr("content");
links.add(absHref);
}
finalLinks = "";
for (String link : links) {
finalLinks += link + "\n";
}
Log.d("MyTag", "Final links " + finalLinks);
}
catch(Throwable t) {
t.printStackTrace();
}
return buffer.toString();
}
@Override
protected void onPreExecute() {
super.onPreExecute();
}
@Override
protected void onPostExecute(String s) {
super.onPostExecute(s);
respText.setText(finalLinks);
}
}
}
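The `meta[content$=.mp4]` query above matches both the `og:video` and the `og:video:secure_url` tag from the question, so `links` ends up with two entries. If only one URL should reach the EditText, a small helper could choose one. This is a minimal sketch, assuming the https variant is the preferred one; `Mp4LinkPicker` and its methods are hypothetical names, not part of jsoup or of the code above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical helper; the class and method names are illustrative only.
class Mp4LinkPicker {

    // Drops exact duplicates while keeping the original order.
    static List<String> dedupe(List<String> links) {
        return new ArrayList<String>(new LinkedHashSet<String>(links));
    }

    // Returns the first https link, else the first link, else an empty string.
    static String pickPreferred(List<String> links) {
        for (String link : links) {
            if (link.startsWith("https://")) {
                return link;
            }
        }
        return links.isEmpty() ? "" : links.get(0);
    }
}
```

In `doInBackground`, something like `finalLinks = Mp4LinkPicker.pickPreferred(links);` would then replace the loop that concatenates every link.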
A complete example could be:
public class MainActivity extends AppCompatActivity {
private EditText respText;
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.activity_main);
final EditText edtUrl = (EditText) findViewById(R.id.edtURL);
Button btnGo = (Button) findViewById(R.id.btnGo);
respText = (EditText) findViewById(R.id.edtResp);
btnGo.setOnClickListener(new View.OnClickListener() {
@Override
public void onClick(View view) {
String siteUrl = edtUrl.getText().toString();
( new ParseURL() ).execute(new String[]{siteUrl});
}
});
}
@Override
public boolean onCreateOptionsMenu(Menu menu) {
// Inflate the menu; this adds items to the action bar if it is present.
//getMenuInflater().inflate(R.menu.main, menu);
return true;
}
@Override
public boolean onOptionsItemSelected(MenuItem item) {
// Handle action bar item clicks here. The action bar will
// automatically handle clicks on the Home/Up button, so long
// as you specify a parent activity in AndroidManifest.xml.
int id = item.getItemId();
if (id == R.id.action_settings) {
return true;
}
return super.onOptionsItemSelected(item);
}
private class ParseURL extends AsyncTask<String, Void, String> {
private String finalLinks;
@Override
protected String doInBackground(String... strings) {
StringBuffer buffer = new StringBuffer();
try {
Document doc = Jsoup.connect(strings[0]).get();
Elements mp4Links = doc.select("a[href$=.mp4],meta[property=og:video],meta[property=og:video:secure_url]");
List<String> links = new ArrayList<String>();
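for (Element mp4Link : mp4Links) {
// <a> elements carry the URL in href, <meta> elements carry it in content
String link = mp4Link.hasAttr("href") ? mp4Link.attr("abs:href") : mp4Link.attr("content");
links.add(link);
}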
finalLinks = "";
for (String link : links) {
finalLinks += link + "\n";
}
Log.d("MyTag", "Final links " + finalLinks);
}
catch(Throwable t) {
t.printStackTrace();
}
return buffer.toString();
}
@Override
protected void onPreExecute() {
super.onPreExecute();
}
@Override
protected void onPostExecute(String s) {
super.onPostExecute(s);
respText.setText(finalLinks);
}
}
}
Suppose you have HTML like this:
<html>
<head>
<title>Try jsoup</title>
</head>
<body>
<p>This is <a href="http://jsoup.org/">jsoup</a>.</p>
<a href="https://example.com/ringuser.mp4">mp4 1</a>
<a href="https://example.com/ringuser_2.mp4">mp4 2</a>
<a href="https://example.com/ringuser_3.mp4">mp4 3</a>
<a href="https://example.com/ringuser_4.mp4">mp4 4</a>
<a href="other.html">ciao</a>
</body>
</html>
You can retrieve all links ending in .mp4 with the following code:
Elements mp4Links = doc.select("a[href$=.mp4]");
List<String> links = new ArrayList<String>();
for (Element mp4Link : mp4Links) {
String absHref = mp4Link.attr("abs:href");
links.add(absHref);
}
//Do your magic with the links List...
links will contain:
https://example.com/ringuser.mp4
https://example.com/ringuser_2.mp4
https://example.com/ringuser_3.mp4
https://example.com/ringuser_4.mp4
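One caveat: `a[href$=.mp4]` compares the raw attribute value, so an href such as `ringuser.mp4?token=abc` would not match because the value no longer ends in `.mp4`. If that matters, one option is to select all `a[href]` elements and filter on the path part of each URL instead. The sketch below uses only `java.net.URI`; `ExtensionFilter` and `hasExtension` are hypothetical names introduced for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper; not part of jsoup.
class ExtensionFilter {

    // True when the path component of the URL ends with the given extension,
    // ignoring case and any query string or fragment.
    static boolean hasExtension(String url, String extension) {
        try {
            String path = new URI(url).getPath();
            return path != null
                    && path.toLowerCase(Locale.ROOT).endsWith(extension.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            // Treat unparsable URLs as non-matching rather than failing.
            return false;
        }
    }
}
```

Inside the loop, a check such as `if (ExtensionFilter.hasExtension(absHref, ".mp4"))` would then decide whether the link is added to the list.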
Answer 1 (score: 0)
To filter HTML hyperlinks with a specific file extension using jsoup, take a look at the following code:
Java code
String url = "http://www.sample-videos.com/";
Map<String, String> fileMap = new HashMap<String, String>();
String extensionFilter = ".mp4";
try {
Document doc = Jsoup.connect(url)
.userAgent("Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36")
.get();
url = url.endsWith("/") ? url : url + "/";
String fileUrl = null;
for (Element result : doc.select("a")) { // select all <a> tags
if ((fileUrl = result.attr("href")).contains(extensionFilter)) { // filter <a> tags with defined extension
if (!fileUrl.startsWith("http")) {
fileUrl = fileUrl.startsWith("/") ? url + fileUrl.substring(1) : url + fileUrl; // drop only the leading slash, keep the rest of the path
}
fileMap.put(fileUrl, fileUrl.substring(fileUrl.lastIndexOf("/")+1));
}
}
for (String file : fileMap.keySet()) { // do something useful with the extracted mp4 urls, e.g. downloading the files
System.out.println(fileMap.get(file) + "->" + file);
org.apache.commons.io.FileUtils.copyURLToFile(new URL(file), new File(fileMap.get(file)));
}
} catch (IOException e) {
e.printStackTrace();
}
Output
big_buck_bunny_240p_2mb.mp4->http://www.sample-videos.com/video/mp4/240/big_buck_bunny_240p_2mb.mp4
big_buck_bunny_720p_2mb.mp4->http://www.sample-videos.com/video/mp4/720/big_buck_bunny_720p_2mb.mp4
big_buck_bunny_480p_2mb.mp4->http://www.sample-videos.com/video/mp4/480/big_buck_bunny_480p_2mb.mp4
big_buck_bunny_720p_10mb.mp4->http://www.sample-videos.com/video/mp4/720/big_buck_bunny_720p_10mb.mp4
...
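The manual string concatenation used above to turn relative hrefs into absolute URLs can also be delegated to `java.net.URI.resolve`, which handles root-relative, relative, and already absolute links in one call (jsoup's `abs:href` attribute prefix, used in the first answer, performs this kind of resolution as well). A minimal sketch, assuming the base URL has a path component such as the trailing slash that the code above already appends; `LinkResolver` is a hypothetical name.

```java
import java.net.URI;

// Hypothetical helper; the name is illustrative only.
class LinkResolver {

    // Resolves href against baseUrl; absolute hrefs are returned unchanged.
    // Throws IllegalArgumentException if either string is not a valid URI.
    static String resolve(String baseUrl, String href) {
        return URI.create(baseUrl).resolve(href).toString();
    }
}
```

In the loop, `fileUrl = LinkResolver.resolve(url, fileUrl);` would take the place of the `startsWith("http")` branch.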