JSoup "wrap" does not work as expected every time

Time: 2014-10-31 06:53:04

Tags: java android image jsoup

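I have an HTML string that contains text, images, or map images. The HTML is generated dynamically. Wherever there is an <img> tag, it should be wrapped in a <center> tag. To achieve this I am using JSoup, and it works for static images and text.

But whenever I try to post a map, the HTML loses its structure. I don't understand what is happening. Both the normal images and the map images are <img> tags, so how can the same method produce different output?

Here is the method that does the work: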

public String wrapImgWithCenter(String html){
    Document doc = Jsoup.parse(html);
    doc.select("img").wrap("<center></center>");
    return doc.html();
}

Original HTML, containing an image and map images, before wrapping:

<p dir="ltr"><img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-4de5af73-68f1-401d-9feb-1dfde1373cff-file" /></p> 
<p dir="ltr"><a href="15.2993265,74.123996"><img src="http://maps.google.com/maps/api/staticmap?center=15.2993265,74.123996&zoom=15&size=960x540&sensor=false&markers=color:blue%7Clabel:!%7C15.2993265,74.123996" /></a><br /><br /></p> 
<p dir="ltr"><a href="22.572646,88.363895,-25.274398,133.775136"><img src="http://maps.google.com/maps/api/staticmap?center=22.572646,88.363895&zoom=2&size=960x540&markers=22.572646,88.363895%7C-25.274398,133.775136&path=color:0xff0000ff%7Cweight:5%7C22.572646,88.363895%7C-25.274398,133.775136&sensor=false" /></a><br /> </p>

Result after wrapping:

<p dir="ltr"> </p>
<center> 
 <img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-01516245-c773-4765-b542-ebecb964b255-file" /> 
</center>
<br />
<br />
<p></p> 
<p dir="ltr"> </p>
<center> 
 <a href="15.2993265,74.123996"><img src="http://maps.google.com/maps/api/staticmap?center=15.2993265,74.123996&zoom=15&size=960x540&sensor=false&markers=color:blue%7Clabel:!%7C15.2993265,74.123996" /></a> 
</center>
<p></p> 
<p dir="ltr"> </p>
<center> 
 <a href="22.572646,88.363895,-25.274398,133.775136"><img src="http://maps.google.com/maps/api/staticmap?center=22.572646,88.363895&zoom=2&size=960x540&markers=22.572646,88.363895%7C-25.274398,133.775136&path=color:0xff0000ff%7Cweight:5%7C22.572646,88.363895%7C-25.274398,133.775136&sensor=false" /></a> 
</center>
<br /> 
<p></p>

For comparison, valid output containing only images:

<html>
 <head></head>
 <body>
  <p dir="ltr">
   <center>
    <img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-959467a6-f83f-44c6-b6fc-88ba4f49d900-file" />
   </center><br /></p> 
  <p dir="ltr">
   <center>
    <img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-46c38c96-c3b5-402e-a0b4-03209adf5203-file" />
   </center><br /></p> 
  <p dir="ltr">
   <center>
    <img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-626ec909-c65e-452c-a341-61a361584eba-file" />
   </center><br /> </p> 
 </body>
</html>

Valid output containing text and an image:

<html>
 <head></head>
 <body>
  <p dir="ltr">text </p> 
  <p dir="ltr">
   <center>
    <img src="http://files.parsetfss.com/bcff7108-cbce-4ab8-b5d1-1f82827e6519/tfss-11343a01-7cd2-4f9e-9f9a-025ec3feb828-file" />
   </center><br /> </p> 
 </body>
</html>

++++++++++++++++++++++++++++++++++++

The method in the class that is responsible for the above:

private void createHtmlWeb(){

        String listOfElements = "null"; // normally populated if
                                        // webText contains maps.google.com
        Toast.makeText(getApplicationContext(), "" + mainEditText.getHeight(), Toast.LENGTH_SHORT).show();
        ParseObject postObject = new ParseObject("Post");
        Spannable s = mainEditText.getText();
        String webText = Html.toHtml(s);
        webText = webText.replaceAll("(</?(?:b|i|u)>)\\1+", "$1").replaceAll("</(b|i|u)><\\1>", "");
        // refactoring html
        webText = wrapImgWithCenter(webText);
        // Determine link and favourite types to add favourite a class around
        // it.
        if (webText.contains("a href")) {
            String favourite = "favourite";
            // Parse it into jsoup
            Document doc = Jsoup.parse(webText);
            // Create an array to handle every element individually, as wrap
            // can otherwise affect the whole body.
            Element[] array = new Element[doc.select("a").size()];

            for (int i = 0; i < doc.select("a").size(); i++) {
                if (doc.select("a").get(i) != null) {
                    array[i] = doc.select("a").get(i);
                }
            }

            for (int i = 0; i < array.length; i++) {
                // We don't want to wrap real links. What links have in common
                // is "http". Should be updated to something more robust.
                if (array[i].toString().contains("http") == false) {
                    array[i] = array[i].wrap("<a class=" + favourite + "></a>");
                }

            }
            // Log.e("From doc.body html *************** ", " " + doc.body());
            Element element = doc.body();
            Log.e("From element html *************** ", " " + element.html());
            listOfElements = element.html();
        }

        // First check whether the HTML contains a Google Maps image
        if (webText.contains("maps.google.com")) {
            Document doc = Jsoup.parse(webText); // Parse it into jsoup

            for (int i = 0; i < doc.select("img").size(); i++) {
                if (doc.select("img").get(i).toString().contains("maps.google.com")) {
                    // Get all numbers + full stops + get all numbers
                    Pattern noImage = Pattern.compile("(\\-?\\d+(\\.\\d+)?),(\\-?\\d+(\\.\\d+))+%7C(\\-?\\d+(\\.\\d+)?),(\\-?\\d+(\\.\\d+))");
                    // Gets the URL SRC basically.. almost.. lets try it
                    Matcher matcherer = noImage.matcher(doc.select("img").get(i).toString());

                    // Have two options - multi route or single route
                    if (matcherer.find() == true) {
                        for (int j = 0; j < matcherer.groupCount(); j++) {
                            latitude_to = Double.parseDouble(matcherer.group(1));
                            longitude_to = Double.parseDouble(matcherer.group(3));
                            latitude_from = Double.parseDouble(matcherer.group(5));
                            longitude_from = Double.parseDouble(matcherer.group(7));
                        }

                        String coOrds = "" + latitude_to + "," + longitude_to + "," + latitude_from + "," + longitude_from;
                        Element ele = doc.body();
                        ele.select("img").get(i).wrap("<a href=" + coOrds + "></a>");
                        listOfElements = ele.html();
                        listOfElements = listOfElements.replace("&amp;", "&");

                    } else if (matcherer.find() == false) {
                        noImage = Pattern.compile("(\\-?\\d+(\\.\\d+)?),\\s*(\\-?\\d+(\\.\\d+)?)");
                        matcherer = noImage.matcher(doc.select("img").get(i).toString());

                        Toast.makeText(getApplicationContext(), "Regex Count:" + matcherer.groupCount(), Toast.LENGTH_LONG).show();
                        if (matcherer.find()) {
                            for (int j = 0; j < matcherer.groupCount(); j++) {
                                latitude = Double.parseDouble(matcherer.group(1));
                                parseGeoPoint.setLatitude(latitude);
                                longitude = Double.parseDouble(matcherer.group(3));
                                parseGeoPoint.setLongitude(longitude);
                            }
                        }

                        String coOrds = "" + latitude + "," + longitude;

                        Element ele = doc.body();
                        ele.select("img").get(i).wrap("<a href=" + coOrds + "></a>");
                        listOfElements = ele.html();
                        listOfElements = listOfElements.replace("&amp;", "&");

                    }

                } else {
                    // standard photo
                    Element ele = doc.body();
                    ele.select("img").get(i);
                    listOfElements = ele.html();

                }

            }
            // Put new value in htmlContent
            postObject.put("htmlContent", listOfElements);

        } else {
            postObject.put("htmlContent", webText);
        }

        mainEditText.getViewTreeObserver().addOnGlobalLayoutListener(new ViewTreeObserver.OnGlobalLayoutListener() {

            @Override
            public void onGlobalLayout(){
                // TODO Auto-generated method stub
                Rect r = new Rect();
                mainEditText.getWindowVisibleDisplayFrame(r);

                // int screenHeight = mainEditText.getRootView().getHeight();
                // int heightDifference = screenHeight - (r.bottom - r.top);
            }
        });

        // See if a trip exists
        if (finalTrip != null) {
        }

        // Want to put the location in the location section
        // if parsegeoPoint != null -- old information
        if (latitude != -10000 && longitude != -10000) {
            // Toast.makeText(getApplicationContext(),
            // "Adding in location co-ods: " + latitude + " : " + longitude ,
            // Toast.LENGTH_SHORT).show();
            postObject.put("location", parseGeoPoint);
        }
        postObject.put("type", Post.PostType.HTML.getPostVal());
        postObject.put("user", ParseObject.createWithoutData("_User", user.getObjectId()));

        // Transfer these details
        Intent i = new Intent(getApplicationContext(), WriteStoryAnimation.class);
        i.putExtra("listOfElements", listOfElements);
        i.putExtra("webText", webText);
        i.putExtra("finalTrip", finalTrip);
        i.putExtra("latitude", latitude);
        i.putExtra("longitude", longitude);

        if (mainEditText.length() > 0) {
            startActivity(i);
        } else {
            Toast.makeText(getApplicationContext(), "Your story is empty", Toast.LENGTH_SHORT).show();
        }

        // finish();
        // Toast.makeText(getApplicationContext(), "EditText Sie: " + height +
        // " : " + desiredHeight, Toast.LENGTH_LONG).show();

    }

    // method to refactor html
    public String wrapImgWithCenter(String html){
        Document doc = Jsoup.parse(html);
        // wrap every image in a center tag
        doc.select("img").wrap("<center></center>");
        // add a gap (two <br> tags) after the last p tag
        for (int i = 0; i <= 1; i++) {
            doc.select("p").last().after("<br>");
        }

        return doc.html();
    }
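
A side note on the coordinate parsing in createHtmlWeb(): Matcher.find() is called in both the if and the else if condition, and the loops over groupCount() repeat the same assignments. The single-pair case can be isolated into a small helper that calls find() once and branches on the result. This is only a sketch using java.util.regex; the class name MapCoordinates and the method firstPair() are made up for illustration and are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MapCoordinates {
    // One "lat,lng" pair. Non-capturing groups (?:...) keep the useful
    // groups at positions 1 and 2.
    private static final Pattern PAIR =
            Pattern.compile("(-?\\d+(?:\\.\\d+)?),\\s*(-?\\d+(?:\\.\\d+)?)");

    // Returns {latitude, longitude} for the first pair found in the text,
    // or null when the text contains no pair.
    static double[] firstPair(String text) {
        Matcher m = PAIR.matcher(text);
        if (!m.find()) { // find() is called exactly once
            return null;
        }
        return new double[] {
            Double.parseDouble(m.group(1)),
            Double.parseDouble(m.group(2))
        };
    }

    public static void main(String[] args) {
        String src = "http://maps.google.com/maps/api/staticmap"
                + "?center=15.2993265,74.123996&zoom=15";
        double[] pair = firstPair(src);
        System.out.println(pair[0] + "," + pair[1]);
    }
}
```

The two-point route case would need its own pattern, as in the original method; the helper above only covers the single-marker URL.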

1 answer:

Answer 0: (score: 2)

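I have solved the problem. Fonkap was right in his comment: something else was changing my output. I only changed the place where wrapImgWithCenter() is called.

I just changed the last part of the createHtmlWeb() method to this: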

Log.e("listOfElements", listOfElements);
            //refactoring html
            listOfElements = wrapImgWithCenter(listOfElements);
            // Put new value in htmlContent
            postObject.put("htmlContent", listOfElements);

        } else {
            //refactoring html
            webText = wrapImgWithCenter(webText);
            postObject.put("htmlContent", webText);
        }

Now the output is as required.
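
A likely explanation for why the call order matters, offered as a reading of the outputs above and not something stated in the original posts: <center> is a block-level element, and an HTML5-style parser such as Jsoup's implicitly closes an open <p> when it meets a <center> start tag. wrap() edits the DOM directly, so <p><center>...</center></p> can exist in memory. Once that tree is serialized and handed to Jsoup.parse() again, as createHtmlWeb() does for the link and map handling, the <center> ends up outside the <p> and the leftover </p> turns into an empty <p></p>. The sketch below illustrates this; the class name ReparseDemo, its helper method names and the a.png source are assumptions made for the example.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class ReparseDemo {
    // Same wrapping step as in the question: parse, wrap, serialize.
    static String wrapImgWithCenter(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("img").wrap("<center></center>");
        return doc.html();
    }

    // Counts <center> elements that are direct children of a <p>
    // in the DOM right after wrap(), before any serialization.
    static int centersInsidePInDom(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("img").wrap("<center></center>");
        return doc.select("p > center").size();
    }

    // Counts <center> elements that are direct children of a <p>
    // after the given HTML string has been parsed (again).
    static int centersInsidePAfterParse(String html) {
        return Jsoup.parse(html).select("p > center").size();
    }

    public static void main(String[] args) {
        String html = "<p dir=\"ltr\"><img src=\"a.png\"></p>";
        String wrapped = wrapImgWithCenter(html);

        // In the DOM the <center> sits inside the <p>.
        System.out.println(centersInsidePInDom(html));
        // After re-parsing the serialized string it no longer does.
        System.out.println(centersInsidePAfterParse(wrapped));
        System.out.println(Jsoup.parse(wrapped).body().html());
    }
}
```

Calling wrapImgWithCenter() as the very last step, as in the fix above, means nothing parses the wrapped string a second time, which is consistent with the output then being correct.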