How to decode &lt;br&gt; in Java

Time: 2015-03-22 04:30:50

Tags: java android xml-parsing html-parsing

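On the site http://web.mta.info/status/serviceStatus.txt, some of the markup is encoded, for example &lt;br&gt;. I would like to know how to decode these tags back to their normal form so that I can parse and read them. The code below is what I have so far.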

String address = "http://web.mta.info/status/serviceStatus.txt";
XmlPullParserFactory pullParserFactory;
XmlPullParser parser;
HttpClient httpclient;
HttpGet httpget;
URI website;
HttpResponse response;
HttpEntity httpEntity;
InputStream xmlFile;    

//code that just initializes some other variables

private void updater() {
    // try catch to catch any exceptions thrown
    try {
        httpclient = new DefaultHttpClient();

        httpget = new HttpGet(address);
        response = httpclient.execute(httpget);
        httpEntity = response.getEntity();
        xmlFile = httpEntity.getContent();

        pullParserFactory = XmlPullParserFactory.newInstance();
        parser = pullParserFactory.newPullParser();

        parser.setFeature(XmlPullParser.FEATURE_PROCESS_NAMESPACES, false);
        parser.setInput(xmlFile, null);

        parseXML(parser);

    } catch (ClientProtocolException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    } catch (XmlPullParserException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}

parseXML basically walks through the file and finds the information I need.

1 Answer:

Answer 0: (score: 0)

Replace &lt; with < and &gt; with >.

&lt; stands for < and &gt; stands for >.

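These HTML entities are left in the code snippets on the site you mentioned mostly because of a bug; that is simply how the site escapes code snippets.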
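A minimal sketch of that replacement in plain Java, assuming only the five predefined XML entities need decoding. The class name EntityDecoder, the method name decodeBasicEntities, and the sample string are hypothetical and used for illustration only; they are not part of the question's code or the MTA feed.

```java
public class EntityDecoder {

    // Decodes the five predefined XML entities with plain string replacement.
    // "&amp;" is handled last so that input such as "&amp;lt;" becomes the
    // literal text "&lt;" instead of being decoded twice into "<".
    public static String decodeBasicEntities(String text) {
        return text.replace("&lt;", "<")
                   .replace("&gt;", ">")
                   .replace("&quot;", "\"")
                   .replace("&apos;", "'")
                   .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        // Hypothetical sample text, not taken from the actual feed.
        String encoded = "Service change&lt;br&gt;Trains are delayed";
        System.out.println(decodeBasicEntities(encoded));
    }
}
```

One way to use this would be to call decodeBasicEntities on each text value that parseXML extracts, before the decoded markup is parsed or displayed. Numeric and named HTML entities such as &nbsp; are not covered by this sketch.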