How to print only the displayed/visible (on-screen) text content of any website using Python Selenium

Time: 2019-05-06 05:29:04

Tags: javascript python selenium xpath

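I only want to print/get the visible text content of any website (what the user currently sees on screen).

I have tried several approaches, but each one returns all of the text on the page rather than the text I expect.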

from selenium import webdriver
from selenium.webdriver.common.by import By

options = webdriver.ChromeOptions()
driver = webdriver.Chrome(options=options)
url = "https://techcrunch.com/"
driver.get(url)
element = driver.find_element(By.XPATH, "//body")
print(driver.execute_script("return arguments[0].innerText", element))

Is there any way to get only the visible text?

Note: a pure JavaScript solution is also welcome.

1 answer:

Answer 0: (score: 0)

Get the body element, then use its .text attribute to get the text of that element.

Try this:

from selenium.webdriver.common.by import By

driver.get("https://techcrunch.com/")
element = driver.find_element(By.TAG_NAME, "body")
print(element.text)

If you suspect that some of the text in the result is not visible in the document, it still appears in the result because it is present in the page. If you select the page text and copy it, you will get the same result. You can even search for that text within the page.

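The reason you cannot see the text is that it is clipped using clip-path.

The clip-path CSS property creates a clipping region that sets what part of an element should be shown. Parts that are inside the region are shown, while those outside are hidden.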
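Building on the clip-path explanation, here is a minimal sketch of one way to approximate "only what is on screen". It is not part of the original answer. The names VISIBLE_TEXT_JS and visible_text are hypothetical, and treating every element with a non-default clip-path as hidden is a simplifying assumption that can drop text which is only partly clipped. The script walks the text nodes of the body, skips any whose ancestors are hidden or clipped, and keeps those whose parent element intersects the current viewport.

```python
# JavaScript executed in the page. Selenium wraps it in a function body,
# so the top-level "return" hands the array back to Python as a list.
VISIBLE_TEXT_JS = """
function isHidden(el) {
  // Assumption: any ancestor with display:none or a non-default
  // clip-path hides the text (a heuristic, not a full visibility test).
  for (let e = el; e; e = e.parentElement) {
    const s = window.getComputedStyle(e);
    if (s.display === 'none' || s.clipPath !== 'none') return true;
  }
  // visibility is inherited, so checking the element itself is enough.
  return window.getComputedStyle(el).visibility === 'hidden';
}
const seen = [];
const walker = document.createTreeWalker(document.body, NodeFilter.SHOW_TEXT);
while (walker.nextNode()) {
  const node = walker.currentNode;
  const text = node.textContent.trim();
  const el = node.parentElement;
  if (!text || !el || isHidden(el)) continue;
  const r = el.getBoundingClientRect();
  const onScreen = r.width > 0 && r.height > 0 &&
    r.bottom > 0 && r.right > 0 &&
    r.top < window.innerHeight && r.left < window.innerWidth;
  if (onScreen) seen.push(text);
}
return seen;
"""


def visible_text(driver):
    """Return the text chunks the heuristic considers on screen, one per line.

    `driver` is any object with an execute_script(script) method,
    such as a Selenium WebDriver instance.
    """
    chunks = driver.execute_script(VISIBLE_TEXT_JS) or []
    return "\n".join(chunks)
```

With a driver that has already loaded the page, print(visible_text(driver)) prints the result. Text covered by another element, or text made transparent with opacity, is not detected by this sketch, and calling getComputedStyle for every ancestor can be slow on large pages.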