How to add multiple delimiters to a split when getting text that has no HTML tag

Time: 2018-03-21 09:45:18

Tags: python xpath beautifulsoup ixmldomelement

The following XPath selects the div element with class ajaxcourseindentfix, splits its text at Prerequisite, and gives me everything after the prerequisite.

div = soup.select("div.ajaxcourseindentfix")[0]
" ".join([word for word in div.stripped_strings]).split("Prerequisite: ")[-1]

My div can contain not only Prerequisite but also the following split points:

Prerequisite
Corequisite
Corequisite

Whenever the div has only Prerequisite, the XPath above works fine, but whenever the three above appear, it fails and gives me the entire text.

Is there a way to put multiple delimiters in the XPath? Or how else can I solve this?

Sample pages

Corequisite URL: http://catalog.fullerton.edu/ajax/preview_course.php?catoid=16&coid=96106&show

Prerequisite URL: http://catalog.fullerton.edu/ajax/preview_course.php?catoid=16&coid=96564&show

Both: http://catalog.fullerton.edu/ajax/preview_course.php?catoid=16&coid=98590&show

[Old thread] - How to get text which has no HTML tag

1 answer:

Answer 0: (score: 1)

This code solves your problem, unless you specifically need XPath. I also suggest looking at the BeautifulSoup documentation for the methods I used.

.next_element, or .next_sibling, is very useful in these cases. With .next_elements we get a generator, so we have to convert it or consume it in a way that suits a generator.
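The sibling navigation described above could look roughly like the sketch below. It is not the answerer's original code. The HTML fragment is made up for illustration (the real catalog page may be structured differently), and it assumes each label such as Prerequisite: sits inside its own strong tag. The names HTML, LABEL, text_after and requirements are hypothetical.

```python
import re
from bs4 import BeautifulSoup, NavigableString

# Hypothetical markup; the real catalog page may differ.
HTML = """
<div class="ajaxcourseindentfix">
  <h3>CPSC 000 - Sample Course</h3>
  Intro text of the course.
  <strong>Prerequisite:</strong> MATH 100.
  <strong>Corequisite:</strong> CPSC 001.
</div>
"""

# One pattern covers every label we want to treat as a split point.
LABEL = re.compile(r"(?:Prerequisite|Corequisite)s?:")


def text_after(label_tag):
    """Collect the text that follows label_tag, up to the next <strong>."""
    parts = []
    for node in label_tag.next_siblings:  # .next_siblings is a generator
        if isinstance(node, NavigableString):
            parts.append(str(node))
        elif node.name == "strong":  # the next label starts here
            break
        else:
            parts.append(node.get_text())
    # Collapse runs of whitespace left over from the markup.
    return " ".join("".join(parts).split())


def requirements(html):
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", class_="ajaxcourseindentfix")
    found = {}
    # find_all(string=...) returns the text nodes that match LABEL.
    for label in div.find_all(string=LABEL):
        # Assumption: label.parent is the <strong> wrapping the label text.
        found[label.strip().rstrip(":")] = text_after(label.parent)
    return found


if __name__ == "__main__":
    print(requirements(HTML))
```

Because each label is located on its own, a page with only Prerequisite, only Corequisite, or both goes through the same code path.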

This solves both problems, and we don't have to use CSS selectors or those odd list manipulations. Everything is organic and works well.
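If you would rather keep the join-and-split approach from the question, a regular expression can serve as several delimiters at once. This is a different technique from the tree navigation in this answer, shown only as a minimal sketch. The sample string and the names LABELS and split_requirements are hypothetical, and the label spellings are assumptions about the page text.

```python
import re

# The capturing group keeps each matched label in the split result.
LABELS = re.compile(r"((?:Prerequisite|Corequisite)s?):\s*")


def split_requirements(text):
    """Map each label found in text to the text that follows it."""
    pieces = LABELS.split(text)
    # pieces looks like [before, label1, value1, label2, value2, ...]
    return {
        label: value.strip()
        for label, value in zip(pieces[1::2], pieces[2::2])
    }


if __name__ == "__main__":
    sample = "Intro text. Prerequisite: MATH 100. Corequisite: CPSC 001."
    print(split_requirements(sample))
```

The input text can come from " ".join(div.stripped_strings), as in the question. Text with none of the labels yields an empty dict instead of the full text.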