I have a UDF like this:
import org.apache.spark.sql.functions.udf
import scala.xml.XML

// Result of splitting a post body: the plain text and the contents of its code tags
case class bodyresults(text: String, code: String)

val bodyudf = udf { (body: String) =>
  // Append the body tag explicitly to the XML before parsing
  val xmlElems = XML.loadString(s"""<?xml version="1.0" encoding="utf-8"?> <!DOCTYPE body [<!ENTITY nbsp " ">]><body>${body}</body>""")
  // Extract the text inside the code tags
  val code = (xmlElems \\ "body" \\ "code").text
  // Everything else is the text: remove the code from the full body text
  val text = (xmlElems \\ "body").text.replace(code, "")
  bodyresults(text, code)
}
I am trying to split the Body string into a code string and a text string:
CODE: whatever is inside the tag named code.
TEXT: everything else.

The Body column is of type String, and its content looks like this:
<p>I want to use a track-bar to change a form's opacity.</p>
<p>This is my code:</p>
<pre><code>decimal trans = trackBar1.Value / 5000;
this.Opacity = trans;
</code></pre>
<p>When I build the application, it gives the following error:</p>
<blockquote>
<p>Cannot implicitly convert type 'decimal' to 'double'.</p>
</blockquote>
<p>I tried using <code>trans</code> and <code>double</code> but then the
control doesn't work. This code worked fine in a past VB.NET project. </p>
,While applying opacity to a form should we use a decimal or double value?

I am trying to use the following command:
val posts5=posts4.withColumn("codetext",bodyudf(col("Body")))
posts5.select("codetext").show()

This results in the error:
org.apache.spark.SparkException: Failed to execute user defined function($anonfun$1: (string) => struct<text:string,code:string>)
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 129; The element type "body" must be terminated by the matching end-tag "</body>"

But as you can see in the UDF, I am appending the body tag and closing it.

Note: surprisingly, the following command works:
posts5.select("codetext").show(19)
+--------------------+
|            codetext|
+--------------------+
|[Given a represe...|
|[Is there any sta...|
|[What is the diff...|
|[How do I store b...|
|[If I have a trig...|
|[How do you page ...|
|[Does anyone know...|
|[Does anybody kno...|
|[What are some gu...|
|[There are severa...|
|[I wrote a window...|
|[How do I format ...|
|[One may not alwa...|
|[Are PHP variable...|
|[What's the simpl...|
|[Does anyone know...|
|[I'm looking for ...|
|[What is the corr...|
|[I was wondering ...|
+--------------------+

But any number above 19 results in the error:
posts5.select("codetext").show(20)
or
posts5.select("codetext").show()

In case it helps, here is the Body string at the 20th row:
<p>I have a Queue<T> object that I have initialised to a capacity of 2, but obviously that is just the capacity and it keeps expanding as I add items. Is there already an object that automatically dequeues an item when the limit is reached, or is the best solution to create my own inherited class?</p>,Limit size of Queue<T> in .NET?

I am unable to figure out the reason for this error, and I could not find anything relevant online. Please let me know what is causing it.

Edit:
I removed row 20, since that string was missing a closing tag. But now the error occurs at row 19. I passed the string from row 19 directly to the function, and it works fine. So why does it fail when I pass the whole column to the UDF?
posts5.select("codetext").show(18) //18 or below works fine
posts5.select("codetext").show(19) // does not work
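
To narrow down which Body values fail to parse, a small check along these lines could be run on individual strings outside Spark. This is only a sketch under assumptions: `BodyParseCheck` and `parses` are hypothetical names that are not part of the code above, it uses the JDK's `DocumentBuilder` rather than `scala.xml` so that it runs without extra dependencies, and it leaves out the DOCTYPE declaration that the UDF prepends.

```scala
import java.io.StringReader
import javax.xml.parsers.DocumentBuilderFactory
import org.xml.sax.InputSource
import org.xml.sax.helpers.DefaultHandler
import scala.util.Try

object BodyParseCheck {
  // Hypothetical helper: wraps the fragment in <body> tags, as the UDF does,
  // and reports whether the result is well-formed XML.
  def parses(body: String): Boolean = Try {
    val builder = DocumentBuilderFactory.newInstance().newDocumentBuilder()
    // DefaultHandler rethrows fatal errors without printing them to stderr
    builder.setErrorHandler(new DefaultHandler)
    builder.parse(new InputSource(new StringReader(s"<body>$body</body>")))
  }.isSuccess
}
```

Applied to each Body value in turn, a check like this would flag strings that are not well-formed once wrapped, for example text with an unclosed tag or with an unescaped `<T>`, which the parser reads as an opening tag.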