I keep getting an error when I use a UDF to split the strings in a column on a delimiter. I am using Scala.
val rsplit = udf((refsplit: String) => refsplit.split(":"))
+---------+--------------------+--------------------+
|     user|              jsites|             jsites1|
+---------+--------------------+--------------------+
|123ashish|m.mangahere.co:m....|m.mangahere.co:m....|
|456ashish|m.mangahere2.co:m...|m.mangahere2.co:m...|
|   ashish|m.mangahere.co:m....|m.mangahere.co:m....|
+---------+--------------------+--------------------+
I am not sure what the error means or how to fix it.
My UDF and DataFrame are shown above. The jsites column contains values such as m.manghere.co:m.facebook.com:.msn.com, and I am trying to split each value on : with the UDF.
I keep getting an error whenever the UDF runs.
Answer 0 (score: -1)
You can use the built-in split function from org.apache.spark.sql.functions instead of a UDF:
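import org.apache.spark.sql.functions.{col, split}
val df = ??? // your DataFrame
// split takes a column and a regex pattern and returns an array<string> column;
// replace COLNAME with the column to split (e.g. jsites) and REGEX with the delimiter pattern (e.g. ":")
df.withColumn("split sites", split(col("COLNAME"), "REGEX"))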
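For the DataFrame in the question, this might look like the sketch below. It assumes a local SparkSession, and the sample rows are only modelled on the values quoted in the question; the object name SplitSites, the helper withSplitSites, and the output column name split_sites are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, split}

object SplitSites {
  // Hypothetical helper: adds an array<string> column holding the parts of jsites.
  // The second argument of split is a regex; ":" has no special meaning, so it matches literally.
  def withSplitSites(df: DataFrame): DataFrame =
    df.withColumn("split_sites", split(col("jsites"), ":"))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, only for trying the example out
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("split-sites")
      .getOrCreate()
    import spark.implicits._

    // Illustrative rows shaped like the question's DataFrame
    val df = Seq(
      ("123ashish", "m.mangahere.co:m.facebook.com:.msn.com"),
      ("456ashish", "m.mangahere2.co:m.facebook.com:.msn.com")
    ).toDF("user", "jsites")

    // Print the result without truncating the long cell values
    withSplitSites(df).show(false)

    spark.stop()
  }
}
```

Because split is a built-in column function, Spark handles null cells itself (a null input yields a null result), so no null check is needed as it would be inside a hand-written UDF.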
The question is a bit old, but I hope this helps others. Cheers.