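I am trying to infer the schema of a struct and build a list of its fields for the select column list of a DataFrame. Each field is wrapped in col, and any : in the alias name is replaced with _. The struct fields (properties) are optional, so I want to build the select statement from the input data.
Schema inference: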
val listOfProperties = explodeFeatures.schema
.filter(c => c.name == "listOfFeatures")
.flatMap(_.dataType.asInstanceOf[StructType].fields)
.filter(y => y.name == "properties")
.flatMap(_.dataType.asInstanceOf[StructType].fields)
.map(_.name).map(x => "col(\"listOfFeatures.properties."+x+"\").as(\"properties_"+x.replace(":","_")+"\")")
Result of the above statement (val listOfProperties):
col("type").as("type")
col("listOfFeatures.properties.a").as("properties_A"),
col("listOfFeatures.properties.b:P1").as("properties_b_P1"),
col("listOfFeatures.properties.C:ID").as("properties_C_ID"),
col("listOfFeatures.properties.D:l").as("properties_D_1")
Select statement:
explodeFeatures.select(listOfProperties.head, listOfProperties.tail: _*)
But the above statement fails to resolve at runtime. If I hardcode the columns as below instead, it succeeds.
explodeFeatures.select(
col("type").as("type"),
col("listOfFeatures.properties.a").as("properties_A"),
col("listOfFeatures.properties.b:P1").as("properties_b_P1"),
col("listOfFeatures.properties.C:ID").as("properties_C_ID"),
col("listOfFeatures.properties.D:l").as("properties_D_1"))
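I build the list for two reasons:
I need to access the struct fields, and I need to rename them because their names contain :.
Can anyone help explain why the hardcoded statement works but passing listOfProperties.head, listOfProperties.tail does not?
Exception:
Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '`col("type")`' given input columns: [type, listOfFeatures];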
Answer 0 (score: 1)
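As suggested in the comments, your variable is a Seq[String]. When it is passed to select, the call looks like df.select("col(name)"), so Spark looks for a column literally named col(name). You need to change the last map so that it builds Column objects instead of strings, as follows:
val listOfProperties = explodeFeatures.schema
.filter(c => c.name == "listOfFeatures")
.flatMap(_.dataType.asInstanceOf[StructType].fields)
.filter(y => y.name == "properties")
.flatMap(_.dataType.asInstanceOf[StructType].fields)
.map(_.name)
.map(x => col(s"listOfFeatures.properties.${x}").as(s"""properties_${x.replace(":","_")}""" ))
Because listOfProperties is now a Seq[Column], pass it to the Column overload of select: explodeFeatures.select(listOfProperties: _*)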
Side note: use string interpolation. It's cleaner!
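To see the difference between the two select overloads end to end, here is a minimal self-contained sketch. The object name SelectStructFields, the helpers sampleData and flattenProperties, and the sample rows are all made up for illustration; only the column layout (type, listOfFeatures.properties) comes from the question. It assumes Spark SQL is on the classpath and runs in local mode.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, struct}
import org.apache.spark.sql.types.StructType
import scala.util.Try

object SelectStructFields {

  // Hypothetical sample data shaped like the question's explodeFeatures:
  // a "type" column plus listOfFeatures.properties with two fields,
  // one of which has a ':' in its name.
  def sampleData(spark: SparkSession): DataFrame = {
    import spark.implicits._
    Seq(("Feature", 1, "x")).toDF("type", "a", "b")
      .select(
        col("type"),
        struct(
          struct(col("a"), col("b").as("b:P1")).as("properties")
        ).as("listOfFeatures"))
  }

  // Builds a Seq[Column] from the inferred schema and selects it.
  def flattenProperties(df: DataFrame): DataFrame = {
    val features = df.schema("listOfFeatures").dataType.asInstanceOf[StructType]
    val properties = features("properties").dataType.asInstanceOf[StructType]

    val propertyCols: Seq[Column] = properties.fieldNames.toSeq.map { name =>
      col(s"listOfFeatures.properties.$name")
        .as(s"properties_${name.replace(":", "_")}")
    }

    val allCols = col("type") +: propertyCols
    df.select(allCols: _*) // Column overload: select(cols: Column*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("select-struct-fields")
      .getOrCreate()

    val explodeFeatures = sampleData(spark)

    // String overload: the text is treated as a column name, not as code,
    // so this fails with an AnalysisException like the one in the question.
    val asString = Try(explodeFeatures.select("""col("type")"""))
    println(s"String overload failed: ${asString.isFailure}")

    flattenProperties(explodeFeatures).printSchema()

    spark.stop()
  }
}
```

The key point is the element type of the list: select(col: String, cols: String*) resolves each string as a column name, while select(cols: Column*) takes expressions that already carry their aliases.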