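I have been working from this answer, but my requirement is more specific.
I only need to select the columns whose names start with "cat", and I cannot work out how to select columns by pattern. I do not want to filter the rows of the DataFrame, only to pick the columns whose names begin with that prefix.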
val transformers: Array[PipelineStage] = df.select("cat*").columns.map(
  cname =>
    new StringIndexer()
      .setInputCol(cname)
      .setOutputCol(s"${cname}_index")
)
val stages: Array[PipelineStage] = transformers
val pipeline = new Pipeline().setStages(stages)
val model = pipeline.fit(df)
This code produces the following error:
org.apache.spark.sql.AnalysisException: cannot resolve 'cat*' given input columns: [cat3, cat7, cat25,...
Answer 0 (score: 1)
This is simple: you only need to keep the columns whose names start with "cat".
Answer 1 (score: 0)
Why select from the DataFrame just to get the columns? Why not filter the full list of column names instead:
val transformers: Array[PipelineStage] = df.columns.filter(_.startsWith("cat")).map(
  cname =>
    new StringIndexer()
      .setInputCol(cname)
      .setOutputCol(s"${cname}_index")
)
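
For completeness, here is a minimal end-to-end sketch of the approach above. As the error message shows, the "cat*" string in the question was resolved as a literal column name rather than expanded as a wildcard, so the sketch filters `df.columns` by prefix instead. The object name `CatColumnsExample`, the helpers `columnsWithPrefix` and `indexerStages`, the local `SparkSession` settings, and the sample data (including the `price` column) are all assumptions made for illustration and do not come from the original post.

```scala
import org.apache.spark.ml.{Pipeline, PipelineStage}
import org.apache.spark.ml.feature.StringIndexer
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical wrapper object; the names here are illustrative only.
object CatColumnsExample {

  // Column names that start with the given prefix, in their original order.
  def columnsWithPrefix(df: DataFrame, prefix: String): Array[String] =
    df.columns.filter(_.startsWith(prefix))

  // One StringIndexer per matching column, writing to "<name>_index".
  def indexerStages(df: DataFrame, prefix: String): Array[PipelineStage] =
    columnsWithPrefix(df, prefix).map { cname =>
      val indexer: PipelineStage = new StringIndexer()
        .setInputCol(cname)
        .setOutputCol(s"${cname}_index")
      indexer
    }

  def main(args: Array[String]): Unit = {
    // Assumed local session, for experimentation only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cat-columns")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data: two "cat" columns and one unrelated column.
    val df = Seq(("a", "x", 1.0), ("b", "y", 2.0), ("a", "y", 3.0))
      .toDF("cat3", "cat7", "price")

    // If a DataFrame with only the matching columns is needed,
    // turn the names into Column objects and pass them as varargs.
    val catOnly = df.select(columnsWithPrefix(df, "cat").map(col): _*)
    catOnly.printSchema()

    // Fit the indexers on the full DataFrame, as in the question.
    val pipeline = new Pipeline().setStages(indexerStages(df, "cat"))
    val model = pipeline.fit(df)
    model.transform(df).show()

    spark.stop()
  }
}
```

Filtering `df.columns` works on plain strings, so no Spark job runs just to discover the names; `select` is only needed when the narrowed DataFrame itself is wanted.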