Get the next weekday's date in a Spark DataFrame using Scala

Asked: 2018-03-21 18:49:16

Tags: scala spark-dataframe

I have a DateType input to the function. If the input date falls on a weekend, I want to skip over Saturday and Sunday and return the next weekday's date; otherwise it should return the next day's date.

Examples:
Input: Monday 1/1/2017 → Output: 1/2/2017 (Tuesday)
Input: Saturday 3/4/2017 → Output: 3/5/2017 (Monday)

I have gone through https://spark.apache.org/docs/2.0.2/api/java/org/apache/spark/sql/functions.html but I don't see a ready-made function there, so I think one needs to be created.

So far I have a partial attempt, but I need help making it valid and getting it to work correctly.

2 Answers:

Answer 0 (score: 2):

Treating the date as a string:

import java.time.{DayOfWeek, LocalDate}
import java.time.format.DateTimeFormatter

// If this is your date format
object MyFormat {
  val formatter = DateTimeFormatter.ofPattern("yyyy-MM-dd")
}

object MainSample {
  import MyFormat._

  def main(args: Array[String]): Unit = {
    import org.apache.spark.sql.{Row, SparkSession}
    import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}
    import org.apache.spark.sql.functions._

    implicit val spark: SparkSession =
      SparkSession
        .builder()
        .appName("YourApp")
        .config("spark.master", "local")
        .getOrCreate()

    // Needed for the 'date symbol-to-column syntax below;
    // must come after spark is defined
    import spark.implicits._

    val someData = Seq(
      Row(1, "2013-01-30"),
      Row(2, "2012-01-01")
    )

    val schema = List(StructField("id", IntegerType), StructField("date", StringType))
    val sourceDF = spark.createDataFrame(spark.sparkContext.parallelize(someData), StructType(schema))

    sourceDF.show()

    val _udf = udf { (dt: String) =>
      // Parse the incoming date string
      val localDate = LocalDate.parse(dt, formatter)

      // Check the day of the week and add days for each case
      val newDate = if (localDate.getDayOfWeek == DayOfWeek.SATURDAY) {
        localDate.plusDays(2) // Saturday -> Monday
      } else if (localDate.getDayOfWeek == DayOfWeek.SUNDAY) {
        localDate.plusDays(1) // Sunday -> Monday
      } else {
        localDate.plusDays(1) // any weekday -> next day (note: Friday yields Saturday)
      }
      newDate.toString
    }

    sourceDF.withColumn("NewDate", _udf('date)).show()
  }
}
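
Since 2013-01-30 is a Wednesday and 2012-01-01 is a Sunday, the final show() call should print something like:

+---+----------+----------+
| id|      date|   NewDate|
+---+----------+----------+
|  1|2013-01-30|2013-01-31|
|  2|2012-01-01|2012-01-02|
+---+----------+----------+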

Answer 1 (score: 1):

Here is a simpler answer, defined in spark-daria:

import org.apache.spark.sql.Column
import org.apache.spark.sql.functions.{date_add, dayofweek, lit, next_day, when}

def nextWeekday(col: Column): Column = {
  // dayofweek (Spark 2.3+) returns 1 = Sunday .. 7 = Saturday
  val d = dayofweek(col)
  val friday = lit(6)
  val saturday = lit(7)
  when(col.isNull, null)
    // Friday and Saturday both jump to the following Monday
    .when(d === friday || d === saturday, next_day(col, "Mon"))
    // Everything else (Sunday through Thursday) moves one day forward
    .otherwise(date_add(col, 1))
}
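
Assuming a DataFrame df with a date column named date (both placeholder names for illustration), usage would look something like:

import org.apache.spark.sql.functions.col

// df and "date" are hypothetical; substitute your own DataFrame and column
val result = df.withColumn("NewDate", nextWeekday(col("date")))
result.show()

Note that this version sends Friday to the following Monday, whereas the UDF in the first answer sends Friday to Saturday; pick whichever matches your reading of the question.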

You always want to stick with native Spark functions whenever possible. This post explains the derivation of this function in more detail.