Renaming multiple columns of a SparkR DataFrame at once

Time: 2018-06-13 20:57:10

Tags: r apache-spark sparkr

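How can I rename multiple columns of a SparkR DataFrame at once, instead of calling withColumnRenamed() several times? For example, suppose I want to rename the columns of the DataFrame below to name and birthdays. How can I do that without calling withColumnRenamed() twice?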

team <- data.frame(name = c("Thomas", "Bill", "George", "Randall"),
  surname = c("Johnson", "Clark", "Williams", "Yosimite"),
  dates = c('2017-01-05', '2017-02-23', '2017-03-16', '2017-04-08'))
team <- createDataFrame(team)

team <- withColumnRenamed(team, 'surname', 'name')
team <- withColumnRenamed(team, 'dates', 'birthdays')

1 answer:

Answer 0 (score: 2)

The standard R approach works here: you can simply reassign colnames:

colnames(team) <- c("name", "name", "birthdays")
team
SparkDataFrame[name:string, name:string, birthdays:string]

If you know the order of the columns, you can skip the full list and replace only the names that change:

colnames(team)[colnames(team) %in% c("surname", "dates")] <- c("name", "birthdays")

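You will probably want to avoid duplicate names, though: renaming surname to name leaves two columns called name.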
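
One caveat with the %in% form is that the new names are assigned in the order the columns appear in the DataFrame, not in the order they are listed inside c(...). A minimal sketch of an order-independent alternative follows. It assumes a small helper, rename_columns, which is hypothetical and not part of SparkR, and it uses last_name as an illustrative target name to sidestep the duplicate. The helper works on a plain character vector, so it can be tried without a Spark session.

```r
# Hypothetical helper (not part of SparkR): rename entries of a character
# vector using a named mapping, where names are the old column names and
# values are the new ones.
rename_columns <- function(cols, mapping) {
  hit <- cols %in% names(mapping)           # which columns need a new name
  cols[hit] <- unname(mapping[cols[hit]])   # look each one up by its old name
  # Fail early rather than create duplicate column names.
  stopifnot(anyDuplicated(cols) == 0)
  cols
}

old_cols <- c("name", "surname", "dates")
# The mapping is deliberately listed in a different order than the columns.
mapping <- c(dates = "birthdays", surname = "last_name")

new_cols <- rename_columns(old_cols, mapping)
print(new_cols)
```

With a SparkDataFrame, the same idea would be applied as colnames(team) <- rename_columns(colnames(team), mapping), which still renames every column in a single assignment.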