Renaming DataFrame column names

Time: 2019-05-11 14:39:39

Tags: python scala apache-spark apache-spark-sql

I have two DataFrames, df_one and df_two, as shown below:

df_one.show()

-------------
|Column_Name|
-------------
|NAME       |
|ID         | 
|COUNTRY    |
-------------

df_two.show()

-------------   
|_c0|_c1|_c2|
-------------
|AAA|001|US |
|BBB|002|UK |
|CCC|003|IN |
|DDD|004|FR |
-------------

I am trying to rename the columns of df_two so that it looks like this:

------------------
|NAME|ID |COUNTRY|
------------------
|AAA |001|US     |
|BBB |002|UK     |
|CCC |003|IN     |
|DDD |004|FR     |
------------------

For now I hard-coded a Seq of names and got the result above:

val newColumn = Seq("NAME", "ID", "COUNTRY")
val df = df_two.toDF(newColumn:_*)

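But now I have to read the names from the Column_Name column of df_one and rename the columns of df_two accordingly.

I also tried reading the column values from df_one, but that returns Seq[Any], whereas I need Seq[String].

Please point me to some code for this.

2 Answers:

Answer 0: (score: 1)

Here is a solution in Scala.

Since df_one is a very small dataset (even if the total number of columns runs into the thousands), it is fine to collect the DataFrame as an Array. Collecting a DataFrame results in an Array of Rows: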

df_one.collect
// res1: Array[org.apache.spark.sql.Row] = Array([NAME], [ID], [COUNTRY])

To unwrap the single String inside each Row, just apply Row's getString method:

df_one.collect.map(_.getString(0))
// res2: Array[String] = Array(NAME, ID, COUNTRY)

Putting it all together:

val df_one = Seq(
  "NAME", "ID", "COUNTRY"
).toDF("Column_Name")

val df_two = Seq(
  ("AAA", "001", "US"),
  ("BBB", "002", "UK"),
  ("CCC", "003", "IN"),
  ("DDD", "004", "FR")
).toDF("_c0", "_c1", "_c2")

val colNames = df_one.collect.map(_.getString(0))

df_two.toDF(colNames: _*).show
// +----+---+-------+
// |NAME| ID|COUNTRY|
// +----+---+-------+
// | AAA|001|     US|
// | BBB|002|     UK|
// | CCC|003|     IN|
// | DDD|004|     FR|
// +----+---+-------+
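
On the Seq[Any] mentioned in the question: an untyped accessor such as Row.get(0) is declared to return Any, which may be where that type came from, whereas getString(0) returns a String directly. If you already hold a Seq[Any], the following is a minimal plain-Scala sketch (no Spark required) of two ways to narrow it. The value `raw` is a hypothetical stand-in for whatever the original attempt produced, not something taken from the question.

```scala
// Hypothetical stand-in for values read through an untyped accessor (assumption).
val raw: Seq[Any] = Seq("NAME", "ID", "COUNTRY")

// Option 1: keep only the elements that really are Strings.
// Non-String elements (including null) are silently dropped.
val names: Seq[String] = raw.collect { case s: String => s }

// Option 2: convert every element with toString.
// This keeps all elements but throws a NullPointerException on null.
val viaToString: Seq[String] = raw.map(_.toString)
```

Either result can then be passed on as df_two.toDF(names: _*). Option 1 is the safer choice when silently skipping unexpected values is acceptable; Option 2 preserves the element count, which matters because toDF expects exactly one name per column.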

Answer 1: (score: 0)

Try (this is pandas-style syntax, so it assumes pandas DataFrames rather than Spark ones):

df_two.columns = df_one['Column_Name']