Creating variables using multiple variables

Asked: 2016-07-26 13:42:51

Tags: r

I am trying to clean a dataset and create three variables named Adventure, Action and Comedy. The original dataset has 3000 observations (imported under the name dat). Only a few of the observations are shown here.

After importing the dataset into R, I split the genres into 5 columns with the following R code:

dat1 <- dat %>% separate(Genres, c("Genres1", "Genres2", "Genres3", "Genres4", "Genres5"), sep = ",", extra = "drop", fill = "right")

id    Runtime    Genres1    Genres2    Genres3  Genres4  Genres5
37      75       animation  adventure  family   fantasy  musical
1       162      action     adventure  fantasy  sci_fi
95      126      action     fantasy
100     101      comedy     drama      fantasy
82      136      action     adventure  sci-fi
99      117      animation  adventure  comedy   family   sport
91      95       animation  comedy     crime    family

How can I collapse all the genre columns into one column each for action, adventure and comedy?

I tried the following code. I created an empty column for adventure using

dat1["adventure"] <- NA

and then filled it with

dat1$adventure <- ifelse(dat1$Genres1=="adventure",1,(ifelse(dat1$Genres2=="adventure",1,0)))

id    Runtime    Genres1    Genres2    Genres3  Genres4  Genres5  Adventure
37      75       animation  adventure  family   fantasy  musical  0
1       162      action     adventure  fantasy  sci_fi            0
95      126      action     fantasy                               0
100     101      comedy     drama      fantasy                    0
82      136      action     adventure  sci-fi                     0
99      117      animation  adventure  comedy   family   sport    0
91      95       animation  comedy     crime    family            0

It was suggested that I shorten the code to

dat1$adventure <- ifelse((dat1$Genres1=="adventure" | dat1$Genres2=="adventure" | dat1$Genres3=="adventure" | dat1$Genres4=="adventure" ),1, 0)

The code is able to extract adventure for Genres1 but returns zero for Genres2.

I have reworded the question. I tried some of the suggestions, but I am not sure how to proceed, as there are 3000 observations.

After running the suggestions

List of genres, formation of the vector and assigning it to dat2

dat2 <- c( "adventure", "comedy", "action", "drama", "animation", "fantasy", "mystery", "family", "sci-fi", "thriller", "romance", "horror", "musical","history", "war", "documentary", "biography")

table(factor(dat2))

 action   adventure   animation   biography      comedy documentary          drama
      1           1           1           1           1           1           1
 family     fantasy     history      horror     musical     mystery     romance
      1           1           1           1           1           1           1
 sci-fi    thriller         war
      1           1           1

Creating the function

 fun1 <- function("adventure", "comedy", "action", "drama", "animation",
"fantasy", "mystery", "family", "sci-fi", "thriller", "romance", "horror",
"musical","history", "war", "documentary", "biography")) {
 vector_of_cur_genres <- seperate(i, sep = ", ")
 result <- table(factor(vector_of_cur_genres, dat2))
 return(result)
 }

  # Results

 fun1 <- function("adventure", "comedy", "action", "drama",
 "animation", "fantasy", "mystery", "family", "sci-fi", "thriller",
 "romance", "horror", "musical","history", "war", "documentary",
 "biography")) {
  Error: unexpected string constant in "fun1 <- function("adventure""
  >   vector_of_cur_genres <- separate(i, sep = ", ")
  Error: Please supply column name
  >   result <- table(factor(vector_of_cur_genres, dat2))
  Error in factor(vector_of_cur_genres, dat2) :
  object 'vector_of_cur_genres' not found
  >   return(result)
  Error: no function to return from, jumping to top level
   > }
   Error: unexpected '}' in "}"

  mat <- mapply(fun1,dat2$Genres)
       Error in match.fun(FUN) : object 'fun1' not found

1 answer:

Answer 0 (score: 0)

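You can use a mix of table and factor to get what you want. First, make sure that every genre is spelled exactly the same way each time it occurs ("Adventure" != "adventure"). Then create a vector containing all possible genres, c("Adventure", "Comedy", "Drama", ...).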
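As a minimal sketch of those two preparation steps, assuming the raw genres arrive as comma-separated strings (the sample values and the name raw_genres are hypothetical, not taken from the original data), the labels can be normalized with base R before the vector of possible genres is built:

```r
# Hypothetical raw values standing in for the Genres column of the imported data.
raw_genres <- c("Action, Adventure, Sci-Fi", "comedy,drama , fantasy")

# Split on commas, then trim whitespace and lower-case every piece so that
# " Adventure" and "adventure" end up as the same label.
split_genres <- lapply(
  strsplit(raw_genres, ",", fixed = TRUE),
  function(x) tolower(trimws(x))
)

# One way to obtain the vector of possible genres: collect the unique labels
# that actually occur in the data.
list_of_possible_genres <- sort(unique(unlist(split_genres)))
print(list_of_possible_genres)
```

Collecting the labels from the data is only one option. A hand-written vector works just as well, as long as its spelling matches the cleaned labels.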

Then call table(factor(genres, list_of_possible_genres)) for each row. It returns a table of counts. You can then construct a matrix with something like this:
mat <- mapply(
    function(i) {
        table(factor(separate(i, ...), list_of_possible_genres))
    },
    df$Genres)  # use the original data frame, as imported

# mapply returns one column per element of df$Genres, so transpose before binding
new.df <- cbind(df, t(mat))  # both should have the same number of rows here
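The steps above can be sketched end to end in base R. This is an illustration under stated assumptions rather than the exact code from the answer: the small df is made-up sample data, count_genres is a hypothetical helper name, and strsplit stands in for separate because it works on a single string. sapply yields one column per observation, so the result is transposed before cbind.

```r
# Made-up sample data; only the column name Genres comes from the question.
df <- data.frame(
  id = c(1, 95, 100),
  Genres = c("action, adventure, fantasy",
             "action, fantasy",
             "comedy, drama, fantasy"),
  stringsAsFactors = FALSE
)
list_of_possible_genres <- c("action", "adventure", "comedy", "drama", "fantasy")

# Hypothetical helper: count how often each possible genre occurs in one string.
count_genres <- function(string_of_genres) {
  pieces <- trimws(strsplit(string_of_genres, ",", fixed = TRUE)[[1]])
  # Fixing the factor levels makes absent genres show up with a count of 0.
  table(factor(pieces, levels = list_of_possible_genres))
}

# sapply gives a genres-by-observations matrix; t() turns it into one row per observation.
mat <- t(sapply(df$Genres, count_genres, USE.NAMES = FALSE))

new.df <- cbind(df, mat)
print(new.df)
```

Because the levels are fixed up front, every row produces a vector of the same length, which is what lets sapply simplify the results into a matrix.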

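Fill in the ... in the separate call with the same arguments as in your original call. If you have questions about individual functions or steps, I can explain them in the comments.

I defined the function inline in the mapply call, as function(i) ..., which is similar to defining a lambda in Python. The function takes a string of genres and returns a named vector containing the number of occurrences of each possible genre.

Edit:

fun1 <- function(string_of_genres) {
    # strsplit is used here because separate() expects a data frame, not a single string
    vector_of_cur_genres <- strsplit(string_of_genres, ", ")[[1]]
    result <- table(factor(vector_of_cur_genres, list_of_possible_genres))
    return(result)
}
mat <- mapply(fun1, df$Genres)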
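For the three columns asked about in the question, a variant of fun1 can restrict the factor levels to the wanted genres and turn the counts into 0/1 indicators. This is a sketch, assuming a space may follow each comma in the raw strings (hence trimws). The sample rows and the names wanted, counts, indicators and dat_flags are hypothetical.

```r
# Made-up sample rows in the shape of the question's data.
dat <- data.frame(
  id = c(37, 95, 99),
  Genres = c("animation, adventure, family",
             "action, fantasy",
             "animation, adventure, comedy"),
  stringsAsFactors = FALSE
)
wanted <- c("action", "adventure", "comedy")

fun1 <- function(string_of_genres) {
  vector_of_cur_genres <- trimws(strsplit(string_of_genres, ",", fixed = TRUE)[[1]])
  # Genres outside `wanted` become NA in the factor and are left out of the table.
  table(factor(vector_of_cur_genres, levels = wanted))
}

# One row per observation, one column per wanted genre.
counts <- t(sapply(dat$Genres, fun1, USE.NAMES = FALSE))

# 1 if the genre occurs at least once in the row, 0 otherwise.
indicators <- (counts > 0) * 1L

dat_flags <- cbind(dat, indicators)
print(dat_flags)
```

This works on the original Genres column, so it does not depend on how many GenresN columns the data was split into.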
