I am trying to clean a dataset and create 3 variables named Adventure, Action and Comedy. The original dataset has 3000 observations (imported file name: dat). I am only showing a few observations.
After importing the dataset into R, I used the following R code to split the genres into 5 columns:
dat1 <- dat %>% separate(Genres, c("Genres1", "Genres2", "Genres3", "Genres4", "Genres5"), sep = ",", extra = "drop", fill = "right")
id   Runtime  Genres1    Genres2    Genres3  Genres4  Genres5
37   75       animation  adventure  family   fantasy  musical
1    162      action     adventure  fantasy  sci_fi
95   126      action     fantasy
100  101      comedy     drama      fantasy
82   136      action     adventure  sci-fi
99   117      animation  adventure  comedy   family   sport
91   95       animation  comedy     crime    family
How can I collapse all the genres into one column each for action, adventure and comedy?
I tried the following code. I created an empty column for adventure using
dat1["adventure"] <- NA
and then filled it with
dat1$adventure <- ifelse(dat1$Genres1=="adventure",1,(ifelse(dat1$Genres2=="adventure",1,0)))
id   Runtime  Genres1    Genres2    Genres3  Genres4  Genres5  Adventure
37   75       animation  adventure  family   fantasy  musical  0
1    162      action     adventure  fantasy  sci_fi            0
95   126      action     fantasy                               0
100  101      comedy     drama      fantasy                    0
82   136      action     adventure  sci-fi                     0
99   117      animation  adventure  comedy   family   sport    0
91   95       animation  comedy     crime    family            0
It was suggested that I shorten the code to
dat1$adventure <- ifelse((dat1$Genres1=="adventure" | dat1$Genres2=="adventure" | dat1$Genres3=="adventure" | dat1$Genres4=="adventure"), 1, 0)
The code was able to extract adventure for Genres1, but it returns zero for Genres2.
I am stuck on this problem. I tried some suggestions, but I am not sure how to proceed because there are 3000 observations.
After running the suggestion
dat2 <- c("adventure", "comedy", "action", "drama", "animation", "fantasy", "mystery", "family", "sci-fi", "thriller", "romance", "horror", "musical", "history", "war", "documentary", "biography")
table(factor(dat2))
     action   adventure   animation   biography      comedy documentary       drama
          1           1           1           1           1           1           1
     family     fantasy     history      horror     musical     mystery     romance
          1           1           1           1           1           1           1
     sci-fi    thriller         war
          1           1           1
fun1 <- function("adventure", "comedy", "action", "drama", "animation",
"fantasy", "mystery", "family", "sci-fi", "thriller", "romance", "horror",
"musical","history", "war", "documentary", "biography")) {
vector_of_cur_genres <- seperate(i, sep = ", ")
result <- table(factor(vector_of_cur_genres, dat2))
return(result)
}
# Results
fun1 <- function("adventure", "comedy", "action", "drama",
"animation", "fantasy", "mystery", "family", "sci-fi", "thriller",
"romance", "horror", "musical","history", "war", "documentary",
"biography")) {
Error: unexpected string constant in "fun1 <- function("adventure""
> vector_of_cur_genres <- separate(i, sep = ", ")
Error: Please supply column name
> result <- table(factor(vector_of_cur_genres, dat2))
Error in factor(vector_of_cur_genres, dat2) :
object 'vector_of_cur_genres' not found
> return(result)
Error: no function to return from, jumping to top level
> }
Error: unexpected '}' in "}"
mat <- mapply(fun1,dat2$Genres)
Error in match.fun(FUN) : object 'fun1' not found
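Answer 0 (score: 0)
You can get what you want with a combination of table and factor. First, make sure every genre is spelled exactly the same way each time ("Adventure" != "adventure"). Then create a vector containing all possible genres, c("Adventure", "Comedy", "Drama", ...).
Then call table(factor(genres, list_of_possible_genres)) for each row, which returns a table of counts. You can then do something like this:
mat <- mapply(
  function(i) {
    # strsplit() is used here because separate() needs a data frame, not a single string
    genres <- trimws(strsplit(as.character(i), ",")[[1]])
    table(factor(genres, levels = list_of_possible_genres))
  }, df$Genres, USE.NAMES = FALSE)
# use the original data frame, as it was right after import
# mapply() returns one column per film, so transpose to get one row per film
new.df <- cbind(df, t(mat))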
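To make the per-row step concrete, here is a minimal sketch of what a single table(factor(...)) call returns for one film. The genre string and the shortened vector of levels are made up for illustration and are not taken from the real dataset; the comma-separated format of the Genres value is also an assumption.

```r
# Hypothetical, shortened list of possible genres (illustration only)
list_of_possible_genres <- c("action", "adventure", "comedy", "fantasy")

# One made-up Genres value, as it might look before splitting
one_row <- "action, adventure, fantasy"

# Split on commas and trim the whitespace left around each genre
genres <- trimws(strsplit(one_row, ",")[[1]])

# Fixing the levels keeps a zero entry for genres this film does not have
counts <- table(factor(genres, levels = list_of_possible_genres))
print(counts)
```

Because the levels are fixed, every row produces a vector of the same length and in the same order, which is what allows the per-row results to be stacked into a matrix.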
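Split on the same separator that you used in your original separate call. If you have questions about individual functions or steps, I can explain them in the comments.
I defined a function inside the mapply call, function(i) ..., which is similar to defining a lambda in Python. The function takes a string of genres and returns a named vector containing the count of how often each possible genre occurs.
EDIT:
fun1 <- function(string_of_genres) {
  # strsplit() is used here because separate() needs a data frame, not a single string
  vector_of_cur_genres <- trimws(strsplit(as.character(string_of_genres), ",")[[1]])
  result <- table(factor(vector_of_cur_genres, levels = list_of_possible_genres))
  return(result)
}
mat <- mapply(fun1, df$Genres, USE.NAMES = FALSE)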
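As an end-to-end illustration, the sketch below builds 0/1 columns for adventure, action and comedy directly from an unsplit Genres column. The data frame is made up to resemble a few rows from the question, and it assumes that Genres is one comma-separated string per film; the names wanted and genre_list are hypothetical helpers. It trims whitespace and lower-cases each genre before comparing, since a leading space or different capitalisation would make an exact == comparison fail. That may be why the comparisons in the question return zero, although this cannot be confirmed from the excerpt.

```r
# Made-up data frame resembling a few rows from the question (assumption)
df <- data.frame(
  id = c(37, 1, 95, 100),
  Runtime = c(75, 162, 126, 101),
  Genres = c("animation, adventure, family, fantasy, musical",
             "action, adventure, fantasy, sci-fi",
             "action, fantasy",
             "comedy, drama, fantasy"),
  stringsAsFactors = FALSE
)

# The three indicator variables asked for in the question
wanted <- c("adventure", "action", "comedy")

# Split each Genres string, then trim whitespace and lower-case every genre
genre_list <- lapply(strsplit(df$Genres, ","), function(g) tolower(trimws(g)))

# Add one 0/1 column per wanted genre
for (g in wanted) {
  df[[g]] <- as.integer(vapply(genre_list, function(x) g %in% x, logical(1)))
}

print(df[, c("id", wanted)])
```

This works on the whole column at once, so the same lines apply unchanged to all 3000 observations, and further genres can be added by extending wanted.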