Question

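I have a list of character matrices and want to convert two columns (lat, lon) to factor. I tried lapply and it works, but it also reshapes my data frame. I tried as.factor in two ways: once on just the two columns I need (no good, every other column comes back as NA) and once on the whole data frame, and the data was reshaped in both cases. I then tried melting my list of matrices back into the original, desired shape, but I would rather avoid creating the problem in the first place than fix it afterwards. Any ideas on how to convert to factor without reshaping?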

Attempt on the columns:

ix <- 5:6
mytest[ix] <- lapply(mytest[ix], as.factor)

Attempt on the whole df:

lapply(mytest, as.factor)

Sample data:

list(structure(c("study1", "study1", "study1", "study1", "study1", 
"study1", "study1", "study1", "study1", "study1", "study1", "study1", 
"study1", "study1", "study1", "58", "58", "58", "58", "58", "58", 
"58", "58", "58", "58", "58", "58", "58", "58", "58", "2011-07-13", 
"2011-07-13", "2011-07-13", "2011-07-13", "2011-07-13", "2011-07-13", 
"2011-07-13", "2011-07-13", "2011-07-13", "2011-07-13", "2011-07-13", 
"2011-07-13", "2011-07-13", "2011-07-13", "2011-07-13", "321", 
"329", "323", "324", "61", "326", "6", "60", "49", "10", "7", 
"59", "57", "56", "11", "32.884720435", "32.8841969254545", "32.8835599674286", 
"32.88419565", "32.8837771221667", "32.88411147", "32.883244695", 
"32.8837003266667", "32.8838778530086", "32.8853723146154", "32.8027296698536", 
"32.9164754136842", "32.8853777533333", "32.8854051", "32.802755201875", 
"-117.24062533", "-117.240416713636", "-117.240532619714", "-117.24070002", 
"-117.24038866075", "-117.24022087", "-117.240140015", "-117.239834913333", 
"-117.240522195673", "-117.240133633077", "-117.210527201581", 
"-117.236141991053", "-117.24063566", "-117.23989078", "-117.210382870833"
), .Dim = c(15L, 6L), .Dimnames = list(NULL, c("study", "ID", 
"locDate", "locNumb", "meanLat", "meanLon"))), structure(c("Study2", 
"Study2", "Study2", "Study2", "Study2", "Study2", "Study2", "Study2", 
"Study2", "Study2", "Study2", "Study2", "Study2", "Study2", "59", 
"59", "59", "59", "59", "59", "59", "59", "59", "59", "59", "59", 
"59", "59", "2011-07-12", "2011-07-12", "2011-07-12", "2011-07-12", 
"2011-07-12", "2011-07-12", "2011-07-12", "2011-07-12", "2011-07-12", 
"2011-07-12", "2011-07-12", "2011-07-12", "2011-07-12", "2011-07-12", 
"429", "418", "422", "432", "430", "426", "420", "354", "67", 
"419", "425", "427", "421", "428", "32.86543857", "32.867004565", 
"32.8694241808955", "32.8651107616667", "32.868857725", "32.8693627126536", 
"32.8696329253571", "32.86955278", "32.869014345", "32.8692111971429", 
"32.8694814566667", "32.8696187847619", "32.8698972233333", "32.868283279", 
"-117.254194355", "-117.25283091", "-117.25050148", "-117.254406255417", 
"-117.25133879", "-117.235585179972", "-117.250467514464", "-117.25014399", 
"-117.25006813", "-117.235456126857", "-117.235959423333", "-117.250773722857", 
"-117.250450876667", "-117.2512085715"), .Dim = c(14L, 6L), .Dimnames = list(
NULL, c("study", "ID", "locDate", "locNumb", "meanLat", "meanLon"
    ))))

Answer 1

You can convert the list of two matrices with:

lapply(mytest, as.data.frame)

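The result is a list of two data frames. In R versions before 4.0.0 all of their columns are factors; from 4.0.0 on they stay character unless you pass stringsAsFactors = TRUE.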
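A minimal sketch that requests factors explicitly, so the outcome does not rely on the default for stringsAsFactors. The one-element list `toy` is made up for illustration and stands in for `mytest`:

```r
# Hypothetical toy data: one small character matrix in a list
toy <- list(
  matrix(c("study1", "study1", "32.88", "32.87"), nrow = 2,
         dimnames = list(NULL, c("study", "meanLat")))
)

# Extra arguments to lapply() are passed on to as.data.frame()
dfs <- lapply(toy, as.data.frame, stringsAsFactors = TRUE)

sapply(dfs[[1]], class)  # class of each column
dim(dfs[[1]])            # rows and columns are preserved
```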

Answer 2

# something  <- your data

The problem is that you are not working with data frames:

sapply(something, class)
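This is likely where the "reshaping" comes from: as.factor() applied to a matrix returns a flat factor vector and drops the dim attribute. A small sketch with a made-up 2 x 2 character matrix (`m` is a hypothetical name, not part of the question's data):

```r
# Hypothetical toy matrix standing in for one element of the list
m <- matrix(c("a", "b", "32.88", "32.87"), nrow = 2,
            dimnames = list(NULL, c("study", "meanLat")))

f <- as.factor(m)  # keeps the values, but not the matrix shape

dim(m)     # 2 2
dim(f)     # NULL: the factor is a plain vector
length(f)  # 4: every cell of the matrix, column by column
```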

So you need to convert your data into actual data frames:

# FALSE rather than F: F is an ordinary variable and can be overwritten
something2 = lapply(something, function(x) as.data.frame(x, stringsAsFactors = FALSE))

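Note that if you don't mind the other variables being converted to factors as well, you can simply leave out the stringsAsFactors part and you are done (this holds for R before 4.0.0, where TRUE was the default; in later versions pass stringsAsFactors = TRUE instead). But I assume you want to keep the other variables as character. In that case, convert only the variables you want: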

# seq_along() is safe even if the list is empty, unlike 1:length()
for (i in seq_along(something2)) {
  something2[[i]]$meanLat = factor(something2[[i]]$meanLat)
  something2[[i]]$meanLon = factor(something2[[i]]$meanLon)
}

The two variables have now been converted to factors. Let's check the first data frame:

str(something2[[1]])
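The same conversion can also be written with lapply instead of a for loop. On a data frame, `df[cols]` selects columns, so assigning back into it keeps the shape. A sketch under stated assumptions: the helper name `to_factor_cols` and the toy data are hypothetical, not part of the original answer:

```r
# Hypothetical helper: convert the named columns of one data frame to factors
to_factor_cols <- function(df, cols) {
  df[cols] <- lapply(df[cols], factor)
  df
}

# Made-up stand-in for the list of character matrices
toy <- list(
  matrix(c("study1", "study1", "32.88", "32.87", "-117.24", "-117.25"),
         nrow = 2,
         dimnames = list(NULL, c("study", "meanLat", "meanLon")))
)

# Matrices -> data frames with character columns, then convert two columns
dfs <- lapply(toy, as.data.frame, stringsAsFactors = FALSE)
dfs <- lapply(dfs, to_factor_cols, cols = c("meanLat", "meanLon"))

str(dfs[[1]])
```

Only the listed columns change type; `study` stays character and the number of rows and columns is unchanged.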

Convert character to factor using lapply without melting
