I'm having some trouble updating my data.frame, and I don't know how to go about it. I have a function that updates my data.frame:
# Tries to update the data.frame.
updateTable <- function(sample) {
  sampleName = sample[1]
  sex = sample[2]
  dob = sample[3]
  cat("UPDATE ENTRY:
Current sample: ",sampleName,"
Sex : ",sex,"
Day of Birth : ",dob," (yyyy-mm-dd)
")
  age = getAge(as.Date(dob))
  occurences = which(test_data[,"Name"] == sampleName)
  test_data[occurences,"Age"] <- age
  test_data[occurences,"Sex"] <- sex
  # I tried this, but it returns n number of data.frames in the test_patients list.
  #return(test_data)
  # And this returns a list with data.frames for each test_patient.
  #test_data[which(sampleName == test_data$Name),] <-c(Name=sampleName, Sex=sex, Age=age)
  # I want to return one data.frame, containing the updated information for each test_patient.
}
And a function that calculates a person's age from their date of birth and the current date:
# Calculates the age of a person given his/her birthdate.
getAge <- function(dob)
{
  currentDate = as.Date("2016-12-14")
  lt <- data.frame(dob, currentDate)
  age <- as.numeric(format(lt[,2],format="%Y")) - as.numeric(format(lt[,1],format="%Y"))
  dayOncurrentDateYear <- ifelse(format(lt[,1],format="%m-%d")!="02-29",
    as.Date(paste(format(lt[,2],format="%Y"),"-",format(lt[,1],format="%m-%d"),sep="")),
    ifelse(as.numeric(format(currentDate,format="%Y")) %% 400 == 0 | as.numeric(format(currentDate,format="%Y")) %% 100 != 0 & as.numeric(format(currentDate,format="%Y")) %% 4 == 0,
      as.Date(paste(format(lt[,2],format="%Y"),"-",format(lt[,1],format="%m-%d"),sep="")),
      as.Date(paste(format(lt[,2],format="%Y"),"-","02-28",sep=""))))
  age[which(dayOncurrentDateYear > lt$currentDate)] <- age[which(dayOncurrentDateYear > lt$currentDate)] - 1
  return(age)
}
Now the input data:
test_data <- data.frame(Name=c("Anita", "Bert", "Cornel"), Sex=c(NA), Age=c(NA))
test_patients <- list( c("Anita", 0, "2000-01-01"), c("Bert", 1, "1959-01-01"), c("Cornel", 1, "1960-01-01") )
test_data = lapply(test_patients, updateTable)
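I have some ideas about how to reach my goal, but I would like to know what the proper way of doing this is. I'm not very experienced with R and I don't have a book, so I thought, why not ask my question here.
Something like this (which does not work):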
test_data = lapply(test_patients, function(patient) {
  tmp.df = NULL
  tmp.df = updateTable(patient)
  rbind.data.frame(test_data[which(sampleName == test_data$Name),], tmp.df)
})
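A note on why the attempt above cannot work as written: sampleName only exists inside updateTable, and updateTable changes a local copy of test_data that is thrown away when the function returns, so lapply produces one result per patient instead of one updated table. One possible fix is sketched below, under some assumptions: update_one is a hypothetical helper (not from the original post) that takes the table as an argument and returns it, and year_diff_age is a crude stand-in for getAge so that the block runs on its own.

```r
# Crude stand-in for getAge() so this sketch is self-contained: year difference
# only, relative to 2016. In practice, use getAge() from the question instead.
year_diff_age <- function(dob) 2016 - as.numeric(format(dob, "%Y"))

# Hypothetical helper: receives the table, updates one patient, returns the table.
update_one <- function(tbl, patient) {
  rows <- which(tbl$Name == patient[1])
  tbl[rows, "Sex"] <- patient[2]
  tbl[rows, "Age"] <- year_diff_age(as.Date(patient[3]))
  tbl
}

test_data <- data.frame(Name = c("Anita", "Bert", "Cornel"), Sex = NA, Age = NA,
                        stringsAsFactors = FALSE)
test_patients <- list(c("Anita", 0, "2000-01-01"),
                      c("Bert", 1, "1959-01-01"),
                      c("Cornel", 1, "1960-01-01"))

# Reduce() passes the result of each call on to the next one,
# so a single updated data.frame comes out at the end.
test_data <- Reduce(update_one, test_patients, test_data)
```

Reduce is used here rather than lapply because each step depends on the table produced by the previous step; lapply treats every patient independently and therefore returns a list.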
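So, dear internet, who can teach me how to handle this?
Kind regards
test_data is a subset of my original data, so here is a slightly extended subset. I am struggling to relate the test case in denrou's answer to my own data.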
test_data <- data.frame(Name=c(rep(c("Anita", "Bert", "Cornel"),4)), Sex=c(NA), Age=c(NA), Sample_ID=c(rep(1:12,1)), Time=c(rep(1:4,3)) )
test_data[order(test_data[,"Time"]),]
     Name Sex Age Sample_ID Time
1   Anita  NA  NA         1    1
5    Bert  NA  NA         5    1
9  Cornel  NA  NA         9    1
2    Bert  NA  NA         2    2
6  Cornel  NA  NA         6    2
10  Anita  NA  NA        10    2
3  Cornel  NA  NA         3    3
7   Anita  NA  NA         7    3
11   Bert  NA  NA        11    3
4   Anita  NA  NA         4    4
8    Bert  NA  NA         8    4
12 Cornel  NA  NA        12    4
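So how do I apply his/her answer if I am not creating a new data.frame from the test_patients vectors, but instead want to store the test_patients information in the existing test_data data.frame?
Because doing this: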
library(dplyr)
test_patients <- list( c("Anita", 0, "2000-01-01"), c("Bert", 1, "1959-01-01"), c("Cornel", 1, "1960-01-01") )
# This function takes a vector and returns a data frame (tibble() replaces the deprecated data_frame())
to_dataframe <- function(info) tibble(Name = info[1], Sex = info[2], Birthdate = info[3])
# Now I can turn your patient list into a dataframe
test_data <- lapply(test_patients, to_dataframe) %>% bind_rows()
# And I can calculate the age of a patient with your function
test_data <- test_data %>%
  mutate(Age = getAge(as.Date(Birthdate))) %>%
  select(Name, Sex, Age)
results in:
# A tibble: 3 × 3
    Name   Sex   Age
   <chr> <chr> <dbl>
1  Anita     0     9
2   Bert     1     9
3 Cornel     1     9
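If I'm no longer making sense, please let me know. I find these things hard to describe.
For anyone who stumbles on this question looking for an answer: I can't give you one that uses the apply family of functions, but here is something that may help: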
for (patient in test_patients) {
  # Set variables.
  sampleName = patient[1]
  sex = patient[2]
  dob = patient[3]
  age = getAge(as.Date(dob))
  # Set reference.
  occurrences = which(test_data$Name == sampleName)
  # Update table.
  test_data[occurrences,"Age"] <- age
  test_data[occurrences,"Sex"] <- sex
}
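The loop above can probably also be written without explicit iteration. A minimal base R sketch, assuming the patient vectors are first stacked into a matrix and then looked up by name with match(); the names patients, idx and year_diff_age are illustrative and not part of the original post, and year_diff_age is only a crude stand-in for getAge so that the block runs on its own.

```r
# Crude stand-in for getAge() so this sketch is self-contained: year difference
# only, relative to 2016. In practice, use getAge() from the question instead.
year_diff_age <- function(dob) 2016 - as.numeric(format(dob, "%Y"))

test_data <- data.frame(Name = rep(c("Anita", "Bert", "Cornel"), 4),
                        Sex = NA, Age = NA, Sample_ID = 1:12, Time = rep(1:4, 3),
                        stringsAsFactors = FALSE)
test_patients <- list(c("Anita", 0, "2000-01-01"),
                      c("Bert", 1, "1959-01-01"),
                      c("Cornel", 1, "1960-01-01"))

# Stack the patient vectors into a character matrix: one row per patient,
# columns are name, sex, date of birth.
patients <- do.call(rbind, test_patients)

# For every row of test_data, find the row of the matching patient (NA if none).
idx <- match(test_data$Name, patients[, 1])

# Fill both columns for all rows at once.
test_data$Sex <- patients[idx, 2]
test_data$Age <- year_diff_age(as.Date(patients[idx, 3]))
```

Rows whose Name has no entry in test_patients simply end up with NA in Sex and Age, which matches what the loop would leave behind.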
Answer 0 (score: 0)
This is how I would do it:
library(dplyr)
test_patients <- list( c("Anita", 0, "2000-01-01"), c("Bert", 1, "1959-01-01"), c("Cornel", 1, "1960-01-01") )
# This function takes a vector and returns a data frame (tibble() replaces the deprecated data_frame())
to_dataframe <- function(info) tibble(Name = info[1], Sex = info[2], Birthdate = info[3])
# Now I can turn your patient list into a dataframe
test_data <- lapply(test_patients, to_dataframe) %>% bind_rows()
# And I can calculate the age of a patient with your function
test_data <- test_data %>%
  mutate(Age = getAge(as.Date(Birthdate))) %>%
  select(Name, Sex, Age)
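Edit
The last command, i.e. select(Name, Sex, Age), selects the columns given as its arguments (see ?dplyr::select). You can freely modify it to select only the columns you need, or remove it to keep everything: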
test_data <- test_data %>%
  mutate(Age = getAge(as.Date(Birthdate)))
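Regarding the follow-up in the question about keeping an existing test_data (with Sample_ID and Time) rather than building a new table: one way this could be done with dplyr is to turn the patient list into a lookup table and join it onto test_data by Name. The sketch below is an illustration under assumptions, not the only way to do it: patients is a hypothetical name, and year_diff_age is a crude stand-in for getAge so that the block runs on its own.

```r
library(dplyr)

# Crude stand-in for getAge() so this sketch is self-contained: year difference
# only, relative to 2016. In practice, use getAge() from the question instead.
year_diff_age <- function(dob) 2016 - as.numeric(format(dob, "%Y"))

test_data <- data.frame(Name = rep(c("Anita", "Bert", "Cornel"), 4),
                        Sex = NA, Age = NA, Sample_ID = 1:12, Time = rep(1:4, 3),
                        stringsAsFactors = FALSE)
test_patients <- list(c("Anita", 0, "2000-01-01"),
                      c("Bert", 1, "1959-01-01"),
                      c("Cornel", 1, "1960-01-01"))

# One row per patient: Name, Sex, Birthdate (all character).
patients <- bind_rows(lapply(test_patients, function(info) {
  tibble(Name = info[1], Sex = info[2], Birthdate = info[3])
}))

test_data <- test_data %>%
  select(-Sex, -Age) %>%                               # drop the empty placeholder columns
  left_join(patients, by = "Name") %>%                 # add Sex and Birthdate to every row
  mutate(Age = year_diff_age(as.Date(Birthdate))) %>%  # compute the age per row
  select(Name, Sex, Age, Sample_ID, Time)              # restore the original column order
```

left_join keeps every row of test_data in its original order, so the twelve samples stay intact and each one picks up the Sex and Age of its patient.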