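I'm just starting to learn R, which has been very useful, and I'm trying to use it to calculate the proportion of days covered (PDC). This metric is used to measure a person's adherence to a medication. Basically, for a given time period, you find all fills of the drug and record each fill date and days' supply to determine which days are covered. For example, if a person gets a 35-day fill on 2016-02-01, their coverage runs from 2016-02-01 to 2016-03-06. Easy enough.
It gets tricky when they come back for a refill before the supply from the previous fill has run out: the overlapping days must not be counted twice (e.g. if this person gets a second fill on 2016-03-01, the days 2016-03-01 through 2016-03-06 are counted only once).
I have actually written some code that seems to work correctly, but it uses for loops, which I have learned do not perform well in R, and I am worried about what will happen when I start throwing large amounts of data at it.
Here is the first part of the code, which builds the test data and initializes some variables: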
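To make the overlap rule concrete, here is a minimal sketch in base R using the two fills from the example: a 35-day fill on 2016-02-01 and a second fill on 2016-03-01. The 35-day supply of the second fill and the variable names are assumptions made only for this illustration.

```r
# Two fills from the example; the second fill's 35-day supply is an assumption
fill_dates  <- as.Date(c("2016-02-01", "2016-03-01"))
days_supply <- c(35, 35)

# Expand each fill into the individual dates it covers
covered_list <- lapply(seq_along(fill_dates), function(i) {
  seq(fill_dates[i], by = "day", length.out = days_supply[i])
})

# Combine the per-fill dates, then drop the days that appear twice
all_covered    <- do.call(c, covered_list)
unique_covered <- unique(all_covered)

length(all_covered)     # 70 day entries in total
length(unique_covered)  # 64 distinct days, since 2016-03-01 to 2016-03-06 overlap
```

The six overlapping days account for the difference between the two counts.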
#Create test data vectors
Person <- c(rep("Person1",12),rep("Person2",9))
FillDate <- c("2016-1-1", "2016-2-1", "2016-3-1", "2016-4-1", "2016-5-1", "2016-6-1", "2016-7-1", "2016-8-1", "2016-9-1", "2016-10-1", "2016-11-1", "2016-12-1", "2016-2-1", "2016-3-1", "2016-4-20", "2016-5-1", "2016-6-1", "2016-7-1", "2016-8-1", "2016-9-1", "2016-10-1")
DaysSupply <- c(rep("35", 14), "20", "5", "20", rep("35", 4))
#Build into data.frame
PDCTestData <- cbind.data.frame(as.factor(Person),as.Date(FillDate,"%Y-%m-%d"),as.numeric(DaysSupply))
colnames(PDCTestData) <- c("Person","FillDate","DaysSupply")
#Create start and end dates for overall period
StartDate <- as.Date("2016-01-01")
EndDate <- as.Date("2016-12-31")
#Initialize DaysCoveredList, a vector to hold the list of dates that a person has drug coverage
DaysCoveredList <- NULL
#Initialize DaysCoveredTable, a matrix to count the total number of unique days in the DaysCovered List, by person
DaysCoveredTable <- NULL
And the second part, which does the actual work:
#Begin looping through individuals
for(p in levels(PDCTestData$Person)){
  #Begin looping through drug fills
  for(DrugSpan in 1:nrow(PDCTestData[PDCTestData$Person == p,])){
    #Create a sequence of the dates covered by that fill: it starts on the fill date and runs for DaysSupply days.
    #Appending each sequence builds a list of all days covered for that person
    DaysCoveredList <- c(DaysCoveredList,
                         seq.Date(from = PDCTestData[PDCTestData$Person == p,][DrugSpan,]$FillDate,
                                  length.out = PDCTestData[PDCTestData$Person == p,][DrugSpan,]$DaysSupply,
                                  by = "day"))
  } #Exit drug fill loop
  #Count the number of unique days covered in DaysCoveredList, within the start and end of the overall period
  DaysCovered <- length(unique(DaysCoveredList[DaysCoveredList >= StartDate & DaysCoveredList <= EndDate]))
  #Add the unique count from DaysCovered to the summary DaysCoveredTable
  DaysCoveredTable <- rbind(DaysCoveredTable,cbind(p,DaysCovered))
  #Clear DaysCovered and DaysCoveredList
  DaysCovered <- NULL
  DaysCoveredList <- NULL
} #Exit the individual loop
Thanks for any help you can provide.
Thanks.
Answer 0 (score: 0)
library(lubridate)
ptd <- PDCTestData # I get bored writing long variable names
ptd$EndDate <- ptd$FillDate + ptd$DaysSupply - 1 # last covered day; the fill date itself is day 1
ptd$DrugInterval <- interval(ptd$FillDate, ptd$EndDate)
all_days <- as.Date(StartDate:EndDate, origin = "1970-01-01")
lapply(unique(ptd$Person), function (y) sum(sapply(all_days, function (x) any(x %within% ptd$DrugInterval[ptd$Person==y]))))
No guarantees on speed, but it may be easier to read.
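If speed does become a concern, a loop-free variant in base R is another option. The following is only a sketch and has not been benchmarked. It rebuilds the question's test data so that it runs on its own, and the names `covered`, `days_covered` and `pdc` are made up for the illustration. It expands every fill into one row per person-day, so memory use grows with the total days' supply.

```r
# Rebuild the question's test data so the sketch is self-contained
PDCTestData <- data.frame(
  Person = factor(c(rep("Person1", 12), rep("Person2", 9))),
  FillDate = as.Date(c(sprintf("2016-%d-1", 1:12),
                       "2016-2-1", "2016-3-1", "2016-4-20",
                       sprintf("2016-%d-1", 5:10)),
                     format = "%Y-%m-%d"),
  DaysSupply = c(rep(35, 14), 20, 5, 20, rep(35, 4))
)
StartDate <- as.Date("2016-01-01")
EndDate   <- as.Date("2016-12-31")

# Work with day numbers (days since 1970-01-01) so plain arithmetic applies
first_day <- as.integer(PDCTestData$FillDate)

# One row per person and covered day; sequence() yields 1..n for each fill
covered <- data.frame(
  Person = rep(PDCTestData$Person, PDCTestData$DaysSupply),
  Day    = rep(first_day, PDCTestData$DaysSupply) +
           sequence(PDCTestData$DaysSupply) - 1
)

# Keep only days inside the overall period, then drop duplicated person-days
in_period <- covered$Day >= as.integer(StartDate) &
             covered$Day <= as.integer(EndDate)
covered <- unique(covered[in_period, ])

# Unique covered days per person, and the proportion of the period covered
days_covered <- tapply(covered$Day, covered$Person, length)
pdc <- days_covered / (as.integer(EndDate - StartDate) + 1)

days_covered  # Person1: 366, Person2: 231
pdc           # Person1: 1, Person2: about 0.63
```

The duplicate removal with `unique()` is what ensures overlapping refills are counted once, which is the same idea as the `unique()` call in the question's loop.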