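I have a large long-format data frame in which each chemical constituent (20+) is listed in the "Analyte" column and the corresponding value is listed in the "Result" column (AnalyteName and Results in the example below). The data frame also has metadata such as sample ID, year, site description, and so on. I want to convert it to a wide data frame in which every constituent in the "Analyte" column becomes its own column, with the corresponding values listed underneath. Some stations (identified by StationID) have data for multiple analytes across multiple years. How can I convert this to a wide data frame and keep the data structure?
I have tried dcast and reshape, but I could not get either to work. The most success I have had is subsetting by analyte and then joining all of the subset files by "StationID" and "Year".
Subsetting and joining is very repetitive, with 40+ analytes to subset and join. I also cannot figure out how to use loops or the apply family to automate it. What is the fastest way to do this?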
#my dataframe looks like this
view(Chemistry)
StationID  Date  AnalyteName  Results  SiteDesc
LA102      2010  Nitrate      0.1      Confined
LA102      2011  Nitrate      0.3      Confined
LA103      2010  Cadmium      0.9      Open
V143       2010  Phosphate    1.51     Confined
V144       2011  Zinc         1.82     Open
LA103      2010  Nitrate      1.42     Open
#I've tried this
library(plyr)  # join() with type = "full" comes from plyr
# one subset per analyte
chem1 <- Chemistry[Chemistry$AnalyteName == "Nitrate", ]
chem2 <- Chemistry[Chemistry$AnalyteName == "Cadmium", ]
# rename the Results column after the analyte
colnames(chem1) <- c("StationID", "Date", "AnalyteName", "Nitrate", "SiteDesc")
colnames(chem2) <- c("StationID", "Date", "AnalyteName", "Cadmium", "SiteDesc")
ChemJoin1 <- join(chem1, chem2, by = c("StationID", "Date"), type = "full")
#I want this:
view(Chemistry)
StationID  Date  Nitrate  Cadmium  Phosphate  Zinc  SiteDesc
LA102      2010  0.1                                Confined
LA102      2011  0.3                                Confined
LA103      2010  1.42     0.9                       Open
V143       2010                    1.51             Confined
V144       2011                               1.82  Open
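
For reference, the target layout above might be reached without per-analyte subsetting. The following is only a minimal sketch using base R's reshape() with direction = "wide", treating StationID, Date, and SiteDesc as identifier columns. The sample rows are re-entered by hand so the snippet runs on its own, `wide` is a name introduced here for illustration, and the sketch assumes each StationID/Date/analyte combination occurs at most once.

```r
# Sample data re-entered from the example above (assumption: column types as shown)
Chemistry <- data.frame(
  StationID   = c("LA102", "LA102", "LA103", "V143", "V144", "LA103"),
  Date        = c(2010, 2011, 2010, 2010, 2011, 2010),
  AnalyteName = c("Nitrate", "Nitrate", "Cadmium", "Phosphate", "Zinc", "Nitrate"),
  Results     = c(0.1, 0.3, 0.9, 1.51, 1.82, 1.42),
  SiteDesc    = c("Confined", "Confined", "Open", "Confined", "Open", "Open"),
  stringsAsFactors = FALSE
)

# Spread AnalyteName into one column per analyte, one row per station/year/site
wide <- reshape(
  Chemistry,
  idvar     = c("StationID", "Date", "SiteDesc"),
  timevar   = "AnalyteName",
  v.names   = "Results",
  direction = "wide"
)

# reshape() names the new columns "Results.<analyte>"; strip that prefix
names(wide) <- sub("^Results\\.", "", names(wide))

print(wide)
```

Analytes that were not measured for a given station and year come out as NA. If tidyr is installed, pivot_wider() with names_from = AnalyteName and values_from = Results should be an equivalent route; either way, duplicate measurements for the same station, year, and analyte would need to be aggregated first.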