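I am trying to get the correlation for each unique pair of states. The columns I have are state, count, year, and week. I want the correlation between the yearly values for each unique pair of states.
My data frame looks like this:
state1 count year week# (correlation column)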
v1 x
v1 x
v1 x
v2 x
v2 x
v3 x
I would like it to look like this:
state1 state2 count year week# (correlation column)
v1 v2 x
v1 v3 x
v1 v4 x
v2 v3 x
v2 v4 x
v3 v4 x
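For reference, the unique-pair layout above can be generated with base R's `combn()`. This is only a minimal sketch: the `states` vector is a hypothetical stand-in for the v1 to v4 placeholders, not the real data.

```r
# Hypothetical state labels standing in for the v1..v4 placeholders above.
states <- c("v1", "v2", "v3", "v4")

# combn() returns one column per unordered pair, so transposing
# gives one row per unique pair (no self-pairs, no mirrored duplicates).
pairs <- as.data.frame(t(combn(states, 2)), stringsAsFactors = FALSE)
names(pairs) <- c("State1", "State2")

print(pairs)
```

With n states this produces n * (n - 1) / 2 rows, which is the number of rows the final result would need per year.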
I would like the final result to look something like this:
State 1 | State 2 | Correlation
Alabama | Alaska | .64
Alabama | Utah | .10
Alabama | Arizona | .20
Alaska | Utah | .59
Alaska | Arizona | .20
Utah | Arizona | .10
I am pulling some disease data from the CDC, which is what chl15 through chl18 are:
# Stack the yearly CDC tables into one data frame
temp <- rbind(chl15, chl16, chl17, chl18)
# Treat missing current-week counts as 0 (column 4 holds the current-week count)
temp[is.na(temp$Chlamydia.trachomatis.infection..Current.week), 4] <- 0
datacor <- NULL
for (i in seq_along(state2)) {
  datacor$Year[i] <- temp$MMWR.Year[i]
  # Both columns are filled from state2[i], so State1 and State2 are always identical
  datacor$State1[i] <- as.character(state2[i])
  datacor$State2[i] <- as.character(state2[i])
  # Each call correlates the first state (state2[1]) with state i for one year
  datacor$corelation2015[i] <- cor(temp[temp$Reporting.Area == state2[1] & temp$MMWR.Year == 2015, 4], temp[temp$Reporting.Area == state2[i] & temp$MMWR.Year == 2015, 4], use = "complete.obs")
  datacor$corelation2016[i] <- cor(temp[temp$Reporting.Area == state2[1] & temp$MMWR.Year == 2016, 4], temp[temp$Reporting.Area == state2[i] & temp$MMWR.Year == 2016, 4], use = "complete.obs")
  datacor$corelation2017[i] <- cor(temp[temp$Reporting.Area == state2[1] & temp$MMWR.Year == 2017, 4], temp[temp$Reporting.Area == state2[i] & temp$MMWR.Year == 2017, 4], use = "complete.obs")
  datacor$corelation2018[i] <- cor(temp[temp$Reporting.Area == state2[1] & temp$MMWR.Year == 2018, 4], temp[temp$Reporting.Area == state2[i] & temp$MMWR.Year == 2018, 4], use = "complete.obs")
  datacor$Latitude[i] <- loc$LATITUDE[i]
  datacor$Longitude[i] <- loc$LONGITUDE[i]
}
datacor <- data.frame(datacor)
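I get columns showing State1 and State2, but they do not contain unique pairs. The result also gives me, for each year, the correlation of the first state with each of the other states rather than the correlation for each unique pair. Please help. Thanks :)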
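One possible direction, sketched on made-up data rather than the real CDC tables: build every unique pair with `combn()`, cross it with the years, and correlate the weekly counts of the two states within each year. The `toy` data frame, its simplified column names (`state`, `year`, `week`, `count`), and the helper `pair_cor()` are all assumptions for illustration, not part of the original code.

```r
# Toy data standing in for the CDC table; column names are simplified assumptions.
set.seed(1)
toy <- expand.grid(week = 1:10, year = c(2015, 2016),
                   state = c("Alabama", "Alaska", "Utah"),
                   stringsAsFactors = FALSE)
toy$count <- rpois(nrow(toy), lambda = 20)

# Hypothetical helper: correlation of weekly counts for one pair in one year.
pair_cor <- function(df, s1, s2, yr) {
  a <- df[df$state == s1 & df$year == yr, ]
  b <- df[df$state == s2 & df$year == yr, ]
  m <- merge(a, b, by = "week")  # align the two series on week number
  cor(m$count.x, m$count.y, use = "complete.obs")
}

# One row per unique pair of states ...
pairs <- t(combn(sort(unique(toy$state)), 2))
pair_df <- data.frame(State1 = pairs[, 1], State2 = pairs[, 2],
                      stringsAsFactors = FALSE)

# ... crossed with every year (merge() with no shared columns is a cross join).
grid <- merge(pair_df, data.frame(Year = sort(unique(toy$year))))

# Fill in the correlation for each (State1, State2, Year) row.
grid$Correlation <- unname(mapply(
  function(s1, s2, yr) pair_cor(toy, s1, s2, yr),
  grid$State1, grid$State2, grid$Year
))

print(grid)
```

If a single correlation per pair is wanted instead of one per year, the `Year` cross join can be dropped and the year filter removed from the helper; which of the two matches the intended "yearly" correlation depends on how the question is read.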