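I am struggling to append new variables to df that are calculated from a subset of that same df.
I have two goals:
The attached data is an example of what the merged dataset looks like, but it contains only 2 SIDs (9003 and 1028).
I need to loop over each unique SID and perform a few relatively simple calculations on some variables in the data frame, then add the results to it as new variables. Each SID should have a single unique value in each of the new columns.
Currently I can do this successfully for variables that do not involve subsetting df. For example, the variables "numPR", "numDef", and "numCoop" come out as they should for each SID (when I comment out the Ingroup part of the loop). However, when I try to subset df and look only at rows that meet a given condition (I am using the subset function), I get this error: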
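Error in df$numDef_IG[j] <- numDef_IG : replacement has length zero
In addition: Warning messages:
1: In if (df$numCoop[i] < 1) { : the condition has length > 1 and only the first element will be used
2: In numDefCoop_IG[j] <- aggregate(x = numIG, by = list(unique.numValues = numIG$Player_move), : number of items to replace is not a multiple of replacement length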
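For reference, here is a minimal sketch of how messages like these can arise on toy values. The vectors `moves`, `v`, `cond`, `empty`, and `msg` are made up for illustration and are not part of my script; this is my best guess at the mechanism, not a confirmed diagnosis.

```r
# Toy values, invented for illustration only.
moves <- c(0, 1, 1)

# `if` expects a single TRUE/FALSE, but a comparison on several rows
# returns one value per row. (Recent R versions, 4.2.0 and later, treat
# this as an error instead of a warning, so it is not run here.)
cond <- moves < 1
length(cond)

# ifelse() returns a result as long as its test, so a zero-length test
# gives a zero-length result.
empty <- ifelse(logical(0), 1, 0)
length(empty)

# Assigning a zero-length value into an existing position fails.
v <- c(10, 20, 30)
msg <- tryCatch({ v[2] <- empty; "no error" },
                error = function(e) conditionMessage(e))
msg
```

If that guess is right, the zero-length value in my script would come from one of the nested ifelse() calls rather than from the assignment itself.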
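I suspect the inner for loop is having trouble accessing the subsetted data, and I also feel there must be a more elegant solution than a series of for loops. I have some additional commented-out variables/columns (starting at about line 86) that are also created from subsets of df.
Any help would be appreciated.
Data: stackEx.csv
Code: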
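To illustrate that suspicion, here is a small sketch on invented data (the data frame `toy` and its values are hypothetical; only the column names come from my dataset). A logical index built on the subset has one entry per row of the subset, so it does not line up with the rows of the full data frame.

```r
# Hypothetical stand-in for df; values are invented.
toy <- data.frame(
  SID          = c(9003, 9003, 1028, 1028),
  Player_move  = c(1, 0, 0, 1),
  OppGroupCode = c(1, 2, 1, 1)
)

# Same kind of subset as numIG in my script: keeps rows 1, 3 and 4 of toy.
sub <- subset(toy, OppGroupCode == 1, select = c("SID", "Player_move"))

j <- sub$SID == 9003   # logical index over the rows of sub, not of toy
length(j)              # one entry per row of sub
nrow(toy)              # toy has more rows than that

# Used on toy, j is recycled and picks up a row from the other SID.
toy$SID[j]
```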
#Loading data
file = "stackEx.csv"
df = read.csv(file, header = T)
numIG = subset(df, OppGroupCode == 1, select = c("SID", "Player_move"))
numDefCoop_IG = NA
for (sid in unique(df$SID)) {
i = df$SID == sid # create a logical index
#Getting # of PRs in game per SID.
numPR = sum(df$PreviewRound[i]) # subset the data based on the index
df$numPR[i] = numPR # assign the values only to those selected rows
#__Creating new variables and adding to dataset----
#____Decisions, Overall----
#_____ defections----
numDef=sum(df$Player_move[i])
df$numDef[i] = numDef
#_____cooperations----
numCoop=length(df$Player_move[i]) - df$numDef[i]
df$numCoop[i] = numCoop
if (df$numCoop[i] < 1){
df$numCoop[i] = 0
}
else df$numCoop[i] = df$numCoop[i]
#_____Ingroup----
#unique.numValues: 0 = cooperation, 1 = defection. Also adding as column to dataset.
for (s in unique(numIG$SID)) {
j = numIG$SID == s # create a logical index
numDefCoop_IG[j] = aggregate(x = numIG, by = list(unique.numValues = numIG$Player_move), FUN = length)
#______defections----
numDef_IG = ifelse((length(numDefCoop_IG$unique.numValues) == 2) & (numDefCoop_IG$unique.numValues[2] == 1), numDefCoop_IG[2,2],
ifelse((length(numDefCoop_IG$unique.numValues)== 1) & (numDefCoop_IG[1] == 1), numDefCoop_IG[1,2], 0)[j])
df$numDef_IG[j]= numDef_IG
#______cooperations----
numCoop_IG= ifelse(numDefCoop_IG$unique.numValues[1] == 0, numDefCoop_IG[1,2], 0)[j]
numCoop_IG = ifelse((length(numDefCoop_IG$unique.numValues) == 2) & (numDefCoop_IG$unique.numValues[1] == 0), numDefCoop_IG[1,2],
ifelse((length(numDefCoop_IG$unique.numValues)== 1) & (numDefCoop_IG[1] == 0), numDefCoop_IG[1,2], 0))[j]
df$numCoop_IG[j]= numCoop_IG
}
}
View(df)
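One loop-free direction I have been considering is base R's ave(), which applies a function within each group and returns a vector as long as its input. The sketch below runs on invented data: `toy` and its values are hypothetical, and it assumes Player_move is coded 0 = cooperation, 1 = defection and that OppGroupCode == 1 marks the rows I want for the Ingroup counts. I have not confirmed that it reproduces everything the loops are meant to do.

```r
# Hypothetical stand-in for stackEx.csv; values are invented.
toy <- data.frame(
  SID          = c(9003, 9003, 9003, 1028, 1028, 1028),
  PreviewRound = c(1, 0, 0, 1, 1, 0),
  Player_move  = c(1, 0, 1, 0, 0, 1),   # 0 = cooperation, 1 = defection
  OppGroupCode = c(1, 1, 2, 1, 2, 2)    # assumed: 1 = ingroup opponent
)

# Per-SID totals, repeated on every row of that SID.
toy$numPR   <- ave(toy$PreviewRound, toy$SID, FUN = sum)
toy$numDef  <- ave(toy$Player_move, toy$SID, FUN = sum)
toy$numCoop <- ave(1 - toy$Player_move, toy$SID, FUN = sum)

# Ingroup counts: instead of subsetting, zero out the rows that do not
# meet the condition, so the result keeps one entry per row of toy.
ig <- toy$OppGroupCode == 1
toy$numDef_IG  <- ave(toy$Player_move * ig, toy$SID, FUN = sum)
toy$numCoop_IG <- ave((1 - toy$Player_move) * ig, toy$SID, FUN = sum)

toy
```

Because the condition is applied as a 0/1 mask rather than through subset(), a SID with no matching rows simply gets a count of 0 and no index ever has to be mapped back from the subset to df.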