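Background: I have a data frame of count data: numbers of insect nymphs counted from digital images, each identified by a "Barcode". For each barcode there should be two counts ("NymphCounts"), one for each of the two levels of "State": alive 'a' and dead 'd'.
Problem: Zero counts were not always entered for State = dead or alive, and I want to add the missing zero rows. How can I check that each barcode appears twice and, when a barcode appears only once, duplicate that observation (the barcode and all other variables), but with a "NymphCounts" of zero and with "State" set to the opposite of the state recorded in the original observation?
I can usually find most solutions in other posts. Many posts come close, but I can't seem to solve the following.
Sample data: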
dput(sampleData)
structure(list(Barcode = structure(c(1L, 1L, 2L, 2L, 3L, 4L,
5L, 6L, 7L, 8L, 9L, 9L, 10L, 10L, 11L, 11L), .Label = c("10308",
"10309", "10310", "10326", "10327", "10328", "10329", "10330",
"10331", "10332", "10335"), class = "factor"), Trip = c(1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), Source =
structure(c(1L,
1L, 2L, 2L, 11L, 3L, 4L, 5L, 6L, 7L, 8L, 8L, 9L, 9L, 10L, 10L
), .Label = c("P8030265.JPG", "P8030266.JPG", "P8040342.JPG",
"P8040343.JPG", "P8040344.JPG", "P8040345.JPG", "P8040346.JPG",
"P8040347.JPG", "P8040348.JPG", "P8040349.JPG", "P8040359.JPG"
), class = "factor"), NymphCounts = c(1L, 1L, 1L, 5L, 7L, 2L,
16L, 1L, 1L, 1L, 4L, 10L, 2L, 1L, 3L, 3L), ND.T = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = "N/A",
class = "factor"),
ND.Z = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L), .Label = "N/A", class = "factor"),
ND.M = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L), .Label = "N/A", class = "factor"),
Phone = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L), .Label = c("M ", "T"), class = "factor"),
State = structure(c(1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 2L, 1L, 2L, 1L, 2L), .Label = c("a", "d"), class = "factor")), .Names =
c("Barcode",
"Trip", "Source", "NymphCounts", "ND.T", "ND.Z", "ND.M", "Phone",
"State"), row.names = c(NA, -16L), class = "data.frame")
I looked for similar questions and, with some help, tried this:
sampled <- sampleData
str(sampled)
for (i in levels(sampled[, 1])) {
  if (sum(sampled[, 1] == i) == 1) {
    temp <- sampled[which(sampled[, 1] == i), ]
    if (temp$State == "a") temp$State = "d"
    if (temp$State == "d") temp$State = "a"
    temp$NymphCounts = 0
    sampled <- rbind(sampled, temp)
  }
}
write.csv(sampled, "R:/Leo/CWF DATA/sampledcor.csv")
sampledcor <- read.csv("R:/Leo/CWF DATA/sampledcor.csv", header = TRUE, sep = ",")
The result is almost what I wanted, but not quite.
BEFORE
sampleData
  Barcode Trip       Source NymphCounts ND.T ND.Z ND.M Phone State
1   10308    1 P8030265.JPG           1  N/A  N/A  N/A     T     a
2   10308    1 P8030265.JPG           1  N/A  N/A  N/A     T     d
3   10309    1 P8030266.JPG           1  N/A  N/A  N/A     T     a
4   10309    1 P8030266.JPG           5  N/A  N/A  N/A     T     d
5   10310    1 P8040359.JPG           7  N/A  N/A  N/A     M     a
6   10326    1 P8040342.JPG           2  N/A  N/A  N/A     M     d
7   10327    1 P8040343.JPG          16  N/A  N/A  N/A     M     a
AFTER
sampledcor
  X Barcode Trip       Source NymphCounts ND.T ND.Z ND.M Phone State
1 1   10308    1 P8030265.JPG           1  N/A  N/A  N/A     T     a
2 2   10308    1 P8030265.JPG           1  N/A  N/A  N/A     T     d
3 3   10309    1 P8030266.JPG           1  N/A  N/A  N/A     T     a
4 4   10309    1 P8030266.JPG           5  N/A  N/A  N/A     T     d
5 5   10310    1 P8040359.JPG           7  N/A  N/A  N/A     M     a
6 6   10326    1 P8040342.JPG           2  N/A  N/A  N/A     M     d
7 7   10327    1 P8040343.JPG          16  N/A  N/A  N/A     M     a
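RESULTS AFTER
If the recorded state is `d` (dead), the code duplicates the barcode row, sets the count to `zero`, and adds the missing alternative state `a`. Great!
6   6   10326    1 P8040342.JPG           2  N/A  N/A  N/A     M     d
18 61   10326    1 P8040342.JPG           0  N/A  N/A  N/A     M     a
However, if the recorded state is `a` (alive), the code duplicates the barcode row and sets the count to `zero`, but it also repeats the state instead of changing it to `d`. Not what I want!
7   7   10327    1 P8040343.JPG          16  N/A  N/A  N/A     M     a
19 71   10327    1 P8040343.JPG           0  N/A  N/A  N/A     M     a
I would appreciate any pointers on how to correct the code so that the added observations get a zero count and also the appropriate state. I also get an extra column variable 'X'; is there a way to control this? Thanks in advance.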
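For reference, a minimal sketch of one possible correction, not a definitive answer. In the loop above, the two `if` statements run one after the other, so a row flipped from `a` to `d` by the first is flipped straight back to `a` by the second; `if ... else`, or a vectorised `ifelse()` as used here, avoids that. The extra `X` column is the row names that `write.csv()` writes by default, and `row.names = FALSE` suppresses them. The small `sampled` data frame, the names `singles`, `added` and `fixed`, and the temporary file are assumptions made for illustration only.

```r
# Hypothetical stand-in for sampleData: only the columns the fix touches
sampled <- data.frame(
  Barcode     = factor(c("10308", "10308", "10326", "10327")),
  NymphCounts = c(1L, 1L, 2L, 16L),
  State       = factor(c("a", "d", "d", "a"), levels = c("a", "d"))
)

# Barcodes that occur exactly once
n_per_barcode <- table(sampled$Barcode)
singles <- names(n_per_barcode)[n_per_barcode == 1]

# Copy those rows, zero the count, and flip the state in a single step
added <- sampled[sampled$Barcode %in% singles, ]
added$NymphCounts <- 0L
added$State <- factor(ifelse(added$State == "a", "d", "a"),
                      levels = levels(sampled$State))

# Append the new rows and sort so each barcode's pair sits together
fixed <- rbind(sampled, added)
fixed <- fixed[order(fixed$Barcode, fixed$State), ]
rownames(fixed) <- NULL

# row.names = FALSE stops write.csv() from writing the row names
# that would otherwise come back from read.csv() as column "X"
out <- tempfile(fileext = ".csv")  # temporary path, for illustration
write.csv(fixed, out, row.names = FALSE)
fixed_back <- read.csv(out)
```

With the real data, `sampled <- sampleData` would replace the stand-in data frame, and all other columns are carried over unchanged because whole rows are copied.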