I have a dataset named full, and one of its columns is Breed, as shown below.
Breed
Shetland Sheepdog Mix
Domestic Shorthair Mix
Pit Bull Mix
Domestic Shorthair Mix
Lhasa Apso/Miniature Poodle
Cairn Terrier/Chihuahua Shorthair
Domestic Shorthair Mix
Domestic Shorthair Mix
American Pit Bull Terrier Mix
Cairn Terrier
Domestic Shorthair Mix
Miniature Schnauzer Mix
Pit Bull Mix
Yorkshire Terrier Mix
Great Pyrenees Mix
Domestic Shorthair Mix
Domestic Shorthair Mix
Pit Bull Mix
Angora Mix
Flat Coat Retriever Mix
Queensland Heeler Mix
Domestic Shorthair Mix
Plott Hound/Boxer
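What I need is the frequency of each unique value in this column.
I have already extracted the breed and its frequency as shown below (the breed column is named BreedType). Then, using an if condition, I need a new column that holds 'F' when a BreedType's frequency is less than 66, and the BreedType value itself when the frequency is greater than 66.
In other words, assign FALSE to breed values whose frequency is below 66.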
df$Breed <- data.frame(full$Breed)
setDT(df)
dt1 <- copy(df)
dt1[, c("Frequency", "TrueFalse") := .(.N, ifelse(.N < 66, "FALSE", Breed)), by = Breed]
dt1<-data.frame(dt1)
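But instead of a result I get this error:
Error in `[.data.table`(dt1, , `:=`(c("Frequency", "TrueFalse"), .(.N, :
  Type of RHS ('integer') must match LHS ('character'). To check and coerce would impact performance too much for the fastest cases. Either change the type of the target column, or coerce the RHS of := yourself (e.g. by using 1L instead of 1)
I have tried several times but cannot get a result. Can someone help?
Similarly, when I use full$Breed instead, the result is not what I expected, although the frequencies are correct: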
df$Breed <- data.frame(full$Breed)
setDT(df)
dt1 <- copy(df)
dt1[, c("Frequency", "TrueFalse") := .(.N, ifelse(.N < 66, "FALSE", full$Breed)), by = full$Breed]
dt1<-data.frame(dt1)
Full<-cbind2(dt1, full)
Can someone help me find out what the problem is?
Answer 0 (score: 0)
You can use dplyr:
library(dplyr)
df %>% group_by(Breed) %>%
  summarize(Frequency = n()) %>%
  mutate(TrueFalse = ifelse(Frequency < 66, "F", as.character(Breed)))
which gives:
Source: local data frame [14 x 3]

   Breed                              Frequency TrueFalse
   <fctr>                             <int>     <chr>
1  American Pit Bull Terrier Mix      4         F
2  Angora Mix                         2         F
3  Cairn Terrier                      4         F
4  Cairn Terrier/Chihuahua Shorthair  4         F
5  Domestic Shorthair Mix             519       Domestic Shorthair Mix
6  Flat Coat Retriever Mix            2         F
7  Great Pyrenees Mix                 4         F
8  Lhasa Apso/Miniature Poodle        4         F
9  Miniature Schnauzer Mix            4         F
10 Pit Bull Mix                       10        F
11 Plott Hound/Boxer                  73        Plott Hound/Boxer
12 Queensland Heeler Mix              2         F
13 Yorkshire Terrier Mix              4         F
14 Shetland Sheepdog Mix              75        Shetland Sheepdog Mix
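The summary above has one row per breed. If the goal is instead a new column on every row of the original data, as the question describes, a grouped mutate() may be closer to what is wanted. This is only a minimal sketch: the data frame toy and the threshold of 3 (standing in for 66) are made up to keep the example short.

```r
library(dplyr)

# Hypothetical toy data: "A" appears 4 times, "B" twice, "C" once.
toy <- data.frame(Breed = c("A", "A", "A", "A", "B", "B", "C"),
                  stringsAsFactors = TRUE)
threshold <- 3  # hypothetical; stands in for 66 in the question

result <- toy %>%
  group_by(Breed) %>%
  mutate(Frequency = n()) %>%   # group size, repeated on every row of the group
  ungroup() %>%
  # as.character() keeps the breed name rather than the factor's integer code
  mutate(TrueFalse = ifelse(Frequency < threshold, "F", as.character(Breed)))
```

Here result keeps all 7 rows of toy; rows for "A" carry the breed name in TrueFalse, while rows for "B" and "C" carry "F".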
The df used above is:
df<-structure(list(Breed = structure(c(14L, 5L, 10L, 5L, 8L, 4L,
5L, 5L, 1L, 3L, 5L, 9L, 10L, 13L, 7L, 5L, 5L, 10L, 2L, 6L, 12L,
5L, 11L, 14L, 5L, 10L, 5L, 8L, 4L, 5L, 5L, 1L, 3L, 5L, 9L, 10L,
13L, 7L, 5L, 5L, 10L, 2L, 6L, 12L, 5L, 11L, 14L, 5L, 10L, 5L,
8L, 4L, 5L, 5L, 1L, 3L, 5L, 9L, 10L, 13L, 7L, 14L, 5L, 10L, 5L,
8L, 4L, 5L, 5L, 1L, 3L, 5L, 9L, 10L, 13L, 7L, 5L, 11L, 14L, 5L,
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L,
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L,
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L,
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L,
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L,
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L,
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L,
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L,
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L,
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L,
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L,
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L,
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L,
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L,
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L,
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L,
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L,
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L,
5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L,
14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L, 5L, 11L, 14L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c(" American Pit Bull Terrier Mix",
" Angora Mix", " Cairn Terrier", " Cairn Terrier/Chihuahua Shorthair",
" Domestic Shorthair Mix", " Flat Coat Retriever Mix", " Great Pyrenees Mix",
" Lhasa Apso/Miniature Poodle", " Miniature Schnauzer Mix", " Pit Bull Mix",
" Plott Hound/Boxer", " Queensland Heeler Mix", " Yorkshire Terrier Mix",
"Shetland Sheepdog Mix"), class = "factor")), .Names = "Breed", class = "data.frame", row.names = c(NA,
-711L))
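As for the data.table error in the question, a likely cause is that Breed is a factor: ifelse() drops the factor class and returns the underlying integer code, so groups below the threshold produce a character value ("FALSE") while groups above it produce an integer, and := refuses to mix the two types in one column. Converting with as.character(), as in the dplyr code above, avoids this. The following is a minimal sketch of the same idea in data.table, again using a made-up table toy and a threshold of 3 in place of 66; it is not the asker's actual data.

```r
library(data.table)

# Hypothetical toy data: "A" appears 4 times, "B" twice, "C" once.
toy <- data.table(Breed = factor(c("A", "A", "A", "A", "B", "B", "C")))
threshold <- 3L  # hypothetical; stands in for 66 in the question

# Step 1: add the per-breed count to every row.
toy[, Frequency := .N, by = Breed]

# Step 2: build the label column in one vectorised pass, converting the
# factor to character first so both branches of ifelse() have the same type.
toy[, TrueFalse := ifelse(Frequency < threshold, "F", as.character(Breed))]
```

Splitting the work into two steps means the second assignment is not grouped, so the new column is created once with a single type.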