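I am trying to count how many times each string in a preset list occurs in my data. My idea was a double loop: iterate over each string, run it against all of the data, and increment a counter whenever the substring occurs.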
statesList <- list("AL","AR","AZ","CA","CO","CT","DE","FL","GA","IA","ID", "IL","IN","KS","KY","LA","MA","MD","ME","MI","MN","MO","MS","MT","NC","ND","NE","NH","NJ","NM","NV","NY", "OH","OK","OR","PA","RI","SC","SD","TN","TX","UT","VA","VT","WA","WI","WV","WY")
statesAmount <- list()
for (state in statesList) {
  x <- 0
  for (values in flu["location"]) {
    if (grepl(state, values))
      x <- x + 1
  }
  statesAmount[[state]] <- x
}
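The problem I'm running into is that the increment variable "x" applies its value to all entries when it changes. Does R have something like a "new" keyword to avoid this, or what is generally the best approach for this situation? Also, can I convert this list to a data frame later?
Edit: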
location
1 Fort Worth, TX
2 Washington, D.C.
3 Boston, MA
4 Annapolis, MD
5 Brooklyn, NY
6 New York, NY
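Answer 0 (score: 0)
Using your location example: the length of grep's result gives the number of occurrences, and sapply can replace the for loops to build the data frame directly: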
# location is the character vector of locations (e.g. flu$location)
statesAmount <- data.frame(
  count = sapply(statesList, function(state) length(grep(state, location))),
  state = unlist(statesList)
)
count state
1 0 AL
2 0 AR
3 0 AZ
4 0 CA
5 0 CO
6 0 CT
7 0 DE
8 0 FL
9 0 GA
10 0 IA
11 0 ID
12 0 IL
13 0 IN
14 0 KS
15 0 KY
16 0 LA
17 1 MA
18 1 MD
19 0 ME
20 0 MI
21 0 MN
22 0 MO
23 0 MS
24 0 MT
25 0 NC
26 0 ND
27 0 NE
28 0 NH
29 0 NJ
30 0 NM
31 0 NV
32 2 NY
33 0 OH
34 0 OK
35 0 OR
36 0 PA
37 0 RI
38 0 SC
39 0 SD
40 0 TN
41 1 TX
42 0 UT
43 0 VA
44 0 VT
45 0 WA
46 0 WI
47 0 WV
48 0 WY
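As for why the loop in the question misbehaves: flu["location"] is a one-column data frame, so the inner for loop runs only once with the whole column as values, and grepl then returns a vector that if cannot use as a single condition (a warning in older R, an error since R 4.2.0). No "new" keyword is needed; resetting x inside the outer loop is already enough. A minimal sketch of a corrected loop follows, assuming the sample rows from the question's edit and a shortened state list (both stand-ins, not the real flu data); statesDf is a hypothetical name for the converted result.

```r
# Hypothetical sample data mirroring the rows shown in the question's edit
flu <- data.frame(
  location = c("Fort Worth, TX", "Washington, D.C.", "Boston, MA",
               "Annapolis, MD", "Brooklyn, NY", "New York, NY"),
  stringsAsFactors = FALSE
)

# Shortened list of state codes, for illustration only
statesList <- c("MA", "MD", "NY", "TX", "WY")

statesAmount <- list()
for (state in statesList) {
  x <- 0  # counter is reset for every state
  # flu$location is the character vector itself, so this loop visits
  # one location string per iteration
  for (value in flu$location) {
    # fixed = TRUE treats the state code as a literal substring
    if (grepl(state, value, fixed = TRUE)) {
      x <- x + 1
    }
  }
  statesAmount[[state]] <- x
}

# Convert the named list to a data frame afterwards
statesDf <- data.frame(
  state = names(statesAmount),
  count = unlist(statesAmount, use.names = FALSE)
)
print(statesDf)
```

This is still a plain substring match, so a state code that happens to appear elsewhere in a location string would also be counted; if that matters for the real data, matching only at the end of the string may be safer.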