I have the following string variables in Stata:
"H" "D" "D" "T" "K" "J" "E" "F" "G" "B" "Z" "L" "N" "Q" "M" "H" "B" "D" "I" "C"
"S" "V" "T" "J" "Q" "I" "Y" "P" "R" "Q" "C" "U" "S" "Q" "U" "X" "G" "Y" "M" "F"
"U" "I" "G" "H" "X" "Q" "K" "T" "D" "Q" "Q" "P" "Q" "W" "J" "F" "C" "K" "Y" "F"
"X" "P" "Y" "Z" "R" "E" "Z" "W" "H" "X" "B" "D" "I" "I" "H" "Y" "N" "H" "N" "Q"
"S" "I" "U" "R" "W" "N" "U" "I" "G" "H" "I" "T" "A" "U" "J" "Z" "G" "W" "A" "L"
"J" "S" "I" "N" "B" "N" "R" "I" "I" "J" "C" "G" "T" "X" "K" "A" "U" "Z" "M" "V"
"Y" "V" "G" "R" "V" "U" "A" "E" "K" "S" "O" "U" "U" "P" "T" "Q" "I" "T" "C" "E"
"Q" "I" "H" "U" "I" "G" "H" "Y" "A" "U" "V" "R" "K" "E" "U" "I" "G" "H" "H" "N"
"U" "C" "S" "B" "K" "F" "U" "P" "I" "V" "D" "G" "F" "F" "A" "R" "H" "I" "Y" "B"
"M" "Z" "G" "E" "Z" "U" "A" "Z" "Y" "W" "H" "Q" "R" "T" "X" "E" "S" "T" "V" "V"
How can I count the number of times the letter sequence UIGH appears in this grid of variables?
Answer 0 (score: 3)
You can use the egen command and its associated function concat() to put all the letters of each row into a single variable:
clear
* Build the variable list hvar1 ... hvar20
forvalues i = 1/20 {
    local varlist `varlist' hvar`i'
}
* Read the grid as 20 one-character string variables
input str1(`varlist')
"H" "D" "D" "T" "K" "J" "E" "F" "G" "B" "Z" "L" "N" "Q" "M" "H" "B" "D" "I" "C"
"S" "V" "T" "J" "Q" "I" "Y" "P" "R" "Q" "C" "U" "S" "Q" "U" "X" "G" "Y" "M" "F"
"U" "I" "G" "H" "X" "Q" "K" "T" "D" "Q" "Q" "P" "Q" "W" "J" "F" "C" "K" "Y" "F"
"X" "P" "Y" "Z" "R" "E" "Z" "W" "H" "X" "B" "D" "I" "I" "H" "Y" "N" "H" "N" "Q"
"S" "I" "U" "R" "W" "N" "U" "I" "G" "H" "I" "T" "A" "U" "J" "Z" "G" "W" "A" "L"
"J" "S" "I" "N" "B" "N" "R" "I" "I" "J" "C" "G" "T" "X" "K" "A" "U" "Z" "M" "V"
"Y" "V" "G" "R" "V" "U" "A" "E" "K" "S" "O" "U" "U" "P" "T" "Q" "I" "T" "C" "E"
"Q" "I" "H" "U" "I" "G" "H" "Y" "A" "U" "V" "R" "K" "E" "U" "I" "G" "H" "H" "N"
"U" "C" "S" "B" "K" "F" "U" "P" "I" "V" "D" "G" "F" "F" "A" "R" "H" "I" "Y" "B"
"M" "Z" "G" "E" "Z" "U" "A" "Z" "Y" "W" "H" "Q" "R" "T" "X" "E" "S" "T" "V" "V"
end
egen varh = concat(hvar*)
list varh
+----------------------+
| varh |
|----------------------|
1. | HDDTKJEFGBZLNQMHBDIC |
2. | SVTJQIYPRQCUSQUXGYMF |
3. | UIGHXQKTDQQPQWJFCKYF |
4. | XPYZREZWHXBDIIHYNHNQ |
5. | SIURWNUIGHITAUJZGWAL |
|----------------------|
6. | JSINBNRIIJCGTXKAUZMV |
7. | YVGRVUAEKSOUUPTQITCE |
8. | QIHUIGHYAUVRKEUIGHHN |
9. | UCSBKFUPIVDGFFARHIYB |
10. | MZGEZUAZYWHQRTXESTVV |
+----------------------+
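Next, combine the length() and subinstr() functions to count how many times the letter sequence occurs in each row. Removing every "UIGH" shortens the string by 4 characters per match, so the difference in length divided by 4 is the count: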
generate hUIGH = (length(varh) - length(subinstr(varh, "UIGH", "", .))) / 4
list varh hUIGH
+------------------------------+
| varh hUIGH |
|------------------------------|
1. | HDDTKJEFGBZLNQMHBDIC 0 |
2. | SVTJQIYPRQCUSQUXGYMF 0 |
3. | UIGHXQKTDQQPQWJFCKYF 1 |
4. | XPYZREZWHXBDIIHYNHNQ 0 |
5. | SIURWNUIGHITAUJZGWAL 1 |
|------------------------------|
6. | JSINBNRIIJCGTXKAUZMV 0 |
7. | YVGRVUAEKSOUUPTQITCE 0 |
8. | QIHUIGHYAUVRKEUIGHHN 2 |
9. | UCSBKFUPIVDGFFARHIYB 0 |
10. | MZGEZUAZYWHQRTXESTVV 0 |
+------------------------------+
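The hard-coded 4 above is just the length of the search string, so the same idea can be written for any pattern. The following is a minimal sketch, assuming varh exists as created above; the local macro pat and the variable hcount are illustrative names that are not part of the original answer. It also sums the per-row counts, since the question asks for a total:

```stata
* Illustrative names: pat (search string) and hcount (per-row count)
local pat "UIGH"
generate hcount = (strlen(varh) - strlen(subinstr(varh, "`pat'", "", .))) / strlen("`pat'")

* Add up the per-row counts to get the horizontal total
quietly summarize hcount
display "Horizontal occurrences of `pat': " r(sum)
```

Note that subinstr() removes non-overlapping matches, so this approach would undercount a pattern that can overlap itself (for example "AA" in "AAA"). UIGH cannot overlap itself, so the counts here are unaffected.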
To do the same in the vertical direction, you first need to transpose the variables.
A little bit of mata magic makes this easy:
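* Copy the 10 x 20 grid into a Mata string matrix
putmata X = (hvar*), replace
* Transpose it to 20 x 10
mata: X = X'
* Bring it back as vvar1-vvar10; force allows the row count to differ from _N
getmata (vvar*) = X, force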
list vvar*
+--------------------------------------------------------------------------------+
| vvar1 vvar2 vvar3 vvar4 vvar5 vvar6 vvar7 vvar8 vvar9 vvar10 |
|--------------------------------------------------------------------------------|
1. | H S U X S J Y Q U M |
2. | D V I P I S V I C Z |
3. | D T G Y U I G H S G |
4. | T J H Z R N R U B E |
5. | K Q X R W B V I K Z |
|--------------------------------------------------------------------------------|
6. | J I Q E N N U G F U |
7. | E Y K Z U R A H U A |
8. | F P T W I I E Y P Z |
9. | G R D H G I K A I Y |
10. | B Q Q X H J S U V W |
|--------------------------------------------------------------------------------|
11. | Z C Q B I C O V D H |
12. | L U P D T G U R G Q |
13. | N S Q I A T U K F R |
14. | Q Q W I U X P E F T |
15. | M U J H J K T U A X |
|--------------------------------------------------------------------------------|
16. | H X F Y Z A Q I R E |
17. | B G C N G U I G H S |
18. | D Y K H W Z T H I T |
19. | I M Y N A M C H Y V |
20. | C F F Q L V E N B V |
+--------------------------------------------------------------------------------+
Finally, simply repeat the process for the new variables:
egen vvar = concat(vvar*)
generate vUIGH = (length(vvar) - length(subinstr(vvar, "UIGH", "", .))) / 4
list vvar vUIGH
+--------------------+
| vvar vUIGH |
|--------------------|
1. | HSUXSJYQUM 0 |
2. | DVIPISVICZ 0 |
3. | DTGYUIGHSG 1 |
4. | TJHZRNRUBE 0 |
5. | KQXRWBVIKZ 0 |
|--------------------|
6. | JIQENNUGFU 0 |
7. | EYKZURAHUA 0 |
8. | FPTWIIEYPZ 0 |
9. | GRDHGIKAIY 0 |
10. | BQQXHJSUVW 0 |
|--------------------|
11. | ZCQBICOVDH 0 |
12. | LUPDTGURGQ 0 |
13. | NSQIATUKFR 0 |
14. | QQWIUXPEFT 0 |
15. | MUJHJKTUAX 0 |
|--------------------|
16. | HXFYZAQIRE 0 |
17. | BGCNGUIGHS 1 |
18. | DYKHWZTHIT 0 |
19. | IMYNAMCHYV 0 |
20. | CFFQLVENBV 0 |
+--------------------+
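To get a single number for the whole grid, the per-row and per-column counts can be summed. This is a small sketch, assuming hUIGH and vUIGH exist as generated above; the local macros htotal and vtotal are illustrative names. It covers only left-to-right and top-to-bottom matches, which is all the steps above compute:

```stata
* Sum the horizontal counts (missing values in the added rows are ignored)
quietly summarize hUIGH
local htotal = r(sum)

* Sum the vertical counts
quietly summarize vUIGH
local vtotal = r(sum)

display "UIGH: `htotal' horizontal + `vtotal' vertical = " `htotal' + `vtotal'
```

Going by the two listings above, the horizontal counts add up to 4 and the vertical counts to 2.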