Counting letter sequences horizontally and vertically

Asked: 2018-10-02 14:27:42

Tags: stata

I have the following string variables in Stata:

"H" "D" "D" "T" "K" "J" "E" "F" "G" "B" "Z" "L" "N" "Q" "M" "H" "B" "D" "I" "C"
"S" "V" "T" "J" "Q" "I" "Y" "P" "R" "Q" "C" "U" "S" "Q" "U" "X" "G" "Y" "M" "F"
"U" "I" "G" "H" "X" "Q" "K" "T" "D" "Q" "Q" "P" "Q" "W" "J" "F" "C" "K" "Y" "F"
"X" "P" "Y" "Z" "R" "E" "Z" "W" "H" "X" "B" "D" "I" "I" "H" "Y" "N" "H" "N" "Q"
"S" "I" "U" "R" "W" "N" "U" "I" "G" "H" "I" "T" "A" "U" "J" "Z" "G" "W" "A" "L"
"J" "S" "I" "N" "B" "N" "R" "I" "I" "J" "C" "G" "T" "X" "K" "A" "U" "Z" "M" "V"
"Y" "V" "G" "R" "V" "U" "A" "E" "K" "S" "O" "U" "U" "P" "T" "Q" "I" "T" "C" "E"
"Q" "I" "H" "U" "I" "G" "H" "Y" "A" "U" "V" "R" "K" "E" "U" "I" "G" "H" "H" "N"
"U" "C" "S" "B" "K" "F" "U" "P" "I" "V" "D" "G" "F" "F" "A" "R" "H" "I" "Y" "B"
"M" "Z" "G" "E" "Z" "U" "A" "Z" "Y" "W" "H" "Q" "R" "T" "X" "E" "S" "T" "V" "V"

How can I count the number of times the letter sequence UIGH appears in this grid of variables?

1 answer:

Answer 0 (score: 3)

You can use the egen command and its associated function concat() to put all the letters of each row into a single variable:

clear

forvalues i = 1 / 20 {
    local varlist `varlist' hvar`i'
}

input str1(`varlist')
"H" "D" "D" "T" "K" "J" "E" "F" "G" "B" "Z" "L" "N" "Q" "M" "H" "B" "D" "I" "C"
"S" "V" "T" "J" "Q" "I" "Y" "P" "R" "Q" "C" "U" "S" "Q" "U" "X" "G" "Y" "M" "F"
"U" "I" "G" "H" "X" "Q" "K" "T" "D" "Q" "Q" "P" "Q" "W" "J" "F" "C" "K" "Y" "F"
"X" "P" "Y" "Z" "R" "E" "Z" "W" "H" "X" "B" "D" "I" "I" "H" "Y" "N" "H" "N" "Q"
"S" "I" "U" "R" "W" "N" "U" "I" "G" "H" "I" "T" "A" "U" "J" "Z" "G" "W" "A" "L"
"J" "S" "I" "N" "B" "N" "R" "I" "I" "J" "C" "G" "T" "X" "K" "A" "U" "Z" "M" "V"
"Y" "V" "G" "R" "V" "U" "A" "E" "K" "S" "O" "U" "U" "P" "T" "Q" "I" "T" "C" "E"
"Q" "I" "H" "U" "I" "G" "H" "Y" "A" "U" "V" "R" "K" "E" "U" "I" "G" "H" "H" "N"
"U" "C" "S" "B" "K" "F" "U" "P" "I" "V" "D" "G" "F" "F" "A" "R" "H" "I" "Y" "B"
"M" "Z" "G" "E" "Z" "U" "A" "Z" "Y" "W" "H" "Q" "R" "T" "X" "E" "S" "T" "V" "V"
end

egen varh = concat(hvar*)
list varh

     +----------------------+
     |                 varh |
     |----------------------|
  1. | HDDTKJEFGBZLNQMHBDIC |
  2. | SVTJQIYPRQCUSQUXGYMF |
  3. | UIGHXQKTDQQPQWJFCKYF |
  4. | XPYZREZWHXBDIIHYNHNQ |
  5. | SIURWNUIGHITAUJZGWAL |
     |----------------------|
  6. | JSINBNRIIJCGTXKAUZMV |
  7. | YVGRVUAEKSOUUPTQITCE |
  8. | QIHUIGHYAUVRKEUIGHHN |
  9. | UCSBKFUPIVDGFFARHIYB |
 10. | MZGEZUAZYWHQRTXESTVV |
     +----------------------+

Then use the length() and subinstr() functions together to count how many times the letter sequence occurs in each row, and list the result:

generate hUIGH = (length(varh) - length(subinstr(varh, "UIGH", "", .))) / 4
list varh hUIGH

     +------------------------------+
     |                 varh   hUIGH |
     |------------------------------|
  1. | HDDTKJEFGBZLNQMHBDIC       0 |
  2. | SVTJQIYPRQCUSQUXGYMF       0 |
  3. | UIGHXQKTDQQPQWJFCKYF       1 |
  4. | XPYZREZWHXBDIIHYNHNQ       0 |
  5. | SIURWNUIGHITAUJZGWAL       1 |
     |------------------------------|
  6. | JSINBNRIIJCGTXKAUZMV       0 |
  7. | YVGRVUAEKSOUUPTQITCE       0 |
  8. | QIHUIGHYAUVRKEUIGHHN       2 |
  9. | UCSBKFUPIVDGFFARHIYB       0 |
 10. | MZGEZUAZYWHQRTXESTVV       0 |
     +------------------------------+
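
The division by 4 works because removing each match shortens the string by exactly the length of the pattern. As a minimal sketch of the same idea for an arbitrary pattern, the snippet below uses a hypothetical local macro `pattern` and a hypothetical variable `n_match` (neither is part of the answer above), applied to three of the rows shown. Note that subinstr() removes non-overlapping matches, which is fine for UIGH but would undercount a self-overlapping pattern such as "AA".

```stata
clear
input str20 varh
"UIGHXQKTDQQPQWJFCKYF"
"QIHUIGHYAUVRKEUIGHHN"
"HDDTKJEFGBZLNQMHBDIC"
end

* pattern and n_match are illustrative names, not from the original answer
local pattern "UIGH"

* Each removed match shortens the string by length(pattern) characters,
* so (length difference) / (pattern length) is the number of matches
generate n_match = (length(varh) - length(subinstr(varh, "`pattern'", "", .))) / length("`pattern'")

list varh n_match
```

With this data, n_match is 1, 2 and 0 for the three rows, matching the corresponding hUIGH values above.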

To do the same vertically, you first need to transpose the variables.

A bit of Mata magic does this easily:

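* Copy the 20 string variables into a Mata matrix X (10 rows x 20 columns)
putmata X = (hvar*), replace
* Transpose X so that it has 20 rows x 10 columns
mata: X = X'
* Write the columns back as vvar1-vvar10; force allows the 20 rows to differ from the 10 observations
getmata (vvar*) = X, force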

list vvar*

     +--------------------------------------------------------------------------------+
     | vvar1   vvar2   vvar3   vvar4   vvar5   vvar6   vvar7   vvar8   vvar9   vvar10 |
     |--------------------------------------------------------------------------------|
  1. |     H       S       U       X       S       J       Y       Q       U        M |
  2. |     D       V       I       P       I       S       V       I       C        Z |
  3. |     D       T       G       Y       U       I       G       H       S        G |
  4. |     T       J       H       Z       R       N       R       U       B        E |
  5. |     K       Q       X       R       W       B       V       I       K        Z |
     |--------------------------------------------------------------------------------|
  6. |     J       I       Q       E       N       N       U       G       F        U |
  7. |     E       Y       K       Z       U       R       A       H       U        A |
  8. |     F       P       T       W       I       I       E       Y       P        Z |
  9. |     G       R       D       H       G       I       K       A       I        Y |
 10. |     B       Q       Q       X       H       J       S       U       V        W |
     |--------------------------------------------------------------------------------|
 11. |     Z       C       Q       B       I       C       O       V       D        H |
 12. |     L       U       P       D       T       G       U       R       G        Q |
 13. |     N       S       Q       I       A       T       U       K       F        R |
 14. |     Q       Q       W       I       U       X       P       E       F        T |
 15. |     M       U       J       H       J       K       T       U       A        X |
     |--------------------------------------------------------------------------------|
 16. |     H       X       F       Y       Z       A       Q       I       R        E |
 17. |     B       G       C       N       G       U       I       G       H        S |
 18. |     D       Y       K       H       W       Z       T       H       I        T |
 19. |     I       M       Y       N       A       M       C       H       Y        V |
 20. |     C       F       F       Q       L       V       E       N       B        V |
     +--------------------------------------------------------------------------------+

Finally, just repeat the procedure on the new variables:

egen vvar = concat(vvar*)
generate vUIGH = (length(vvar) - length(subinstr(vvar, "UIGH", "", .))) / 4

list vvar vUIGH

     +--------------------+
     |       vvar   vUIGH |
     |--------------------|
  1. | HSUXSJYQUM       0 |
  2. | DVIPISVICZ       0 |
  3. | DTGYUIGHSG       1 |
  4. | TJHZRNRUBE       0 |
  5. | KQXRWBVIKZ       0 |
     |--------------------|
  6. | JIQENNUGFU       0 |
  7. | EYKZURAHUA       0 |
  8. | FPTWIIEYPZ       0 |
  9. | GRDHGIKAIY       0 |
 10. | BQQXHJSUVW       0 |
     |--------------------|
 11. | ZCQBICOVDH       0 |
 12. | LUPDTGURGQ       0 |
 13. | NSQIATUKFR       0 |
 14. | QQWIUXPEFT       0 |
 15. | MUJHJKTUAX       0 |
     |--------------------|
 16. | HXFYZAQIRE       0 |
 17. | BGCNGUIGHS       1 |
 18. | DYKHWZTHIT       0 |
 19. | IMYNAMCHYV       0 |
 20. | CFFQLVENBV       0 |
     +--------------------+
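
If a single total for the whole grid is wanted, the two per-row counts can be summed. The following is only a sketch: it re-enters the hUIGH and vUIGH values from the listings above so that it runs on its own, and it assumes hUIGH is missing in observations 11 to 20, which were added when the transposed data came back. The local macro names `htotal` and `vtotal` are illustrative. Like the steps above, it counts only left-to-right and top-to-bottom matches.

```stata
clear
input hUIGH vUIGH
0 0
0 0
1 1
0 0
1 0
0 0
0 0
2 0
0 0
0 0
. 0
. 0
. 0
. 0
. 0
. 0
. 1
. 0
. 0
. 0
end

* summarize skips missing values, so r(sum) covers only the filled-in rows
quietly summarize hUIGH
local htotal = r(sum)

quietly summarize vUIGH
local vtotal = r(sum)

display "horizontal: `htotal'  vertical: `vtotal'  total: " `htotal' + `vtotal'
```

For the grid in the question this gives 4 horizontal and 2 vertical occurrences, 6 in total.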