Selecting a phone number from multiple variables

Time: 2019-04-01 14:10:08

Tags: string stata

I have phone numbers of the following form stored in three variables:

(216) 438-9248     16) 438-9248    (216) 438-924  
    (9 9705104   (935) 970-5104            970-5  
(667) 218-1827   (667) 218-1827       (667)-1827  
(795) 653-6687   (795) 653-6687   (795) 653-6687  
(999) 301-5695    (999) 3015695     999 301-5695  
(585) 802-2542    (585) 82-2542     (585) 802-22  

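How can I select the most complete phone number in each observation?

1 answer:

Answer 0: (score: 1)

You can use the length() and cond() functions to generate a new variable, wanted, that holds the longest of the three values in each observation. This treats the longest string as the most complete one: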

clear
input str15(var1 var2 var3)
"(216) 438-9248" "16) 438-9248" "(216) 438-924"
"(9 9705104" "(935) 970-5104" "970-5"
"(667) 218-1827" "(667) 218-1827" "(667)-1827"
"(795) 653-6687" "(795) 653-6687" "(795) 653-6687"
"(999) 301-5695" "(999) 3015695" "999 301-5695"
"(585) 802-2542" "(585) 82-2542" "(585) 802-22"
end

* Keep the longer of var1 and var2 (var2 wins ties), then compare the result with var3
generate var4 = cond(length(var1) > length(var2), var1, var2)
generate wanted = cond(length(var4) > length(var3), var4, var3)
drop var4

list, separator(0)

     +-------------------------------------------------------------------+
     |           var1             var2             var3           wanted |
     |-------------------------------------------------------------------|
  1. | (216) 438-9248     16) 438-9248    (216) 438-924   (216) 438-9248 |
  2. |     (9 9705104   (935) 970-5104            970-5   (935) 970-5104 |
  3. | (667) 218-1827   (667) 218-1827       (667)-1827   (667) 218-1827 |
  4. | (795) 653-6687   (795) 653-6687   (795) 653-6687   (795) 653-6687 |
  5. | (999) 301-5695    (999) 3015695     999 301-5695   (999) 301-5695 |
  6. | (585) 802-2542    (585) 82-2542     (585) 802-22   (585) 802-2542 |
     +-------------------------------------------------------------------+
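
If there are more than three variables, nesting cond() calls gets awkward. The same idea can be sketched as a loop that keeps the longest value seen so far. This is only an illustration under the same assumption that the longest string is the most complete one; the variable name wanted2 is hypothetical, and on ties the loop keeps the earlier variable rather than the later one:

```stata
clear
input str15(var1 var2 var3)
"(216) 438-9248" "16) 438-9248" "(216) 438-924"
"(9 9705104" "(935) 970-5104" "970-5"
"(667) 218-1827" "(667) 218-1827" "(667)-1827"
"(795) 653-6687" "(795) 653-6687" "(795) 653-6687"
"(999) 301-5695" "(999) 3015695" "999 301-5695"
"(585) 802-2542" "(585) 82-2542" "(585) 802-22"
end

* Start from an empty string; Stata widens the string type as needed on replace
generate wanted2 = ""

* Overwrite only when the current variable is strictly longer than the best so far
foreach v of varlist var1-var3 {
    replace wanted2 = `v' if strlen(`v') > strlen(wanted2)
}

list, separator(0)
```

To cover additional variables, only the varlist in the foreach line would need to change.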