Stata: skip code if all observations are unique

Time: 2016-01-07 14:14:22

Tags: stata

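I have a dataset that tells me how many referrals each general practitioner (GP) made to each hospital.

If at least one GP in the data refers patients to two (or more) different hospitals, I want to run some additional code; otherwise I don't.

I am using this code: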

set more off
gsort GP -referrals
by code: gen nvals = _n ==1
generate obs = _N

if nvals != obs {
display "different number of unique observations as total observations-therefore I will run additional code here"
continue
}
display "same number of unique observations as total observations-therefore for this loop I don't wish to run additional code"

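At the moment this doesn't seem to work.

Can someone help me develop this code? That is, if the total number of observations equals the total number of unique observations, I know I can skip the next piece of code, which is what I currently have as: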

display "different number of unique observations as total observations-therefore I will run additional code here"

3 answers:

Answer 0: (score: 1)

A simple solution is to use capture combined with isid. For example, the auto dataset is uniquely identified by the make variable, but we can generate a non-unique manufacturer variable to illustrate the idea.
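The idea above can be sketched as follows. This is an illustrative reconstruction, not necessarily the answerer's exact code: it assumes the auto dataset shipped with Stata, and manufacturer is a hypothetical variable built here as the first word of make.

```stata
* Sketch only: manufacturer is a hypothetical variable built for illustration
sysuse auto, clear

* make uniquely identifies the observations, so isid succeeds and _rc is 0
capture isid make
display _rc

* The first word of make (e.g. "AMC", "Buick") repeats across observations
generate manufacturer = word(make, 1)

* isid now fails; capture suppresses the error and leaves a nonzero code in _rc
capture isid manufacturer
if _rc != 0 {
    display "manufacturer does not uniquely identify observations: run additional code here"
}
else {
    display "all values are unique: skip the additional code"
}
```

For the data in the question, the same pattern would presumably be capture isid GP, assuming one observation per GP and hospital pair, since a GP who refers to two or more hospitals then appears more than once.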

Answer 1: (score: 1)

There are two related issues here. I will take them separately:

1. The if command (as distinct from the if qualifier)

This construct

if nvals != obs { 
...
} 

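can be a major source of bugs in Stata, and it often bites people who are used to it being interpreted in a particular way in other software.

If the two items being compared are scalars or macros, all is well. (If the two items cannot be compared at all, Stata will complain, but that is merely puzzling and not the problem here.)

If both items are variables, problems can arise, as is the case in both your question and your answer. Stata does not treat this construct as an implicit loop in which the decision is made repeatedly, once for each observation. Instead, Stata always interprets it as (in this example)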

if nvals[1] != obs[1] { 
...
} 

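So, in general, Stata looks at the values of the variables in the first observation, and in that observation only. If the variables are in fact constant across observations, all will be well; otherwise the code will run as legal, but it may give answers that are at least puzzling and often wrong.

This pitfall is the subject of an FAQ that can be seen here.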
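A minimal sketch of the pitfall, using made-up data in which two hypothetical variables x and y agree in the first observation but differ in the second:

```stata
* Sketch with made-up data: x and y differ only in the second observation
clear
input x y
1 1
1 2
end

* The if command evaluates x[1] != y[1], which is false, so nothing is displayed
if x != y {
    display "x and y differ"
}

* The if qualifier, by contrast, is evaluated observation by observation
count if x != y
display r(N)
```

The if command stays silent even though the variables differ, while the if qualifier on count picks up the differing observation.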

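2. Distinct values

The other issue is that, except in an extreme case, the code in the question will not produce the number of distinct (you say "unique") values. You don't give a reproducible example, but in any dataset with a variable code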

by code: gen nvals = _n ==1
generate obs = _N

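will produce a variable nvals that is 1 or 0 and another variable obs that contains the number of observations. The two will be equal if and only if there is just one observation in the entire dataset; in any other case the calculation says nothing about distinct values. You may have realised that already, but people interested in the thread should be interested in the logic.

The code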

by code: gen nvals = _n ==1
generate firststep = sum(nvals)
egen unique = max(firststep)

would count the number of distinct values of code. Since that count is a single number (a scalar), a simpler approach might be

by code: gen nvals = _n ==1
count if nvals == 1 
scalar unique = r(N) 

which avoids creating the two extra variables firststep and unique. The variable obs is also redundant, because the if statement could just be

if unique != _N 

For a review of this problem, including comments on the term "unique", see this paper. If you are interested in code for this in Stata, search distinct will find the latest version.
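The scalar approach can be sketched end to end as follows. The data are made up, and the variable names GP, hospital and first and the scalar name ndistinct are assumptions for illustration, not taken from the asker's dataset.

```stata
* Sketch with made-up data; GP, hospital, first and ndistinct are assumed names
clear
input GP hospital referrals
1 1 10
1 2 5
2 1 7
3 3 2
end

* Flag the first observation within each GP, then count the flags
bysort GP: generate byte first = _n == 1
count if first
scalar ndistinct = r(N)

* A scalar and _N are single values, so the if command behaves as expected
if ndistinct != _N {
    display "at least one GP appears more than once: run additional code here"
}
else {
    display "every GP appears once: skip the additional code"
}
```

Giving the scalar a name that no variable shares avoids ambiguity, because Stata gives variables precedence over scalars when a name refers to both.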

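1. and 2. together

It should now be clear that the if command in your own answer will work as intended, because the variables in question are constant by construction.

Answer 2: (score: 0)

I found a solution after playing around some more.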

set more off
gsort GP -referrals
by code: gen nvals = _n ==1
generate firststep = sum(nvals)
egen unique = max(firststep)
generate obs = _N

if unique != obs {
display "different number of unique observations as total observations-therefore I will run additional code here"
continue
}
display "same number of unique observations as total observations-therefore for this loop I don't wish to run additional code"

It seems to be working.