Question

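This is mostly a logic question.

I am trying to identify patterns of medication use in a group of people. My first step is to find "continuous users" of 4 drugs. I define continuous use as repeated prescriptions of all 4 drugs after the initial prescription of the 4th drug.

Some people become continuous users of 4 drugs as soon as they start their fourth medication. I find the fourth drug (the fourth drugs of interest to me are B, Q, S and T), and then I check whether the person goes on to take all 4 drugs: A + C + D + the fourth drug. This is my approach for the 4-drug case (mini dataset below, labelled "4drug users"):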

bys id: gen interest=0  
by id: replace interest =1 if  (agent_type == "T" | agent_type =="Q" | agent_type =="S" | agent_type =="B") ///
    &  con_4_4 ==1 & count==4
by id: egen interest4=max(interest)   //notes: this variable tells me if the person has a 4th drug of interest to me; drug B, Q, S or T

gen acd_4_1=0
by id: replace acd_4_1 =1 if  (agent_type == "A"| agent_type =="C" | agent_type=="D") & count==1

gen acd_4_2=0
by id: replace acd_4_2 =1 if  (agent_type == "A"| agent_type =="C" | agent_type=="D") & count==2

gen acd_4_3=0
by id: replace acd_4_3 =1 if  (agent_type == "A"| agent_type =="C" | agent_type=="D") & count==3

 by id: egen acd_4_11 =max(acd_4_1)
 by id: egen acd_4_22 =max(acd_4_2)
 by id: egen acd_4_33 =max(acd_4_3)

 gen acd_4=1 if acd_4_11 ==1 & acd_4_22 ==1 & acd_4_33 ==1 & interest4==1  //acd_4 is a variable indicating whether people had the desired pattern after initiating their 4th agent

*notes: 
*kate has acd_4 = . because she used a prohibited drug "Q" and also her 4th drug was not of interest to us (was "A" as opposed to T, Q, S or B)
*mark has acd_4==1 because he used the correct pattern A+C+D after the prescription of his 4th drug which was S (count=4, date 5th October 2000)

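Now it gets trickier. Other people, who may be switching or stopping drugs, may not become continuous users of 4 drugs until their 5th or 6th medication. For example, only after their 5th medication do they have repeated prescriptions of A + C + D plus medication 5, which in that case is the drug of interest (again, it would be B, Q, S or T).

If they have an additional drug from B, Q, S, T on top of their drug of interest and the pattern of interest, I want to flag this, because I want to exclude that person's pattern from further consideration. For example, I want med5 + A + C + D but not med5 + A + C + D + S.

I have worked out a way to do this (mini dataset below, labelled "5drug users"), but my code is clunky and takes a long time to run on my large dataset. Can anyone suggest how to 1) improve my logic, 2) improve my coding, or 3) both!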

gen interest5=0
 bys id: replace interest5 =1 if  (agent_type == "T" | agent_type =="Q" | agent_type =="S" | agent_type =="B") ///
    & con_5_5 ==1 & count==5
 by id: egen interest55 = max(interest5)
 drop interest5
 ren interest55 interest5


by id: gen A5=1 if (agent_type =="A") & (rx_date >fifth_con_full & rx_date <=fifth_con_full+180) & interest5==1
by id: egen AA55=max(A5)
drop A5

by id: gen C5=1 if (agent_type =="C") & (rx_date >fifth_con_full & rx_date <=fifth_con_full+180) & interest5==1
by id: egen C55=max(C5)
drop C5

by id: gen D5=1 if (agent_type =="D") & (rx_date >fifth_con_full & rx_date <=fifth_con_full+180) & interest5==1
by id: egen D55=max(D5)
drop D5

by id: gen acd_5=1 if (AA55==1 & C55==1 & D55==1) & interest5==1

*make sure patient isn't taking any of the other comparator agents
 by id: gen prohib=1 if (agent_type == "T" | agent_type =="Q" | agent_type =="S" | agent_type =="B") ///
        & (rx_date >fifth_con_full & rx_date <=fifth_con_full+180) & interest5==1 & count!=5        //count!=5 makes Stata flag any comparator agent other than the comparator agent of interest, which here is the one with count==5
by id: egen prohib55=max(prohib)

by id: gen pattern=1 if acd_5 ==1 & prohib55 !=1

*notes: 
*mary has pattern = . because she used a prohibited drug "B" after the prescription of her 4th agent (here count=5, agent_type=="T", starting on 29th July 05)
*pat has pattern = 1 because he used A+C+D after his 4th agent (here count=5, agent_type=="B", starting on 28th Jan 09)
*sue has pattern = . because she used a prohibited drug "T" after the prescription of her 4th agent (here count=5, agent_type=="B", starting on 25th Feb 2011)

Datasets

4drug users

* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 id int rx_date str1 agent_type byte count int fourth_full byte con_4_4 int fourth_con_full
"kate" 16728 "Q" 1     . 1 16733
"kate" 16728 "C" 3     . 1 16733
"kate" 16733 "A" 4 16733 1 16733
"kate" 16758 "B" 2 16733 1 16733
"kate" 16758 "Q" 1 16733 1 16733
"kate" 16758 "C" 3 16733 1 16733
"kate" 16762 "A" 4 16733 1 16733
"kate" 16784 "C" 3 16733 1 16733
"kate" 16784 "A" 4 16733 1 16733
"kate" 16784 "Q" 1 16733 1 16733
"kate" 16784 "B" 2 16733 1 16733
"kate" 16812 "Q" 1 16733 1 16733
"kate" 16812 "B" 2 16733 1 16733
"kate" 16812 "A" 4 16733 1 16733
"kate" 16812 "C" 3 16733 1 16733
"kate" 16841 "Q" 1 16733 1 16733
"kate" 16841 "C" 3 16733 1 16733
"kate" 16841 "B" 2 16733 1 16733
"mark" 14874 "C" 2     . 1 14888
"mark" 14874 "A" 1     . 1 14888
"mark" 14888 "S" 4 14888 1 14888
"mark" 14888 "D" 3 14888 1 14888
"mark" 14930 "S" 4 14888 1 14888
"mark" 14930 "C" 2 14888 1 14888
"mark" 14930 "A" 1 14888 1 14888
"mark" 14930 "D" 3 14888 1 14888
"mark" 14965 "S" 4 14888 1 14888
"mark" 14965 "A" 1 14888 1 14888
"mark" 14965 "D" 3 14888 1 14888
"mark" 14965 "C" 2 14888 1 14888
"mark" 15028 "S" 4 14888 1 14888
"mark" 15028 "C" 2 14888 1 14888
"mark" 15028 "A" 1 14888 1 14888
"mark" 15028 "D" 3 14888 1 14888
"mark" 15097 "C" 2 14888 1 14888
"mark" 15097 "A" 1 14888 1 14888
"mark" 15097 "D" 3 14888 1 14888
"mark" 15097 "S" 4 14888 1 14888
end
format %tddd-Mon-YY rx_date
format %tddd-Mon-YY fourth_full
format %tddd-Mon-YY fourth_con_full

5drug users

* Example generated by -dataex-. To install: ssc install dataex
clear
input str4 id int rx_date str1 agent_type byte count int fifth_full byte con_5_5 int fifth_con_full
"pat"  17910 "D" 1     . 1 17925
"pat"  17910 "A" 4     . 1 17925
"pat"  17910 "C" 2     . 1 17925
"pat"  17925 "B" 5 17925 1 17925
"pat"  17948 "B" 5 17925 1 17925
"pat"  17969 "C" 2 17925 1 17925
"pat"  17969 "B" 5 17925 1 17925
"pat"  17969 "D" 1 17925 1 17925
"pat"  17969 "A" 4 17925 1 17925
"pat"  18028 "D" 1 17925 1 17925
"pat"  18028 "B" 5 17925 1 17925
"pat"  18028 "C" 2 17925 1 17925
"pat"  18028 "A" 4 17925 1 17925
"pat"  18081 "D" 1 17925 1 17925
"pat"  18081 "C" 2 17925 1 17925
"mary" 16618 "C" 2     . 1 16646
"mary" 16618 "D" 3     . 1 16646
"mary" 16618 "B" 1     . 1 16646
"mary" 16646 "T" 5 16646 1 16646
"mary" 16679 "A" 4 16646 1 16646
"mary" 16679 "C" 2 16646 1 16646
"mary" 16679 "D" 3 16646 1 16646
"mary" 16679 "B" 1 16646 1 16646
"mary" 16681 "T" 5 16646 1 16646
"mary" 16737 "D" 3 16646 1 16646
"mary" 16737 "B" 1 16646 1 16646
"mary" 16737 "A" 4 16646 1 16646
"sue"  18676 "D" 3     . 1 18683
"sue"  18676 "C" 2     . 1 18683
"sue"  18676 "T" 4     . 1 18683
"sue"  18683 "B" 5 18683 1 18683
"sue"  18729 "C" 2 18683 1 18683
"sue"  18729 "B" 5 18683 1 18683
"sue"  18729 "T" 4 18683 1 18683
"sue"  18729 "D" 3 18683 1 18683
"sue"  18730 "C" 2 18683 1 18683
"sue"  18779 "C" 2 18683 1 18683
"sue"  18779 "T" 4 18683 1 18683
"sue"  18779 "D" 3 18683 1 18683
"sue"  18826 "A" 1 18683 1 18683
"sue"  18834 "C" 2 18683 1 18683
"sue"  18834 "T" 4 18683 1 18683
"sue"  18834 "D" 3 18683 1 18683
"sue"  18889 "D" 3 18683 1 18683
end
format %tddd-Mon-YY rx_date
format %tddd-Mon-YY fifth_full
format %tddd-Mon-YY fifth_con_full

Answer 1

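This is not an answer, but it won't fit in a comment. Your code is clear but can be condensed. For example, the first block can be simplified to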

gen interest = inlist(agent_type, "T", "Q", "S", "B") &  con_4_4 ==1 & count==4
bysort id: egen interest4 = max(interest)     

gen acd_4_1 = inlist(agent_type, "A", "C", "D") & count==1
gen acd_4_2 = inlist(agent_type, "A", "C", "D") & count==2
gen acd_4_3 = inlist(agent_type, "A", "C", "D") & count==3

by id: egen acd_4_11 = max(acd_4_1)
by id: egen acd_4_22 = max(acd_4_2)
by id: egen acd_4_33 = max(acd_4_3)

gen acd_4= 1 if acd_4_11 ==1 & acd_4_22 ==1 & acd_4_33 ==1 & interest4==1

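That takes 13 statements down to 9.

This is only cosmetic, but most of all you want your real question to be clear and to get answered.

The small techniques used there include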

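Omitting by: where it has no effect on the results.
Boiling down generate and replace pairs so that a 0/1 variable is produced in a single statement.
Using inlist() to capture alternatives concisely.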
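As a quick check of the second and third points, here is a minimal sketch on a tiny made-up dataset. The five rows and the names flag_two and flag_one are illustrative assumptions, not part of your data. It builds the same indicator both ways and asserts that they agree:

```stata
clear
input str1 agent_type byte count
"A" 1
"Q" 4
"C" 3
"T" 4
"B" 2
end

* two-step version, in the style of the question: start at 0, then replace
gen flag_two = 0
replace flag_two = 1 if (agent_type == "T" | agent_type == "Q" | agent_type == "S" | agent_type == "B") & count == 4

* one-step version: a true-or-false expression evaluates to 1 or 0
gen flag_one = inlist(agent_type, "T", "Q", "S", "B") & count == 4

* stops with an error if the two versions ever differ
assert flag_two == flag_one
```

Neither statement needs by id: because each row is evaluated on its own; the prefix only matters once a group-level function such as egen's max() comes in.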

Rewriting the question more simply would make it more likely that people try to solve your real problem.
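The same techniques might also shorten the 5-drug block. What follows is only a sketch under assumptions, not a tested replacement: it uses a handful of rows taken from the 5drug users listing, the names inwin, hasA, hasC, hasD and prohib55 are ones I made up, and pattern comes out as 0/1 rather than 1/missing as in the question's code.

```stata
clear
input str4 id int rx_date str1 agent_type byte count byte con_5_5 int fifth_con_full
"pat"  17925 "B" 5 1 17925
"pat"  17969 "C" 2 1 17925
"pat"  17969 "D" 1 1 17925
"pat"  17969 "A" 4 1 17925
"mary" 16646 "T" 5 1 16646
"mary" 16679 "A" 4 1 16646
"mary" 16679 "C" 2 1 16646
"mary" 16679 "D" 3 1 16646
"mary" 16679 "B" 1 1 16646
"sue"  18683 "B" 5 1 18683
"sue"  18729 "C" 2 1 18683
"sue"  18729 "T" 4 1 18683
"sue"  18729 "D" 3 1 18683
"sue"  18826 "A" 1 1 18683
end

* 1 if the prescription falls in the 180 days after fifth_con_full
gen byte inwin = rx_date > fifth_con_full & rx_date <= fifth_con_full + 180

* does the person have a 5th agent of interest (B, Q, S or T)?
gen byte interest = inlist(agent_type, "T", "Q", "S", "B") & con_5_5 == 1 & count == 5
bysort id: egen byte interest5 = max(interest)

* one person-level flag per required drug, replacing three copied blocks
foreach d in A C D {
    by id: egen byte has`d' = max(agent_type == "`d'" & inwin)
}

* any comparator agent in the window other than the agent of interest
by id: egen byte prohib55 = max(inlist(agent_type, "T", "Q", "S", "B") & inwin & count != 5)

gen byte pattern = hasA & hasC & hasD & interest5 & !prohib55

* on these rows only pat should qualify, matching the notes in the question
assert pattern == (id == "pat")
```

Computing the window once and passing expressions straight to egen's max() removes the intermediate generate, drop and rename steps, so fewer statements run over the full data. Whether that is noticeably faster on a large dataset is something you would need to time yourself.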
