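I have a file that records changes in marital status: ID, type of change (marriage, divorce, widowed) and the year (and month) of the change. I want to compute the marital status of each person for each year (married, divorced, widowed, never been married). Since a person can have many changes and my file has about 20 million rows, I would like to skip to the next person as soon as I find my answer, rather than keep reading all of that person's other records.
I was thinking of sorting by ID and descending date of change, then using SET with BY ID. For each ID, if the year I am interested in is greater than (or equal to) the year of the change, compute the marital status and output the ID and marital status. If not, go on to the next record until the condition is met. If no record meets the condition, then marital status = never been married.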
data a;
length type_change $10;
input ID type_change yr_change mnth_change;
cards;
1 marriage 2006 9
1 divorce 2010 5
10 marriage 2005 2
10 divorce 2012 10
10 marriage 2016 8
23 marriage 2017 6
35 marriage 2002 7
35 widow 2013 12
;
run;
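For 2015, I would like to get:
ID marital_status
1 divorced
10 divorced
23 never been married
35 widowed
Thanks!
Answer 0 (score: 1)
Build one row per ID and year, then use a RETAIN statement to carry the status forward.
Extract all IDs: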
proc sort data=a out=ids(keep= id) nodupkey ;
by id;
run;
Generate all years (2001 to 2020) for every ID:
data years;
set ids;
must_be_date=2000;
do i = 1 to 20;
must_be_date+1;
output;
end;
drop i;
run;
Join on ID and year:
proc sql;
create table res as
select *
from years left join a on years.must_be_date = a.yr_change and a.id = years.id
;
quit;
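/* Name the data set explicitly; mnth_change orders several changes within one year */
proc sort data=res;
by id must_be_date mnth_change;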
run;
Use RETAIN:
data res;
retain temp "never been married";
set res;
by id must_be_date;
if first.id then temp="never been married";
if type_change="" then type_change = temp;
else temp=type_change;
run;
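As the result table further down shows for ID 23, filling type_change truncates the text to "never been", because type_change is declared as $10 in the question's data. One way around that, sketched here, is to leave type_change alone and carry the status in a separate, wider variable. This is meant to run instead of the RETAIN step above, on the sorted join output; res_status and status are names introduced for illustration only.

```sas
/* Assumes res is the sorted join output and has not yet been overwritten */
data res_status;
  length status $20;            /* wide enough for "never been married" */
  retain status;                /* keep the value from row to row */
  set res;
  by id must_be_date;
  if first.id then status = "never been married";   /* reset for each person */
  if type_change ne "" then status = type_change;   /* a change happened this year */
  keep id must_be_date status;
run;
```

Filtering res_status with where must_be_date=2015 then gives one untruncated status per ID.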
Check:
data res_2015;
set res;
where must_be_date=2015;
run;
Result table:
+--------------------+----+--------------+-------------+-----------+-------------+
| temp | ID | must_be_date | type_change | yr_change | mnth_change |
+--------------------+----+--------------+-------------+-----------+-------------+
| divorce | 1 | 2015 | divorce | . | . |
| divorce | 10 | 2015 | divorce | . | . |
| never been married | 23 | 2015 | never been | . | . |
| widow | 35 | 2015 | widow | . | . |
+--------------------+----+--------------+-------------+-----------+-------------+
Answer 1 (score: 1)
from tkinter import *
import tkinter.messagebox


def save_data():
    """Append the form contents to deliveries.txt, then clear the form."""
    try:
        # The with-block closes the file even if a write fails
        with open("deliveries.txt", "a") as fileD:
            fileD.write("TYPE:\n")
            fileD.write("%s\n" % TYPE.get())
            fileD.write("Description:\n")
            fileD.write("%s\n" % description.get())
            fileD.write("SERIAL_NUMBER:\n")
            fileD.write("%s\n" % SERIAL_NUMBER.get())
            fileD.write("FOLIO_NUMBER:\n")
            fileD.write("%s\n" % FOLIO_NUMBER.get())
            fileD.write("OTHER_DETAILS:\n")
            fileD.write("%s\n" % OTHER_DETAILS.get("1.0", END))
        # Reset the form for the next entry
        TYPE.set("-SELECT-")
        description.delete(0, END)
        SERIAL_NUMBER.delete(0, END)
        FOLIO_NUMBER.delete(0, END)
        OTHER_DETAILS.delete("1.0", END)
    except Exception as ex:
        # Show a dialog box when saving fails
        tkinter.messagebox.showerror("Error!", "Can't write to the file\n %s" % ex)


def show_data():
    """Display the saved records (a button command must be a callable, not an open file)."""
    try:
        with open("deliveries.txt", "r") as fileD:
            tkinter.messagebox.showinfo("Saved data", fileD.read())
    except OSError as ex:
        tkinter.messagebox.showerror("Error!", "Can't read the file\n %s" % ex)


app = Tk()
app.title('DATA STORE AND SEARCH ENGINE')
app.iconbitmap(r'C:\Users\Carson1\Desktop\iconbb.ico')
app.configure(bg="indigo")

Label(app, text="TYPE:", bg="blue").pack()
TYPE = StringVar()
TYPE.set("-SELECT-")
OptionMenu(app, TYPE, "Title Deeds", "Cause Lists", "Case Files", "File Records", "Other Documents").pack()

Label(app, text="DESCRIPTION", bg="blue").pack()
description = Entry(app)
description.pack()
description.configure(bg="cyan")

Label(app, text="SERIAL NUMBER", bg="blue").pack()
SERIAL_NUMBER = Entry(app)
SERIAL_NUMBER.pack()
SERIAL_NUMBER.configure(bg="cyan")

Label(app, text="FOLIO NUMBER", bg="blue").pack()
FOLIO_NUMBER = Entry(app)
FOLIO_NUMBER.pack()
FOLIO_NUMBER.configure(bg="cyan")

Label(app, text="OTHER DETAILS:", bg="blue").pack()
OTHER_DETAILS = Text(app)
OTHER_DETAILS.pack()
OTHER_DETAILS.configure(bg="white")

Button(app, text="Save", command=save_data, bg="green").pack()
Button(app, text="SEARCH FILE DATA", command=show_data).pack()
Button(app, text="QUIT", command=app.destroy, bg="red", height=1, width=20).pack(side="right", padx=10, pady=5)

app.mainloop()
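Answer 2 (score: 1)
You cannot "skip" observations if by skip you mean not read them at all. But you can ignore them by using an IF statement (or other conditional logic).
Using RETAIN and BY-group processing should give you the answer. The data must be sorted by ID and date of change, as in the sample.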
%let year=2015;
data want ;
set a ;
by id yr_change mnth_change ;
length status $20;
retain status ;
if first.id then status='never been married ';
if yr_change <= &year then status=type_change ;
if last.id;
keep id status;
run;
Results:
Obs ID status
1 1 divorce
2 10 divorce
3 23 never been married
4 35 widow
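The step above answers for a single year, while the question asks for the status in each year. The same RETAIN and BY-group idea can emit one row per ID and year in a single pass over the data. What follows is only a sketch, assuming a is sorted by id, yr_change and mnth_change; the macro variables first_year and last_year, the data set want_all_years and the variable status_year are names introduced here for illustration.

```sas
%let first_year = 2001;   /* assumed reporting range */
%let last_year  = 2020;

data want_all_years (keep=id status_year status);
  set a;
  by id yr_change mnth_change;
  length status $20;
  retain status status_year;
  if first.id then do;
    status = 'never been married';
    status_year = &first_year;
  end;
  /* Years strictly before this change still carry the previous status */
  do while (status_year < yr_change and status_year <= &last_year);
    output;
    status_year = status_year + 1;
  end;
  status = type_change;
  /* After the person's last change, the status holds through the final year */
  if last.id then do while (status_year <= &last_year);
    output;
    status_year = status_year + 1;
  end;
run;
```

Each row holds the status at the end of status_year, so when a person has two changes in the same year the later one is reported. Subsetting with where status_year=2015 should reproduce the listing above.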
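If you have access to a master list of IDs, you can switch to a WHERE clause (here the WHERE= data set option), which can reduce the I/O needed because change records after the cutoff year are filtered out as they are read. For example, merge the ID list with the subset of marital status change records.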
data want;
merge id_list a(in=in2 where=(yr_change <= &year));
by id;
length status $20;
retain status ;
if first.id then status='never been married ';
if in2 then status=type_change ;
if last.id;
keep id status;
run;
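The merge above assumes a data set id_list with one row per ID, sorted by id. If no such master list exists, a stand-in can be derived from a itself, much as the first answer does. This is a sketch only; it reads all of a once, which gives back part of the I/O saving, and it cannot include people who have no change record at all.

```sas
/* Hypothetical stand-in for a master ID list: one row per ID found in a */
proc sort data=a(keep=id) out=id_list nodupkey;
  by id;
run;
```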
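Answer 3 (score: 1)
A DOW loop lets you compute a result over a BY group. The implicit OUTPUT at the bottom of the step saves the result computed for that group. Because the result depends on the year you are interested in, you will also want to keep track of that year in every data set you create.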
%let YEAR_CUTOFF = 2015;
data want (keep=id status year_cutoff);
attrib
id length = 8
status length=$20 label="Status at year end &YEAR_CUTOFF"
year_cutoff length = 8
;
retain year_cutoff &YEAR_CUTOFF;
status = 'never been married';
do until (last.ID); /* The DOW loop */
set have (rename=status=status_of_interest);
by id;
if year <= &YEAR_CUTOFF then status = status_of_interest;
end;
/* No explicit OUTPUT in the step, so,
* an implicit OUTPUT occurs here at the bottom of the step
*/
run;
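The step above reads a data set named have with variables year and status, which do not exist in the question's sample. Adapted to the question's data set a (variables yr_change and type_change), the same DOW loop might look like the sketch below. It assumes a is sorted by id with each person's changes in chronological order.

```sas
%let YEAR_CUTOFF = 2015;

data want (keep=id status year_cutoff);
  length status $20;
  retain year_cutoff &YEAR_CUTOFF;
  status = 'never been married';   /* default when no change qualifies */
  do until (last.id);              /* one pass over the rows of a single ID */
    set a;
    by id;
    /* rows are chronological within ID, so the last qualifying change wins */
    if yr_change <= &YEAR_CUTOFF then status = type_change;
  end;
  /* implicit OUTPUT here: one row per ID */
run;
```

With the sample data and a cutoff of 2015, this should give divorce for IDs 1 and 10, never been married for ID 23 and widow for ID 35, matching the listing in the previous answer.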