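Working with sets of duplicates and resetting groups

Time: 2017-06-26 09:15:37

Tags: duplicates stata

I have data of the following type (the other variables can be completely arbitrary):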

Name  Member  Other variables
 AAA    0
 AAA    0
 AAA    1
 BBB    0
 BBB    0
 CCC    1

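Note that the 1 can appear at any position within a set of duplicates, but each set contains at most one 1.

I want to handle the duplicates as follows:

  • If Member is 1, there is nothing to worry about: there is no duplicate problem (e.g. CCC).
  • If all the duplicates have Member equal to 0, that is fine too (e.g. BBB).
  • If one row in a set has Member equal to 1 and the rest have 0, then all other rows in that set of duplicates need to be set to 1.

I have tried duplicates and custom routines using _N, _n and so on, but without success, because I do not know how to loop over one set of duplicates at a time (I also looked at foreach etc.).

The end result should look like this: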

Name  Member  Other variables
 AAA    1
 AAA    1
 AAA    1
 BBB    0
 BBB    0
 CCC    1

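One thing that occurred to me: if I could somehow work with one group at a time, I could apply max() to the Member column within each block of duplicates, which would give me what I want. The problem is that I do not know how to work with one group at a time.

Bonus:

It would be a nice bonus if, after this change, I could also eliminate the duplicates and arrive at the layout below. But I think that once the step above is clear, I will know how to get there.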

Name  Member  Other variables
AAA    1
BBB    0
CCC    1

1 answer:

Answer 0: (score: 3)

The following works for me:

clear

input str3 Name  Member
AAA 0
AAA 0
AAA 1
BBB 0
BBB 0
CCC 1
end

bysort Name (Member) : egen Wanted1 = max(Member)
bysort Name (Member) : generate Wanted2 = Member[_N]

Both produce the desired output:

list, sepby(Name)

     +-----------------------------------+
     | Name   Member   Wanted1   Wanted2 |
     |-----------------------------------|
  1. |  AAA        0         1         1 |
  2. |  AAA        0         1         1 |
  3. |  AAA        1         1         1 |
     |-----------------------------------|
  4. |  BBB        0         0         0 |
  5. |  BBB        0         0         0 |
     |-----------------------------------|
  6. |  CCC        1         1         1 |
     +-----------------------------------+

Note that in any (sub)set of observations, _n equals 1 for the first observation. Accordingly, _N always indexes the last observation. Because the rows are sorted by Member within each Name, Member[_N] is the largest value of Member in that group.
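To make the role of _n and _N under a by-group more concrete, here is a small sketch. The variable names Position, GroupSize and IsLast are hypothetical and introduced only for illustration; the data are the same as in the example above.

```stata
clear

input str3 Name  Member
AAA 0
AAA 0
AAA 1
BBB 0
BBB 0
CCC 1
end

* Within each Name group (sorted by Member):
* _n is the running position, _N is the number of rows in the group
bysort Name (Member) : generate Position  = _n
bysort Name (Member) : generate GroupSize = _N

* Flag the last row of each group, which holds the largest Member
bysort Name (Member) : generate byte IsLast = (_n == _N)

list, sepby(Name)
```

Under by:, both _n and _N restart for every group, which is why no explicit loop over the sets of duplicates is needed.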

Bonus:

_N
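One possible way to reach the de-duplicated layout from the question is sketched here. This is an assumption about how the step could be done, not necessarily the command originally posted in this answer. It keeps only the last row of each Name after sorting on Member, so the retained row carries the group maximum. It also assumes Member has no missing values, since missing values sort last in Stata.

```stata
clear

input str3 Name  Member
AAA 0
AAA 0
AAA 1
BBB 0
BBB 0
CCC 1
end

* Keep one row per Name: the last one after sorting by Member,
* that is, the row with the largest Member in the group
bysort Name (Member) : keep if _n == _N

list, noobs
```

With the example data this leaves one row per Name (AAA 1, BBB 0, CCC 1). Any other variables keep the values from the retained row, so this only suits cases where it does not matter which duplicate's other values survive.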