This is the fips dataset:
      State Fips State.Abbreviation ANSI.Code        GU.Name
1         1   67                 AL   2403054      Abbeville
2         1   73                 AL   2403063     Adamsville
3         1  117                 AL   2403069      Alabaster
4         1   95                 AL   2403074    Albertville
5         1  123                 AL   2403077 Alexander City
6         1  107                 AL   2403080     Aliceville
7         1   39                 AL   2403097      Andalusia
8         1   15                 AL   2403101       Anniston
:
:
:
41774    51  720                 VA   1498434         Norton
41775    51  730                 VA   1498435     Petersburg
41776    51  735                 VA   1498436       Poquoson
41777    51  740                 VA   1498556     Portsmouth
41778    51  750                 VA   1498438        Radford
41779    51  760                 VA   1789073       Richmond
41780    51  770                 VA   1498439        Roanoke
41781    51  775                 VA   1789074          Salem
41782    51  790                 VA   1789075       Staunton
41783    51  800                 VA   1498560        Suffolk
41784    51  810                 VA   1498559 Virginia Beach
41785    51  820                 VA   1498443     Waynesboro
41786    51  830                 VA   1789076   Williamsburg
41787    51  840                 VA   1789077     Winchester
dim(fips)
[1] 2937 5
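One check that may be worth running at this point is whether State, Fips and State.Abbreviation together identify a single row of fips, or whether several rows share the same combination. The sketch below uses a small made-up data frame, `lookup`, as a hypothetical stand-in for fips; the values are illustrative only and are not taken from the real data.

```r
# Hypothetical stand-in for fips (made-up values, not the real dataset).
lookup <- data.frame(
  State = c(1, 1, 1, 6),
  Fips = c(67, 67, 73, 5),
  State.Abbreviation = c("AL", "AL", "AL", "CA"),
  GU.Name = c("Abbeville", "Headland", "Adamsville", "Jackson"),
  stringsAsFactors = FALSE
)
keys <- c("State", "Fips", "State.Abbreviation")

# Number of rows whose key combination already appeared on an earlier row.
n_dup <- sum(duplicated(lookup[, keys]))
n_dup

# Rows per key combination; any count above 1 means the key is not unique.
key_counts <- aggregate(GU.Name ~ State + Fips + State.Abbreviation,
                        data = lookup, FUN = length)
repeated_keys <- key_counts[key_counts$GU.Name > 1, ]
repeated_keys
```

If the same check on the real fips reports repeated keys, a merge on those columns cannot be one-to-one.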
This is the head cancer dataset:
   PUBCSNUM  REG MAR_STAT RACE1V NHIADE SEX FIPS Fips State State.Abbreviation
1  93261752 1544        2     15      0   1    3    3    34                 NY
2  93264865 1544        2      1      0   1   15   15    34                 NY
3  93268186 1544        2      1      0   1    5    5    34                 NY
4  93272027 1544        2      1      0   2   17   17    34                 NY
5  93274555 1544        1      1      0   1   13   13    34                 NY
6  93275343 1544        5      1      0   2   25   25    34                 NY
7  93279759 1544        5      1      0   2    9    9    34                 NY
8  93280754 1544        2      1      0   2   35   35    34                 NY
9  93281166 1544        2      1      0   2   31   31    34                 NY
10 93282602 1544        5      1      0   1   33   33    34                 NY
11 93287646 1544        1      1      0   1   11   11    34                 NY
12 93288255 1544        4      1      4   1   39   39    34                 NY
13 93290660 1544        9      1      0   2   25   25    34                 NY
14 93291461 1544        1      1      6   1   39   39    34                 NY
15 93291778 1544        2      1      0   1    3    3    34                 NY
dim(headcancer)
[1] 75313 10
When I merge the two, I expect the result to have the same number of rows as head.cancer (75313), but I get 951423 rows instead.
Here is my code and its output:
n <- merge(head.cancer, fips, by = c("State", "Fips", "State.Abbreviation"), all.x = TRUE)
   State Fips State.Abbreviation PUBCSNUM  REG MAR_STAT RACE1V NHIADE SEX FIPS ANSI.Code            GU.Name
1      6    5                 CA 70128269 1541        4      1      0   2    5   2409693        Amador City
2      6    5                 CA 70128269 1541        4      1      0   2    5   2411446           Plymouth
3      6    5                 CA 70128269 1541        4      1      0   2    5    226085            Jackson
4      6    5                 CA 70128269 1541        4      1      0   2    5   1675841             Amador
5      6    5                 CA 70128269 1541        4      1      0   2    5   2418631 Ione Band of Miwok
6      6    5                 CA 70128269 1541        4      1      0   2    5   2412019       Sutter Creek
7      6    5                 CA 70128269 1541        4      1      0   2    5   2410110               Ione
8      6    5                 CA 70128269 1541        4      1      0   2    5   2410128            Jackson
9      6    5                 CA 67476209 1541        2      1      1   2    5   2409693        Amador City
10     6    5                 CA 67476209 1541        2      1      1   2    5   2411446           Plymouth
11     6    5                 CA 67476209 1541        2      1      1   2    5    226085            Jackson
12     6    5                 CA 67476209 1541        2      1      1   2    5   1675841             Amador
13     6    5                 CA 67476209 1541        2      1      1   2    5   2418631 Ione Band of Miwok
14     6    5                 CA 67476209 1541        2      1      1   2    5   2412019       Sutter Creek
15     6    5                 CA 67476209 1541        2      1      1   2    5   2410110               Ione
16     6    5                 CA 67476209 1541        2      1      1   2    5   2410128            Jackson
17     6    5                 CA 56544761 1541        4      1      0   2    5   2409693        Amador City
18     6    5                 CA 56544761 1541        4      1      0   2    5   2411446           Plymouth
19     6    5                 CA 56544761 1541        4      1      0   2    5    226085            Jackson
20     6    5                 CA 56544761 1541        4      1      0   2    5   1675841             Amador
dim(n)
[1] 951423 12
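In rows 1 through 8, the same PUBCSNUM is repeated 8 times. PUBCSNUM is an ID, so it should be unique, and each record should have only one ANSI.Code value, but after the merge each one has many. I don't understand why the rows are being duplicated like this.
Please help. I have been stuck on this for hours and cannot figure it out. Thanks.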
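The same pattern can be reproduced on a tiny made-up example, which suggests (I have not confirmed this on the real data) that the extra rows appear because fips holds several rows for each State/Fips/State.Abbreviation combination. In the sketch below, `patients` and `places` are hypothetical stand-ins for head.cancer and fips, with invented values.

```r
# Hypothetical stand-ins for head.cancer and fips (values are made up).
patients <- data.frame(
  PUBCSNUM = c(101, 102),
  State = c(6, 6),
  Fips = c(5, 5),
  State.Abbreviation = c("CA", "CA"),
  stringsAsFactors = FALSE
)
places <- data.frame(
  State = c(6, 6, 6),
  Fips = c(5, 5, 5),
  State.Abbreviation = c("CA", "CA", "CA"),
  GU.Name = c("Amador City", "Plymouth", "Jackson"),
  stringsAsFactors = FALSE
)
keys <- c("State", "Fips", "State.Abbreviation")

# merge() returns one row per matching pair, so each patient is repeated
# once for every row of places that shares its key.
merged <- merge(patients, places, by = keys, all.x = TRUE)
nrow(merged)            # 2 patients x 3 matching places = 6 rows
table(merged$PUBCSNUM)  # each ID appears 3 times
```

With all.x = TRUE every row of the left table is kept, but that does not limit how many times a row is repeated when the right table has more than one match for it.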