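I have two data frames in which each row holds the data for one individual. The rows of the first data frame (input for a specific geometric morphometrics analysis) correspond to the rows of the second data frame (other descriptors of the animal, such as sampling site or sex). I want to subset the first data frame based on a condition from the second one (for example, select all rows of the first data frame that belong to females, where the animal's sex is defined in the second data frame). This can be done by adding a new column to the first data frame, subsetting on that column, and then removing it. Is there a more elegant way?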
df1
[,1] [,2] [,3] [,4] [,5] [,6]
IMGP6995.JPG -0.07612235 0.08189661 0.020690012 0.07532420 0.05373111 0.07139840
IMGP6997.JPG -0.06759482 0.09449720 0.022907275 0.08807724 0.05953926 0.08256468
IMGP6998.JPG -0.06902234 0.08418980 0.013522385 0.08186618 0.05375763 0.07769076
IMGP6999.JPG -0.07201136 0.08475765 0.009462017 0.08080315 0.06148776 0.07059229
IMGP7001.JPG -0.08112908 0.08485488 0.037193459 0.07971364 0.05834018 0.07917079
IMGP7012.JPG -0.07059829 0.07905529 0.021803102 0.07480276 0.04849282 0.07270644
IMGP7013.JPG -0.07176010 0.08561111 0.009568661 0.08297752 0.06374573 0.08272648
IMGP7014.JPG -0.06751993 0.08895038 0.016800152 0.08799522 0.04776876 0.08100145
IMGP7015.JPG -0.07945826 0.07844136 0.008176800 0.07431915 0.06471417 0.07348312
IMGP7017.JPG -0.06587874 0.09280032 0.010204330 0.09085868 0.05290771 0.08739235
df2
number site m m..evis. m..gonads. sex SL TL AP RP
37 10 KB 1.263 1.003 0.136 F 39.38949 47.72564 NA NA
38 11 KB 4.215 3.510 0.093 F 53.48064 65.29663 NA NA
39 12 KB 3.508 2.997 0.079 F 51.59589 64.76600 NA NA
40 13 KB 3.250 2.752 0.085 F 49.55853 61.74319 NA NA
41 14 KB 3.596 3.149 0.101 F 51.42303 64.79511 NA NA
42 10 KKB 3.257 2.451 0.270 M 55.07909 67.52057 1468.017 598.9462
43 11 KKB 3.493 2.275 0.666 M 54.24882 65.61726 1722.414 757.1050
44 12 KKB 3.066 2.210 0.300 M 53.56323 64.09848 1410.891 638.4123
45 13 KKB 3.294 2.193 0.652 M 51.66717 63.49136 1428.063 651.1915
46 14 KKB 2.803 1.871 0.582 M 50.91185 60.90951 1236.438 660.8433
df1 after subsetting
[,1] [,2] [,3] [,4] [,5] [,6]
IMGP6995.JPG -0.07612235 0.08189661 0.020690012 0.07532420 0.05373111 0.07139840
IMGP6997.JPG -0.06759482 0.09449720 0.022907275 0.08807724 0.05953926 0.08256468
IMGP6998.JPG -0.06902234 0.08418980 0.013522385 0.08186618 0.05375763 0.07769076
IMGP6999.JPG -0.07201136 0.08475765 0.009462017 0.08080315 0.06148776 0.07059229
IMGP7001.JPG -0.08112908 0.08485488 0.037193459 0.07971364 0.05834018 0.07917079
Answer 0 (score: 2)
df1[df2$sex %in% "F", ]
# [,1] [,2] [,3] [,4] [,5] [,6]
# IMGP6995.JPG -0.07612235 0.08189661 0.020690012 0.07532420 0.05373111 0.07139840
# IMGP6997.JPG -0.06759482 0.09449720 0.022907275 0.08807724 0.05953926 0.08256468
# IMGP6998.JPG -0.06902234 0.08418980 0.013522385 0.08186618 0.05375763 0.07769076
# IMGP6999.JPG -0.07201136 0.08475765 0.009462017 0.08080315 0.06148776 0.07059229
# IMGP7001.JPG -0.08112908 0.08485488 0.037193459 0.07971364 0.05834018 0.07917079
Explanation
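Your df1 looks like a matrix, not a data.frame. However, the solution I provided also works if df1 is a data frame.
df2$sex %in% "F" tests whether each element of sex matches F and returns a logical vector of TRUE and FALSE values. You can then use that vector to subset the rows of df1. This relies on the rows of df1 and df2 being in the same order.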
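As a small sketch of what the logical vector looks like and how it indexes rows, consider the toy objects below. The names `m` and `info` are hypothetical stand-ins and are not the data from the question. The sketch also hints at one reason `%in%` can be preferable to `==` here: with `==`, an `NA` in the condition stays `NA`, which adds an all-`NA` row when it is used as a row index.

```r
# Toy stand-ins for df1 (a matrix) and df2 (a data frame); names are illustrative only
m <- matrix(1:8, ncol = 2,
            dimnames = list(c("a.jpg", "b.jpg", "c.jpg", "d.jpg"), NULL))
info <- data.frame(sex = c("F", "M", NA, "F"), stringsAsFactors = FALSE)

# Positional subsetting only makes sense if both objects have the same number of rows
stopifnot(nrow(m) == nrow(info))

keep <- info$sex %in% "F"   # logical vector; NA is treated as "no match"
keep
# [1]  TRUE FALSE FALSE  TRUE

females <- m[keep, , drop = FALSE]   # drop = FALSE keeps a matrix even for one row
rownames(females)
# [1] "a.jpg" "d.jpg"

# With == the NA propagates, and indexing with it produces an extra NA row
eq <- info$sex == "F"
nrow(m[eq, , drop = FALSE])
# [1] 3
```

If the two objects could ever be in a different order, positional indexing like this would silently pick the wrong rows, so a row-count check (and ideally a shared identifier) is worth having before subsetting.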
Data
df1 <- matrix(c(-0.07612235, 0.08189661, 0.020690012, 0.07532420, 0.05373111, 0.07139840,
-0.06759482, 0.09449720, 0.022907275, 0.08807724, 0.05953926, 0.08256468,
-0.06902234, 0.08418980, 0.013522385, 0.08186618, 0.05375763, 0.07769076,
-0.07201136, 0.08475765, 0.009462017, 0.08080315, 0.06148776, 0.07059229,
-0.08112908, 0.08485488, 0.037193459, 0.07971364, 0.05834018, 0.07917079,
-0.07059829, 0.07905529, 0.021803102, 0.07480276, 0.04849282, 0.07270644,
-0.07176010, 0.08561111, 0.009568661, 0.08297752, 0.06374573, 0.08272648,
-0.06751993, 0.08895038, 0.016800152, 0.08799522, 0.04776876, 0.08100145,
-0.07945826, 0.07844136, 0.008176800, 0.07431915, 0.06471417, 0.07348312,
-0.06587874, 0.09280032, 0.010204330, 0.09085868, 0.05290771, 0.08739235),
ncol = 6, byrow = TRUE)
rownames(df1) <- c("IMGP6995.JPG", "IMGP6997.JPG", "IMGP6998.JPG", "IMGP6999.JPG",
"IMGP7001.JPG", "IMGP7012.JPG", "IMGP7013.JPG", "IMGP7014.JPG",
"IMGP7015.JPG", "IMGP7017.JPG")
df2 <- read.table(text = " number site m m..evis. m..gonads. sex SL TL AP RP
37 10 KB 1.263 1.003 0.136 F 39.38949 47.72564 NA NA
38 11 KB 4.215 3.510 0.093 F 53.48064 65.29663 NA NA
39 12 KB 3.508 2.997 0.079 F 51.59589 64.76600 NA NA
40 13 KB 3.250 2.752 0.085 F 49.55853 61.74319 NA NA
41 14 KB 3.596 3.149 0.101 F 51.42303 64.79511 NA NA
42 10 KKB 3.257 2.451 0.270 M 55.07909 67.52057 1468.017 598.9462
43 11 KKB 3.493 2.275 0.666 M 54.24882 65.61726 1722.414 757.1050
44 12 KKB 3.066 2.210 0.300 M 53.56323 64.09848 1410.891 638.4123
45 13 KKB 3.294 2.193 0.652 M 51.66717 63.49136 1428.063 651.1915
46 14 KKB 2.803 1.871 0.582 M 50.91185 60.90951 1236.438 660.8433",
header = TRUE, stringsAsFactors = FALSE)