Identify and remove groups of a given size from a data frame

Time: 2012-05-17 10:39:27

Tags: r

I wonder whether anyone can help. I have a simple data frame:

       spp rep ac.temp            sr ink.temp flux.type      flux             unit
1  Ecklonia   1      19 Ecklonia:1:19     10.1         R 0.1614302 mg O2 gDW-1 hr-1
2  Ecklonia   1      19 Ecklonia:1:19     19.0         R 0.6558070 mg O2 gDW-1 hr-1
3  Ecklonia   1      19 Ecklonia:1:19     24.7         R 0.8777117 mg O2 gDW-1 hr-1
4  Ecklonia   1      19 Ecklonia:1:19     28.9         R 2.3192236 mg O2 gDW-1 hr-1
5  Ecklonia   1      20 Ecklonia:1:20     10.3         R 0.5050336 mg O2 gDW-1 hr-1
6  Ecklonia   1      20 Ecklonia:1:20     20.8         R 1.2928442 mg O2 gDW-1 hr-1
7  Ecklonia   1      20 Ecklonia:1:20     24.8         R 1.8159838 mg O2 gDW-1 hr-1
8  Ecklonia   1      20 Ecklonia:1:20     29.8         R 2.8463946 mg O2 gDW-1 hr-1
9  Ecklonia   1      21 Ecklonia:1:21     10.3         R 0.5214549 mg O2 gDW-1 hr-1
10 Ecklonia   1      21 Ecklonia:1:21     19.5         R 0.9994689 mg O2 gDW-1 hr-1

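I would like to remove the rows for every unique value of the column data$sr that occurs fewer than 3 times (across the whole data frame, not just the rows shown here).

Does anyone know a way to do this automatically?

Thanks in advance.

2 answers:

Answer 0: (score: 0)

Does this help?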

a <- read.table(textConnection("
1  Ecklonia   1      19 Ecklonia:1:19     10.1         R 0.1614302 mg O2 gDW-1 hr-1
2  Ecklonia   1      19 Ecklonia:1:19     19.0         R 0.6558070 mg O2 gDW-1 hr-1
3  Ecklonia   1      19 Ecklonia:1:19     24.7         R 0.8777117 mg O2 gDW-1 hr-1
4  Ecklonia   1      19 Ecklonia:1:19     28.9         R 2.3192236 mg O2 gDW-1 hr-1
5  Ecklonia   1      20 Ecklonia:1:20     10.3         R 0.5050336 mg O2 gDW-1 hr-1
6  Ecklonia   1      20 Ecklonia:1:20     20.8         R 1.2928442 mg O2 gDW-1 hr-1
7  Ecklonia   1      20 Ecklonia:1:20     24.8         R 1.8159838 mg O2 gDW-1 hr-1
8  Ecklonia   1      20 Ecklonia:1:20     29.8         R 2.8463946 mg O2 gDW-1 hr-1
9  Ecklonia   1      21 Ecklonia:1:21     10.3         R 0.5214549 mg O2 gDW-1 hr-1
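10 Ecklonia   1      21 Ecklonia:1:21     19.5         R 0.9994689 mg O2 gDW-1 hr-1"), header=FALSE)

## V5 is the sr column (V1 holds the row numbers); keep rows whose sr value occurs at least 3 times
a[a$V5 %in% names(table(a$V5))[table(a$V5) >= 3],]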
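As an alternative sketch (not part of the answer above), base R's ave() can attach each row's group size directly, so no lookup by name is needed. The data frame d below is a hypothetical reduced copy of the question's data, holding only the sr and flux columns.

```r
# Hypothetical reduced copy of the question's data (sr and flux columns only)
d <- data.frame(
  sr   = c(rep("Ecklonia:1:19", 4),
           rep("Ecklonia:1:20", 4),
           rep("Ecklonia:1:21", 2)),
  flux = c(0.1614302, 0.6558070, 0.8777117, 2.3192236, 0.5050336,
           1.2928442, 1.8159838, 2.8463946, 0.5214549, 0.9994689),
  stringsAsFactors = FALSE
)

# ave() applies FUN within each group of d$sr and returns one value per row;
# with FUN = length that value is the number of rows in the row's group
group_size <- ave(seq_along(d$sr), d$sr, FUN = length)

# keep only rows whose group has at least 3 members
d_kept <- d[group_size >= 3, ]
d_kept
```

With this data the two Ecklonia:1:21 rows are dropped, because that group has only 2 rows.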

Answer 1: (score: 0)

You can use table (see ?table) to count how many times each element occurs.

## create example data frame
df <- data.frame(rep=rep(1, 16),
                sr=c(rep("Ecklonia:1:19", 4),
                    rep("Ecklonia:1:20", 3),
                    rep("Ecklonia:1:21", 2),
                    rep("Ecklonia:1:22", 4),
                    rep("Ecklonia:1:23", 2),
                    "Ecklonia:1:24"),
                stringsAsFactors=FALSE)

## count occurrences of each element in column "sr"
x <- table(df$sr)

## keep only elements which occur at least 3 times
keep <- df$sr %in% names(x)[x >= 3]

df[keep, ]
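If the same filter is needed for several data frames or thresholds, the steps above could be wrapped in a small function. This is only a sketch: the name drop_small_groups and its arguments are hypothetical, not from any package, and the example data repeats the sr column from this answer.

```r
# Hypothetical helper: drop rows whose value in column `col`
# occurs fewer than `min_n` times in `data`
drop_small_groups <- function(data, col, min_n = 3) {
  counts <- table(data[[col]])                       # rows per group
  keep <- data[[col]] %in% names(counts)[counts >= min_n]
  data[keep, , drop = FALSE]                         # stay a data frame
}

# Same sr values as in the example above
df <- data.frame(sr = c(rep("Ecklonia:1:19", 4),
                        rep("Ecklonia:1:20", 3),
                        rep("Ecklonia:1:21", 2),
                        rep("Ecklonia:1:22", 4),
                        rep("Ecklonia:1:23", 2),
                        "Ecklonia:1:24"),
                 stringsAsFactors = FALSE)

res <- drop_small_groups(df, "sr")
table(res$sr)
```

Note that table() ignores NA by default, so rows with a missing value in the grouping column would also be dropped by this sketch.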