I have a small problem with my df. I'll show an example first and then explain what I want to get.
My input df:
C1 C2 C3 C4 C5 C6 C7 C8
A I I D X I I I
A I I I X D I I
A I I I X I I I
A I D I X NC I I
B D D I X I I I
B D I NC X I I D
C NC I I X NC D I
C I I I X I I I
C I I I X I I D
D NC NC I X D D D
D I I I X D D I
D D D I X I I NC
D I I I X NC I I
E NC I I X I I D
E I I I X I D D
Desired result:
C1 C2 C3 C4 C5 C6 C7 C8
A I I D X I I I
A I I I X D I I
A I I I X I I I
A I D I X NC I I
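I want to keep only the groups (grouped by column 'C1'), with all of their rows, in which every row has at least 2 occurrences of 'I' in columns C2, C3, C4 and at least 2 in columns C6, C7, C8 (take group A as an example).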
I decided to use filter(), all() and rowSums():
df_filtered <- df %>%
  group_by(C1) %>%
  filter(all(rowSums(df[,2:4] == 'I' & df[,6:8] == 'I') >= 2))
What is going wrong? It returns 0 rows and I don't know why...
Answer 0: (score: 2)
You can use:
df %>%
  mutate(condition = rowSums(.[2:4] == 'I') >= 2 & rowSums(.[6:8] == 'I') >= 2) %>%
  group_by(C1) %>%
  filter(all(condition)) %>%
  select(-condition)
# A tibble: 4 x 8
# Groups: C1 [1]
C1 C2 C3 C4 C5 C6 C7 C8
<fctr> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr> <fctr>
1 A I I D X I I I
2 A I I I X D I I
3 A I I I X I I I
4 A I D I X NC I I
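The comparison in filter(all(rowSums(df[,2:4] == 'I' & df[,6:8] == 'I') >= 2)) is made over all rows of df, not just the rows of your group. The method above evaluates the condition for each row first, and then calls all() only within each group.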
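To see the difference concretely, here is a minimal base R sketch (no dplyr needed) that rebuilds the example data and evaluates both expressions. The names paired, condition and keep are illustrative only and do not come from the posts above. It also shows a second issue with the original expression: `&` combines the two logical matrices element by element, so it counts positions where C2 and C6, C3 and C7, or C4 and C8 are both 'I', which is not the same as requiring at least 2 'I' in each block.

```r
# Rebuild the example data from the question
df <- read.table(text = "C1 C2 C3 C4 C5 C6 C7 C8
A I I D X I I I
A I I I X D I I
A I I I X I I I
A I D I X NC I I
B D D I X I I I
B D I NC X I I D
C NC I I X NC D I
C I I I X I I I
C I I I X I I D
D NC NC I X D D D
D I I I X D D I
D D D I X I I NC
D I I I X NC I I
E NC I I X I I D
E I I I X I D D", header = TRUE, stringsAsFactors = FALSE)

# The expression from the question, evaluated on the whole data frame.
# `&` works element by element on the two 15 x 3 logical matrices.
paired <- rowSums(df[, 2:4] == "I" & df[, 6:8] == "I")
length(paired)     # one value per row of the full df, not per group
all(paired >= 2)   # a single value, identical for every group

# Row-wise condition computed once, then all() applied per group
condition <- rowSums(df[, 2:4] == "I") >= 2 & rowSums(df[, 6:8] == "I") >= 2
keep <- tapply(condition, df$C1, all)

# Keep every row of the groups that pass
df[df$C1 %in% names(keep)[keep], ]
```

Because df inside filter() refers to the full data frame rather than the current group, all() returns the same single value for every group, which is why either all groups or none are kept.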
Answer 1: (score: 0)
You can try unite() and then filter by regular expression. Here it is for your example:
library(tidyverse)
# First loading your data
data <- read.table(text = "C1 C2 C3 C4 C5 C6 C7 C8
A I I D X I I I
A I I I X D I I
A I I I X I I I
A I D I X NC I I
B D D I X I I I
B D I NC X I I D
C NC I I X NC D I
C I I I X I I I
C I I I X I I D
D NC NC I X D D D
D I I I X D D I
D D D I X I I NC
D I I I X NC I I
E NC I I X I I D
E I I I X I D D", header = TRUE)
# Then filtering rows
data %>%
  # Creating a helper column
  unite(merged, C1:C8, sep = "", remove = FALSE) %>%
  # Filtering by regexp
  filter(grepl("^A", merged), grepl("II", merged)) %>%
  # Deleting helper column
  select(-merged)
C1 C2 C3 C4 C5 C6 C7 C8
1 A I I D X I I I
2 A I I I X D I I
3 A I I I X I I I
4 A I D I X NC I I
Have fun ;)