If one member matches (R)

Date: 2016-06-05 18:47:17

Tags: r

I have some census data on people in different households, like this (the real data set is obviously much larger and has many other variables):

df <- data.frame("HouseholdID" = c(1, 1, 1, 2, 2, 3, 3, 3), 
                 "Age" = c(45, 38, 6, 78, 64, 56, 58, 12))

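I'm interested in knowing, for each adult, whether they have a child under 18 in the household, so I figured the easiest approach might be to add a column to the data frame: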

df$kid_under_18 <- "No"

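and then change the value to "Yes" for the rows that meet my criterion. The trouble is that I'm having problems writing the R code for:

"For each HouseholdID, if there is any Age < 18" <- "Yes"

I think I should be able to use "by" (i.e. look across each HouseholdID) together with an "if any" statement, but I can't figure out how to change my "kid_under_18" column based on that. I think I'm close, but the syntax isn't there yet: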

by(df$Age, df$HouseholdID, function(x) if(any(x < 18)) {df$kid_under_18 <- "Yes"})

evaluates the statement but doesn't add anything to the data frame.

df$kid_under_18 <- by(df$Age, df$HouseholdID, function(x) if(any(x < 18)) print ("Yes"))

gives me an error:

$<-.data.frame(*tmp*, "kid_under_18", value = list(1 = "Yes", :   replacement has 3 rows, data has 8

4 answers:

Answer 0: (score: 1)

Using the dplyr library, you can do the following:

library(dplyr)
df %>% 
    group_by(HouseholdID) %>% 
    mutate(under_18 = any(Age < 18))

The output is:

Source: local data frame [8 x 3]
Groups: HouseholdID [3]

  HouseholdID   Age under_18
        <dbl> <dbl>    <lgl>
1           1    45     TRUE
2           1    38     TRUE
3           1     6     TRUE
4           2    78    FALSE
5           2    64    FALSE
6           3    56     TRUE
7           3    58     TRUE
8           3    12     TRUE

If you want one row per HouseholdID, you can use summarise instead of mutate. You can also use ifelse inside the mutate assignment to convert the logical value to something else.
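A minimal sketch of what the ifelse variant might look like, assuming the question's df and reusing its kid_under_18 column name. The "Yes"/"No" labels come from the question; the result variable and the trailing ungroup() are illustrative choices rather than part of the answer above:

```r
library(dplyr)

# Example data from the question
df <- data.frame(HouseholdID = c(1, 1, 1, 2, 2, 3, 3, 3),
                 Age = c(45, 38, 6, 78, 64, 56, 58, 12))

result <- df %>%
    group_by(HouseholdID) %>%
    # any() gives one logical per household; ifelse() maps it to a label,
    # and mutate() recycles that single label to every row of the group
    mutate(kid_under_18 = ifelse(any(Age < 18), "Yes", "No")) %>%
    ungroup()
```

Every row in a household then carries the same label, so adults can be filtered afterwards, for example with filter(Age >= 18).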

Answer 1: (score: 1)

Using data.table:

library(data.table)
setDT(df)[, kid_under_18 := any(Age < 18) , HouseholdID]

Or, if we need 'Yes' or 'No' labels:

setDT(df)[, kid_under_18 := c("No", "Yes")[any(Age < 18) + 1] , HouseholdID]
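The indexing trick works because TRUE and FALSE coerce to 1 and 0, so adding 1 yields position 2 for TRUE and position 1 for FALSE. The label for FALSE therefore has to come first in the vector. A small base R sketch of just that step (the variable names are illustrative, not from the answer):

```r
# The FALSE label sits at position 1, the TRUE label at position 2
labels <- c("No", "Yes")

# Ages taken from households 1 and 2 in the question
ages_with_child <- c(45, 38, 6)
ages_without_child <- c(78, 64)

# any() returns a single logical; + 1 turns it into an index of 1 or 2
flag_with <- labels[any(ages_with_child < 18) + 1]
flag_without <- labels[any(ages_without_child < 18) + 1]
```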

Answer 2: (score: 0)

Are you perhaps looking for a summary?

library(plyr);
ddply(df, "HouseholdID", summarise, hasChildUnder18 = any(Age < 18))

  HouseholdID hasChildUnder18
1           1            TRUE
2           2           FALSE
3           3            TRUE

We can recode TRUE/FALSE as yes/no:

library(plyr); library(car);
ddply(df, "HouseholdID", summarise, hasChildUnder18 = recode(any(Age < 18), 
                                                     "TRUE='yes'; FALSE='no'"))

  HouseholdID hasChildUnder18
1           1             yes
2           2              no
3           3             yes
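If the per-household summary needs to go back onto the individual rows, one possible sketch is to join it to the original data. This uses base R's aggregate() and merge() rather than plyr, and the names household_summary and merged are hypothetical:

```r
# Example data from the question
df <- data.frame(HouseholdID = c(1, 1, 1, 2, 2, 3, 3, 3),
                 Age = c(45, 38, 6, 78, 64, 56, 58, 12))

# One row per household: TRUE if any member is under 18
household_summary <- aggregate(Age ~ HouseholdID, data = df,
                               FUN = function(a) any(a < 18))
names(household_summary)[2] <- "hasChildUnder18"

# Attach the household-level flag to every person in that household
merged <- merge(df, household_summary, by = "HouseholdID")
```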

Answer 3: (score: 0)

You can simply use ifelse(), then view the frequencies with the cbind.data.frame() and table() functions:

  > df$kid_under_18 <- ifelse(df$Age < 18,"Yes","No")
  > df
  #      HouseholdID Age kid_under_18
  # 1              1  45           No
  # 2              1  38           No
  # 3              1   6          Yes
  # 4              2  78           No
  # 5              2  64           No
  # 6              3  56           No
  # 7              3  58           No
  # 8              3  12          Yes

> table(cbind.data.frame(df$HouseholdID,df$kid_under_18))
#                 df$kid_under_18
# df$HouseholdID      No Yes
#              1      2   1
#              2      2   0
#              3      2   1
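The flag above marks only the rows of the children themselves. To mark every member of a household that contains a child, one possible base R sketch uses ave() to apply any() within each HouseholdID. The has_kid name is illustrative, and the as.logical() wrapper is only a safeguard on the returned type:

```r
# Example data from the question
df <- data.frame(HouseholdID = c(1, 1, 1, 2, 2, 3, 3, 3),
                 Age = c(45, 38, 6, 78, 64, 56, 58, 12))

# ave() applies FUN within each group and returns a vector as long as the input,
# so each person gets the household-level result
has_kid <- as.logical(ave(df$Age < 18, df$HouseholdID, FUN = any))

df$kid_under_18 <- ifelse(has_kid, "Yes", "No")
```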