Predominant calculation for Character fields

Time: 2016-04-21 22:15:54

Tags: r dplyr

I'm trying to loop over the columns of my data frame whose type is character and return a single data frame containing the predominant (most frequent) value of each character column, grouped by an ID field.

Is there a way to replicate the following code in some kind of loop?

      library(dplyr)
      library(data.table)

      # Keep only the character columns of dfr
      DF_Characters <- dfr[, sapply(dfr, is.character)]

##Predominance Column1##
      Predom <- select(DF_Characters, Group_ID, Column_1)
      Predom <- group_by(Predom,Group_ID, Column_1)
      Predom <- summarise(Predom,
                             CountPredom = n()
                             )
      Predom <- arrange(Predom,Group_ID, desc(CountPredom) )
      Predom <- data.table(Predom, key="Group_ID")
      Predominant_Column_1 <- Predom[,head(.SD,1),by=Group_ID]


##Predominant Column_2##
      Predom <- select(DF_Characters, Group_ID, Column_2)
      Predom <- group_by(Predom,Group_ID, Column_2)
      Predom <- summarise(Predom,
                             CountPredom = n()
                             )
      Predom <- arrange(Predom,Group_ID, desc(CountPredom) )
      Predom <- data.table(Predom, key="Group_ID")
      Predominant_Column_2 <- Predom[,head(.SD,1),by=Group_ID]

##Merge final table##
      Merged <- merge(Predominant_Column_1 ,Predominant_Column_2 ,by="Group_ID")

Also to clarify my question I added a dummy table: DF_Character_table

The result should look like this: Result Table

So for Group 1, Petre was the predominant name in Column 1 and Car was the predominant mode of travel in Column 2. The predominant value should be calculated separately for Column 1 and Column 2.

Thank you

1 answer:

Answer 0 (score: 0)

This may not be the best solution, but it works.

 ########## Predominant calculations
  library(dplyr)

  # Names of the character fields, excluding the group-by ID
  # (Groupby_ID here corresponds to Group_ID in the question)
  CharactersToMerge <- setdiff(names(dfr)[sapply(dfr, is.character)], "Groupby_ID")

  # Character fields plus the group-by ID
  DF_Characters <- as.data.frame(dfr)[, c("Groupby_ID", CharactersToMerge), drop = FALSE]

  # Predominant table: one row per group with its observation count
  fin_table <- as.data.frame(count(DF_Characters, Groupby_ID, sort = TRUE))

  # Loop over the character columns and merge each result into the predominant table
  for (i in CharactersToMerge) {

    temp_table <- DF_Characters %>%
      group_by(across(all_of(c("Groupby_ID", i)))) %>%
      tally(name = "Count") %>%                        # count each value per group
      slice_max(Count, n = 1, with_ties = FALSE) %>%   # keep one top value per group (drops ties)
      ungroup() %>%
      select(-Count)                                   # remove counts

    fin_table <- merge(fin_table, temp_table, by = "Groupby_ID")
  }
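
For comparison, the loop can probably be avoided altogether. The following is only a minimal sketch, assuming dplyr 1.0 or later (for `across()`). The `dfr` data and the `predominant()` helper are hypothetical and made up for illustration; they are not the dummy table from the question.

```r
library(dplyr)

# Hypothetical dummy data (not the original DF_Character_table)
dfr <- data.frame(
  Groupby_ID = c(1, 1, 1, 2, 2, 2),
  Column_1   = c("Petre", "Petre", "Anna", "Anna", "Anna", "Petre"),
  Column_2   = c("Car", "Car", "Bus", "Bus", "Train", "Bus"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: most frequent non-NA value of a vector.
# On a tie, the value that sorts first wins, because table() sorts its names.
predominant <- function(x) {
  counts <- table(x)
  if (length(counts) == 0) return(NA_character_)
  names(counts)[which.max(counts)]
}

# One row per group, one predominant value per character column
result <- dfr %>%
  group_by(Groupby_ID) %>%
  summarise(across(where(is.character), predominant), .groups = "drop")

print(result)
```

With this made-up data, group 1 should come out as Petre / Car and group 2 as Anna / Bus. Unlike the loop above, this sketch does not add a per-group observation count column; one could be added with `n()` inside the same `summarise()` call if needed.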