Question

df <- data.frame(ID=c("1", "1", "2", "2", "3", "3", "4", "4", "4", "4"), Product=c("A", "B", "B", "A", "C", "C", "A", "B", "C", "C"))

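I would like to obtain a Set of the values in column "Product", grouped by the column "ID". The Set should be a comma-separated string. It should be a string value in which each possible combination of Product values occurs only once. The result should have one row per unique ID, with the combination of Product values for that ID and no duplicate values.

My approach gets me halfway: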

library(dplyr)
df2<-df %>% group_by(ID) %>% summarise(Set = toString(unique(Product)))

Output:

      ID     Set
  (fctr)   (chr)
1      1    A, B
2      2    B, A
3      3       C
4      4 A, B, C

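The output is a string variable Set that holds the combination of Product values, but the same combination can appear in different orders, i.e. A, B != B, A, which is undesirable. I would like a function that fits into my workflow and yields a Set variable in which A, B = B, A and so on, so that the frequencies of product combinations are counted consistently.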

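Expected output:

      ID     Set
  (fctr)   (chr)
1      1    A, B
2      2    A, B
3      3       C
4      4 A, B, C

That way, when I compute summary statistics, the value A, B will show up twice in the data set (as opposed to A, B once and B, A once).

Does anyone know how to do this?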

Answer 1

We can sort 'Product' within each 'ID' before collapsing it into a string:

df %>% group_by(ID) %>% summarise(Product = toString(unique(sort(Product))))
#      ID Product
#  (fctr)   (chr)
#1      1    A, B
#2      2    A, B
#3      3       C
#4      4 A, B, C

data.table

An alternative using data.table syntax is

library(data.table)
setDT(df)[, list(Product = toString(unique(sort(Product)))), by = ID]

base R

Or with aggregate:

aggregate(Product ~ ID, df, FUN = function(x) toString(unique(sort(x))))
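To check that sorting makes A, B and B, A collapse into the same value, the combinations can be tabulated afterwards. The following is a minimal base R sketch, assuming the sample data from the question; the names `sets` and `set_counts` are illustrative and do not come from the original post, and `stringsAsFactors = FALSE` is set here only so that Product is a character column on any R version.

```r
# Sample data from the question; stringsAsFactors = FALSE keeps Product as character
df <- data.frame(
  ID = c("1", "1", "2", "2", "3", "3", "4", "4", "4", "4"),
  Product = c("A", "B", "B", "A", "C", "C", "A", "B", "C", "C"),
  stringsAsFactors = FALSE
)

# One sorted, de-duplicated, comma-separated string per ID
sets <- aggregate(Product ~ ID, df, FUN = function(x) toString(sort(unique(x))))
names(sets)[2] <- "Set"  # hypothetical column name, matching the question's "Set"

# Count how often each combination occurs across IDs
set_counts <- table(sets$Set)
print(set_counts)
```

Because every Set string is built from sorted values, IDs 1 and 2 both yield "A, B", so that combination is counted twice rather than split into "A, B" and "B, A".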


Create a Set (String) variable of unique values occurring in a column grouped by an identification column

1 answer: