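How do I make some elements in a column equal to each other?

Time: 2019-07-03 15:57:07

Tags: r dataframe

My dataset has a column named "activity" that contains the following entries: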

 (04)WORKING AT HOME (for pay)
 (03)AT HOME ACTIVITIES
 (02)WORK
 (01)WORK RELATED
 (07) Pick-up or drop-off passenger (non-work/non-school)
 (05) Drop off/Pick-up someone at their work
 (08) Drop off/Pick-up someone at their school
 (09)CHANGE MODE OF TRAVEL
 (10)TRANSFER BETWEEN

I want to convert it to numeric data with the following code:

as.numeric(df$activity)

but I want some of these entries to get the same number.

How can I do that?

1 Answer:

Answer 0: (score: 0)

If you have only a few factors, you can do it with ifelse (or dplyr::if_else) or a similar construct. I have even used "lookup dictionaries" of sorts, such as

somedict <- c("some1"=1L, "any2"=4L, "all9"=4L)
somedict[ c("any2", "all9", "all9", "all9", "some1", "some1") ]
#  any2  all9  all9  all9 some1 some1 
#     4     4     4     4     1     1 
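For the ifelse route mentioned above, a minimal sketch might look like the following. The sample values and the specific codes are assumptions chosen for illustration; the real input would be the question's df$activity column.

```r
# Hypothetical sample values; the real column would be df$activity.
activity <- c("WORK",
              "DROP OFF/PICK-UP SOMEONE AT THEIR WORK",
              "DROP OFF/PICK-UP SOMEONE AT THEIR SCHOOL",
              "AT HOME ACTIVITIES")

# Nested ifelse: every entry containing "DROP" shares the code 5.
# The other codes are arbitrary choices for this illustration, and
# anything not listed falls through to NA.
code <- ifelse(grepl("DROP", activity), 5L,
        ifelse(activity == "WORK", 3L,
        ifelse(activity == "AT HOME ACTIVITIES", 2L, NA_integer_)))
code
# [1] 3 5 5 2
```

The nesting grows by one level per distinct value, which is why this only stays readable for a handful of categories.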

But since you have many, I think a better construct is a separate lookup frame that you merge (or dplyr::left_join) into your data.

possibles <- c(
 "WORKING AT HOME (for pay)",
 "AT HOME ACTIVITIES",
 "WORK",
 "WORK RELATED",
 "DROP OFF/PICK-UP SOMEONE AT THEIR WORK",
 "DROP OFF/PICK-UP SOMEONE AT THEIR SCHOOL",
 "PICK-UP OR DROP-OFF PASSENGER (NON-WORK/ NON-SCHOOL)",
 "CHANGE MODE OF TRAVEL",
 "TRANSFER BETWEE")

nums <- data.frame(topic = possibles, stringsAsFactors = FALSE)
nums$num <- seq_len(nrow(nums))
nums$num[grepl("DROP", nums$topic)] <- min(nums$num[ grepl("DROP", nums$topic) ])
nums
#                                                  topic num
# 1                            WORKING AT HOME (for pay)   1
# 2                                   AT HOME ACTIVITIES   2
# 3                                                 WORK   3
# 4                                         WORK RELATED   4
# 5               DROP OFF/PICK-UP SOMEONE AT THEIR WORK   5
# 6             DROP OFF/PICK-UP SOMEONE AT THEIR SCHOOL   5
# 7 PICK-UP OR DROP-OFF PASSENGER (NON-WORK/ NON-SCHOOL)   5
# 8                                CHANGE MODE OF TRAVEL   8
# 9                                      TRANSFER BETWEE   9


set.seed(2)
dat <- data.frame(topic = sample(possibles, size=1000, replace=TRUE),
                  id = 1:1000,
                  stringsAsFactors = FALSE)
head(dat)
#                                                  topic id
# 1                                   AT HOME ACTIVITIES  1
# 2 PICK-UP OR DROP-OFF PASSENGER (NON-WORK/ NON-SCHOOL)  2
# 3             DROP OFF/PICK-UP SOMEONE AT THEIR SCHOOL  3
# 4                                   AT HOME ACTIVITIES  4
# 5                                      TRANSFER BETWEE  5
# 6                                      TRANSFER BETWEE  6

newdat <- merge(dat, nums, by.x="topic", by.y="topic", all.x=TRUE, sort=FALSE)
newdat <- newdat[ order(newdat$id), ]
head(newdat)
#                                                    topic id num
# 1                                     AT HOME ACTIVITIES  1   2
# 175 PICK-UP OR DROP-OFF PASSENGER (NON-WORK/ NON-SCHOOL)  2   5
# 272             DROP OFF/PICK-UP SOMEONE AT THEIR SCHOOL  3   5
# 4                                     AT HOME ACTIVITIES  4   2
# 397                                      TRANSFER BETWEE  5   9
# 335                                      TRANSFER BETWEE  6   9
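As an alternative to merge followed by re-sorting, the same lookup can be sketched with base R's match(), which keeps the original row order. This is a swapped-in technique, not the method used above, and the small frames here are illustrative stand-ins for the nums and dat frames.

```r
# Small stand-ins for the lookup frame and the data (values are illustrative).
nums <- data.frame(topic = c("WORK", "WORK RELATED", "CHANGE MODE OF TRAVEL"),
                   num = c(3L, 4L, 8L),
                   stringsAsFactors = FALSE)
dat <- data.frame(topic = c("WORK RELATED", "WORK", "CHANGE MODE OF TRAVEL", "WORK"),
                  id = 1:4,
                  stringsAsFactors = FALSE)

# match() gives, for each dat$topic, the row of nums holding that topic
# (or NA when there is none), so no re-ordering by id is needed afterwards.
dat$num <- nums$num[match(dat$topic, nums$topic)]
dat$num
# [1] 4 3 8 3
```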

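This method relies heavily on knowing all of the factors in advance, which may be seen as a weakness. One of its advantages, though, is that you (should) immediately see a new (or misspelled) topic, because it will have an NA in the num column.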
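A minimal sketch of that check, assuming a deliberately misspelled topic ("WROK") that is absent from the lookup frame; the data here are made up for illustration.

```r
# Illustrative lookup frame and data; "WROK" is an intentional typo.
nums <- data.frame(topic = c("WORK", "WORK RELATED"),
                   num = 1:2,
                   stringsAsFactors = FALSE)
dat <- data.frame(topic = c("WORK", "WROK", "WORK RELATED"),
                  id = 1:3,
                  stringsAsFactors = FALSE)

# all.x = TRUE keeps rows of dat that have no partner in nums; their num is NA.
newdat <- merge(dat, nums, by = "topic", all.x = TRUE, sort = FALSE)

# List the topics that failed to match.
unmatched <- unique(newdat$topic[is.na(newdat$num)])
unmatched
# [1] "WROK"
```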