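Answer 0 (score: 1)
If I understood your question correctly, you can drop the for loop, because R is vectorized: a check against your list of instruments runs on a whole column at once. Using the tidyverse, your code could look like this: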
# load tidyverse
library(tidyverse)
# set vector of instruments
instru = c("Accordian", "Clarinet", "Trumpet", "DoubleBass", "Oboe", "Piano", "Saxophone", "Violin", "Cello", "Tuba", "Viola", "Bassoon", "EnglishHorn", "French horn", "Flute", "Piccolo", "SynthBass", "Trombone")
# create dummy train data.frame (more exactly a "tibble")
train <- tibble(mix1_instrument = c("a", "b", "Clarinet"),
                mix2_instrument = c("a", "Clarinet", "c"),
                xxx = c("Clarinet", "b", "c"))
#> train
## A tibble: 3 x 3
#mix1_instrument mix2_instrument xxx
#<chr> <chr> <chr>
#1 a a Clarinet
#2 b Clarinet b
#3 Clarinet c c
# add column "instruments" to train
train <- train %>%
  mutate(instruments = case_when(
    mix1_instrument %in% instru ~ "1",
    mix2_instrument %in% instru ~ "1",
    TRUE ~ "0"
  ))
#> train
## A tibble: 3 x 4
# mix1_instrument mix2_instrument xxx instruments
# <chr> <chr> <chr> <chr>
#1 a a Clarinet 0
#2 b Clarinet b 1
#3 Clarinet c c 1
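For comparison, the same vectorized check can be sketched in base R without any packages. This is only an illustration: the shortened `instru` vector and the integer 0/1 result (instead of the character "1"/"0" above) are assumptions made here, not part of the answer.

```r
# Shortened instrument vector, for illustration only
instru <- c("Clarinet", "Trumpet", "Oboe")

# Same dummy data as above, as a plain data.frame
train <- data.frame(
  mix1_instrument = c("a", "b", "Clarinet"),
  mix2_instrument = c("a", "Clarinet", "c"),
  xxx             = c("Clarinet", "b", "c"),
  stringsAsFactors = FALSE
)

# %in% is vectorized: each call returns one logical per row, so no loop is needed
hit <- train$mix1_instrument %in% instru | train$mix2_instrument %in% instru

# Store the flag as integer 0/1
train$instruments <- as.integer(hit)
train$instruments
```

As in the answer, only the two `mix*` columns are checked, so the "Clarinet" in `xxx` does not set the flag for the first row.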
Answer 1 (score: 0)
If you are familiar with dplyr, you can do this with mutate.
library(dplyr)
instru = c("Accordian", "Clarinet", "Trumpet", "DoubleBass", "Oboe", "Piano", "Saxophone", "Violin", "Cello", "Tuba", "Viola",
           "Bassoon", "EnglishHorn", "French horn", "Flute", "Piccolo", "SynthBass", "Trombone")
mix1_instruments = c("Accordion", "Trumpet", "Violin", "Cello", "Triangle")
mix2_instruments = c("Bassoon", "Saxophone", "Flute", "French horn", "Washboard")
train = data.frame(mix1_instruments, mix2_instruments)
train <- train %>%
mutate(instruments = (mix1_instruments %in% instru) | (mix2_instruments %in% instru))
The output is TRUE or FALSE, but these can also be converted to 1 or 0.
train$instruments <- as.numeric(train$instruments)
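Edit: I just saw that I was scooped while writing my response (by a much better answer!), but there is still the issue of scalability.
The following adds new columns named <old_column_name>_instruments, each holding a logical for whether the entry in that column is in instru, and then unites them into a single column with a logical for whether any column contains an entry in instru: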
library(magrittr)  # %<>%
library(dplyr)     # mutate, mutate_all, ends_with
library(tidyr)     # unite
instru = c("Accordian", "Clarinet", "Trumpet", "DoubleBass", "Oboe", "Piano", "Saxophone", "Violin", "Cello", "Tuba", "Viola",
           "Bassoon", "EnglishHorn", "French horn", "Flute", "Piccolo", "SynthBass", "Trombone")
mix1_instruments = c("Clarinet", "Flute", "Clarinet", "English Horn", "Washboard", "Saxophone", "Washboard")
mix2_instruments = c("French Horn", "French Horn", "French Horn", "Flute", "Flute", "Triangle", "Triangle")
train = data.frame(mix1_instruments, mix2_instruments)
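train %<>%
  # funs() is deprecated in recent dplyr; a named list of formulas gives the same "<column>_instruments" names
  mutate_all(list(instruments = ~ . %in% instru)) %>%
  unite(col = instruments,
        ends_with('_instruments_instruments'), # optional: restricts unite to the columns added by mutate_all in this dataset
        remove = TRUE) %>%
  # the united column holds strings such as "TRUE_FALSE"; flag rows containing at least one TRUE
  mutate(instruments = as.numeric(grepl('TRUE', instruments)))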
Output:
train
# mix1_instruments mix2_instruments instruments
#1 Clarinet French Horn 1
#2 Flute French Horn 1
#3 Clarinet French Horn 1
#4 English Horn Flute 1
#5 Washboard Flute 1
#6 Saxophone Triangle 1
#7 Washboard Triangle 0
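As a rough alternative for the scalability concern, base R can combine the per-column checks with `Reduce`, which works for any number of columns. This is a sketch, not the method used above: the helper name `any_in` is hypothetical, and the shortened vectors are illustration data.

```r
# Shortened instrument vector, for illustration only
instru <- c("Clarinet", "Flute", "Saxophone")

train <- data.frame(
  mix1_instruments = c("Clarinet", "Washboard", "Washboard"),
  mix2_instruments = c("French Horn", "Flute", "Triangle"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each row where any column holds a value in `values`
any_in <- function(df, values) {
  per_column <- lapply(df, function(col) col %in% values)  # one logical vector per column
  Reduce(`|`, per_column)                                  # element-wise OR across columns
}

# The right-hand side is evaluated before the new column is added,
# so only the original columns are checked
train$instruments <- as.integer(any_in(train, instru))
train$instruments
```

Because the helper loops over whatever columns the data frame has, adding a `mix3_instruments` column would need no change to the code.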
Note: %<>% is from magrittr and simply replaces the x <- x %>% ... syntax.
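To make the equivalence concrete, here is a small sketch with throwaway vectors `x` and `y` (hypothetical names, not from the question) showing that both forms leave the variable with the same value:

```r
library(magrittr)

x <- c(4, 9, 16)
y <- x

x <- x %>% sqrt()   # explicit reassignment
y %<>% sqrt()       # compound assignment pipe: pipes y into sqrt() and assigns the result back to y

identical(x, y)
```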
You can output a dataframe as a CSV with the write.x functions:
write.csv(train, "/path/to/dir/filename.csv", row.names = FALSE)
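A quick way to check what ends up in the file is to write to a temporary location and read it back. In this sketch, `tempfile()` stands in for a real path, and the two-row data frame is made-up illustration data.

```r
train <- data.frame(
  mix1_instruments = c("Clarinet", "Washboard"),
  instruments      = c(1L, 0L),
  stringsAsFactors = FALSE
)

# tempfile() is a placeholder for a real output path
out_file <- tempfile(fileext = ".csv")

# row.names = FALSE avoids writing the 1, 2, ... row labels as an extra column
write.csv(train, out_file, row.names = FALSE)

# Read the file back to inspect what was written
check <- read.csv(out_file, stringsAsFactors = FALSE)
check
```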