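I want to replace all the numbers in a data frame with words/strings. Each number is always replaced by the same word. For example, every instance of the number 5 should be replaced with 'banana', every instance of 10 with 'kiwi', and so on.
Below is a sample data frame. The rownames and colnames are also numbers: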
# 1 2 3 4 5 6
#1 7 7 7 7 7 7
#2 5 5 5 5 5 5
#3 4 4 4 4 4 4
#4 8 8 8 8 8 8
#5 1 1 1 1 1 1
#6 2 2 2 2 2 2
#7 6 6 6 6 3 3
#8 3 3 3 3 6 6
#9 10 10 10 10 10 10
#10 11 11 11 11 11 11
#11 12 12 12 12 12 12
#12 9 9 9 9 9 9
Here is sample data to reproduce this (mydf):
mydf<-structure(c(7, 5, 4, 8, 1, 2, 6, 3, 10, 11, 12, 9, 7, 5, 4, 8,
1, 2, 6, 3, 10, 11, 12, 9, 7, 5, 4, 8, 1, 2, 6, 3, 10, 11, 12,
9, 7, 5, 4, 8, 1, 2, 6, 3, 10, 11, 12, 9, 7, 5, 4, 8, 1, 2, 3,
6, 10, 11, 12, 9, 7, 5, 4, 8, 1, 2, 3, 6, 10, 11, 12, 9), .Dim = c(12L,
6L), .Dimnames = list(c("1", "2", "3", "4", "5", "6", "7", "8",
"9", "10", "11", "12"), c("1", "2", "3", "4", "5", "6")))
This is the data frame I built (mydata) showing which number should be replaced by which word/fruit:
mydata <- data.frame(nums = c(1:12))
mydata$fruits<-c("apple", "pear", "orange", "melon", "banana", "grape", "pineapple", "mango", "lemon", "kiwi", "guava", "peach")
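I looked at similarly named threads, but they mostly discuss changing parts of a data frame (for example a specific variable or specific observations) rather than the contents of the whole data frame.
I tried using multiple gsub commands, but that did not work for several reasons. I think I need a function applied across all the variables in the df, but I am not sure which one.
The end result should look like this: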
1 2 3 4 5 6
1 "pineapple" "pineapple" "pineapple" "pineapple" "pineapple" "pineapple"
2 "banana" "banana" "banana" "banana" "banana" "banana"
3 "melon" "melon" "melon" "melon" "melon" "melon"
4 "mango" "mango" "mango" "mango" "mango" "mango"
5 "apple" "apple" "apple" "apple" "apple" "apple"
6 "pear" "pear" "pear" "pear" "pear" "pear"
7 "grape" "grape" "grape" "grape" "orange" "orange"
8 "orange" "orange" "orange" "orange" "grape" "grape"
9 "kiwi" "kiwi" "kiwi" "kiwi" "kiwi" "kiwi"
10 "guava" "guava" "guava" "guava" "guava" "guava"
11 "peach" "peach" "peach" "peach" "peach" "peach"
12 "lemon" "lemon" "lemon" "lemon" "lemon" "lemon"
Ideally the quotes would not be displayed (but I am not sure whether that is possible).
Answer 0 (score: 4)
You can do this with match, which references a lookup vector (your mydata) and returns, for each element of another vector, its position in that lookup vector:
mydf[] <- mydata$fruits[match(mydf, mydata$nums)]
If you coerce to data.frame, the quotes are not visible when the object is printed to the screen:
as.data.frame(mydf)
# 1 2 3 4 5 6
# 1 pineapple pineapple pineapple pineapple pineapple pineapple
# 2 banana banana banana banana banana banana
# 3 melon melon melon melon melon melon
# 4 mango mango mango mango mango mango
# 5 apple apple apple apple apple apple
# 6 pear pear pear pear pear pear
# 7 grape grape grape grape orange orange
# 8 orange orange orange orange grape grape
# 9 kiwi kiwi kiwi kiwi kiwi kiwi
# 10 guava guava guava guava guava guava
# 11 peach peach peach peach peach peach
# 12 lemon lemon lemon lemon lemon lemon
Whether or not you coerce to data.frame, you can supply quote=FALSE to write.table or write.csv to prevent quotes around strings in the exported file.
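To make the match step concrete, here is a minimal sketch on a small vector. The names keys, labels and values are made up for illustration and are not part of the question's data. It also shows that a value with no entry in the lookup comes back as NA instead of raising an error.

```r
# Hypothetical lookup: keys need not be sorted or contiguous
keys   <- c(10, 5, 7)
labels <- c("kiwi", "banana", "pineapple")

# Values to translate; 99 has no entry in the lookup
values <- c(5, 7, 10, 5, 99)

# match() returns, for each element of values, its position in keys
pos <- match(values, keys)
pos
# 2 3 1 2 NA

# Indexing labels by those positions performs the translation
translated <- labels[pos]
translated
# "banana" "pineapple" "kiwi" "banana" NA
```

Because the lookup goes through the key values and not through positions, the same pattern should keep working if the lookup table is reordered.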
Answer 1 (score: 0)
replace may work for you:
replace(mydf, seq_along(mydf), mydata[[2]][mydf])
# 1 2 3 4 5 6
# 1 "pineapple" "pineapple" "pineapple" "pineapple" "pineapple" "pineapple"
# 2 "banana" "banana" "banana" "banana" "banana" "banana"
# 3 "melon" "melon" "melon" "melon" "melon" "melon"
# 4 "mango" "mango" "mango" "mango" "mango" "mango"
# 5 "apple" "apple" "apple" "apple" "apple" "apple"
# 6 "pear" "pear" "pear" "pear" "pear" "pear"
# 7 "grape" "grape" "grape" "grape" "orange" "orange"
# 8 "orange" "orange" "orange" "orange" "grape" "grape"
# 9 "kiwi" "kiwi" "kiwi" "kiwi" "kiwi" "kiwi"
# 10 "guava" "guava" "guava" "guava" "guava" "guava"
# 11 "peach" "peach" "peach" "peach" "peach" "peach"
# 12 "lemon" "lemon" "lemon" "lemon" "lemon" "lemon"
If needed, it can be wrapped in as.data.frame to remove the quotes.
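If the goal is only to hide the quotes on screen, base R can also do that without converting to a data frame. A minimal sketch, using a small made-up character matrix m in place of the question's data:

```r
# Small character matrix standing in for the translated result
m <- matrix(c("apple", "pear", "kiwi", "guava"), nrow = 2,
            dimnames = list(c("1", "2"), c("1", "2")))

# print() on a character matrix accepts quote = FALSE
print(m, quote = FALSE)

# noquote() tags the object so that later prints omit the quotes
nq <- noquote(m)
nq

# The underlying data are unchanged: still a character matrix
is.character(unclass(nq))
```

Both options only affect how the object is displayed; the stored strings are the same either way.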
Answer 2 (score: 0)
Since the fruits are in the right order and indexed by 1:12, you can use the entries of mydf to index into mydata$fruits:
apply(mydf, 2, function(x) mydata$fruits[x])
If the values are not in the right order, or do not cover all possible values (there are "holes"), you can use a factor to do the translation:
apply(mydf, 2, function(x) factor(x, levels=mydata$nums, labels=mydata$fruits))
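The difference between the two approaches shows up as soon as the keys are not 1, 2, 3 and so on. Below is a small sketch with a hypothetical lookup table tab whose keys have gaps; tab, m, wrong and right are illustrative names, not taken from the question. The as.character() wrapper is an addition here so that each column is explicitly a plain character vector.

```r
# Hypothetical lookup with non-contiguous keys
tab <- data.frame(nums = c(2, 5, 9),
                  fruits = c("pear", "banana", "lemon"),
                  stringsAsFactors = FALSE)

m <- matrix(c(5, 9, 2, 2, 9, 5), nrow = 3)

# Direct indexing treats the values as positions, so it goes wrong here:
# position 2 is "banana", and positions 5 and 9 do not exist (NA)
wrong <- apply(m, 2, function(x) tab$fruits[x])

# Translating through a factor matches on the key values instead
right <- apply(m, 2, function(x)
  as.character(factor(x, levels = tab$nums, labels = tab$fruits)))
right
```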
Answer 3 (score: 0)
Another possible approach:
library(qdapTools)
as.data.frame(apply(mydf, 2, lookup, mydata))
## 1 2 3 4 5 6
## 1 pineapple pineapple pineapple pineapple pineapple pineapple
## 2 banana banana banana banana banana banana
## 3 melon melon melon melon melon melon
## 4 mango mango mango mango mango mango
## 5 apple apple apple apple apple apple
## 6 pear pear pear pear pear pear
## 7 grape grape grape grape orange orange
## 8 orange orange orange orange grape grape
## 9 kiwi kiwi kiwi kiwi kiwi kiwi
## 10 guava guava guava guava guava guava
## 11 peach peach peach peach peach peach
## 12 lemon lemon lemon lemon lemon lemon
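For comparison, a similar lookup can be done in base R with a named vector, which avoids the extra package. This is a sketch of an alternative technique, not a description of how the lookup function in qdapTools works internally. The names tab, lut, m and out are illustrative, and the small matrix stands in for mydf.

```r
# Lookup table in the same shape as mydata (shortened for illustration)
tab <- data.frame(nums = 1:4,
                  fruits = c("apple", "pear", "orange", "melon"),
                  stringsAsFactors = FALSE)

# Named vector: names are the keys (as strings), values are the fruits
lut <- setNames(tab$fruits, tab$nums)

m <- matrix(c(4, 1, 3, 2, 2, 4), nrow = 3,
            dimnames = list(c("1", "2", "3"), c("1", "2")))

# Index by name, so convert the numbers to character first;
# assigning into out[] keeps the dimensions and dimnames of m
out <- m
out[] <- lut[as.character(m)]
as.data.frame(out)
```

This assumes as.character() of each cell yields exactly the same string as the corresponding name in lut, which holds for small whole numbers like the ones in the question.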