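I've run into a problem: I need to merge two data frames, but the values in the common column differ in case (some are capitalized, some are lowercase).
Sample data: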
authors <- data.frame(
  surname = I(c("Tukey", "Venables", "Tierney", "Ripley", "McNeil")),
  nationality = c("US", "Australia", "US", "UK", "Australia"),
  deceased = c("yes", rep("no", 4)))
books <- data.frame(
  name = I(c("tukey", "venables", "tierney",
             "tipley", "ripley", "McNeil", "R Core")),
  title = c("Exploratory Data Analysis",
            "Modern Applied Statistics ...",
            "LISP-STAT",
            "Spatial Statistics", "Stochastic Simulation",
            "Interactive Data Analysis",
            "An Introduction to R"),
  other.author = c(NA, "Ripley", NA, NA, NA, NA,
                   "Venables & Smith"))
m1 <- merge(authors, books, by.x = "surname", by.y = "name")
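The data is taken from this question.
I need to produce the result without changing the data, i.e. I am not allowed to:
1) create new columns in the data frames, or
2) change the case of the values in the data frames, or
3) create a new data frame.
I know R is case-sensitive, but any help would be greatly appreciated. Thanks.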
Answer 0 (score: 0)
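Assuming you are allowed to create temporary data frames, I would do the following: copy both data frames, normalize the case of the key columns in the copies, merge on the normalized columns, and then discard the copies.
Translated into R, this gives: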
library(stringr)
books_temp <- books # store in temporary df
authors_temp <- authors
authors_temp$surname_temp <- str_to_title(authors_temp$surname) # transform columns
books_temp$name_temp <- str_to_title(books_temp$name)
m1 <- merge(authors_temp, books_temp, by.x = "surname_temp", by.y = "name_temp") # merge
m1$surname_temp <- NULL # discard unnecessary information
rm(authors_temp)
rm(books_temp)
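Note that merging two data frames whose key columns need to be processed first, without storing the intermediate transformation somewhere, would be very difficult.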
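As a possible variation, here is a minimal sketch that uses only base R and builds the normalized key inline with transform(), so no temporary data frame is ever assigned to a name. R still creates intermediate copies internally, so this only sidesteps the restriction in a narrow sense. The column name `key` is a made-up placeholder, and the data below is a cut-down copy of the question's sample.

```r
# Cut-down copy of the sample data from the question (assumption: only
# the key column and one extra column per data frame are needed here).
authors <- data.frame(
  surname = c("Tukey", "Venables", "Tierney", "Ripley", "McNeil"),
  nationality = c("US", "Australia", "US", "UK", "Australia"),
  stringsAsFactors = FALSE
)
books <- data.frame(
  name = c("tukey", "venables", "tierney",
           "tipley", "ripley", "McNeil", "R Core"),
  title = c("Exploratory Data Analysis",
            "Modern Applied Statistics ...",
            "LISP-STAT",
            "Spatial Statistics", "Stochastic Simulation",
            "Interactive Data Analysis",
            "An Introduction to R"),
  stringsAsFactors = FALSE
)

# transform() returns modified copies; authors and books stay untouched.
# `key` is a hypothetical helper column holding the lower-cased names.
m2 <- merge(
  transform(authors, key = tolower(surname)),
  transform(books, key = tolower(name)),
  by = "key"
)

# Drop the helper column so only the original columns remain.
m2$key <- NULL
```

Lower-casing with tolower() is used here instead of str_to_title() purely to avoid the stringr dependency; either works as long as both sides are normalized the same way. The original spellings survive in the surname and name columns of m2.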