I want to add a variable to ALL dataframes in my global environment and make the value of the newly added column equal to the dataframe name.
Product=c("A","A","A","A","A","A","A","A","A","A","A","A","B","B","B","C","C","C")
Day=c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Monday","Tuesday","Wednesday","Saturday","Sunday" ,"Monday")
data1=data.frame(Product, Day)
Product2=c("Z","Z","Z","Z","Z","Z","Z","Z","Z","Z","Z","Z","Y","Y","Y","X","X","X")
Day2=c("Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday","Monday","Tuesday","Wednesday","Saturday","Sunday" ,"Monday")
data2=data.frame(Product2, Day2)
I want to add a column to both data frames whose value is the data frame's name, i.e. newvar="data1" for data1 and newvar="data2" for data2. My actual list of data frames is much longer than this.
Any help is greatly appreciated.
Thanks!
Answer 0 (score: 2)
If the 'data.frame' object names are 'data' followed by a number, we can use paste0 to build the object names as strings (if we already know the object names):
nm1 <- paste0('data', 1:2)
Another option is to use ls with the pattern argument, which helps when there are hundreds of objects in the global environment and we don't know how many there are:
nm1 <- ls(pattern='^data\\d+')
Get the values in a list using mget, and create a new column ('newvar') by cbind-ing with Map. Map pairs each dataset in the list with its own object name, so each one gets a new column holding that name.
lst <- Map(cbind, mget(nm1), newvar= nm1)
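As a minimal, self-contained sketch of the step above: the toy data frames and the private environment env are assumptions made here so that the example does not touch the global environment; the answer itself works on .GlobalEnv.

```r
# Toy stand-ins for data1 and data2, stored in a private environment (assumption)
env <- new.env()
assign("data1", data.frame(Product = c("A", "B"), Day = c("Monday", "Tuesday")), envir = env)
assign("data2", data.frame(Product2 = c("Z", "Y"), Day2 = c("Monday", "Sunday")), envir = env)

# Object names matching 'data' followed by digits
nm1 <- ls(envir = env, pattern = "^data\\d+$")

# mget() returns a named list; Map() pairs each data frame with its own name
lst <- Map(cbind, mget(nm1, envir = env), newvar = nm1)

names(lst)
str(lst$data1)
```

The list keeps the object names, so lst$data1 and lst$data2 each carry a newvar column filled with their own name.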
It is better to keep the data frames in a list, since all further operations can be done within it. But if the original objects need to be updated in the global environment, list2env is an option (not recommended, though):
list2env(lst, envir=.GlobalEnv)
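To see what list2env does without overwriting anything in the global environment, here is a small sketch that writes into a scratch environment instead. The environment name scratch and the toy list are assumptions for illustration.

```r
# A named list of data frames, as produced by the Map() step (toy data, assumption)
lst <- list(
  data1 = data.frame(Product = c("A", "B"), newvar = "data1"),
  data2 = data.frame(Product2 = c("Z", "Y"), newvar = "data2")
)

# Each list element becomes an object named after its list name
scratch <- new.env()
list2env(lst, envir = scratch)

ls(scratch)
scratch$data1
```

With envir = .GlobalEnv, any existing objects with the same names are replaced in the same way, which is why keeping the list is usually the safer choice.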
It may also be useful to read all the files (.csv/.txt) into a list directly rather than creating individual objects. For example, we can read all the files in the working directory with:
files <- list.files()
lst <- lapply(files, read.csv, stringsAsFactors=FALSE)
The arguments may need adjusting for the delimiter (for example, sep = "\t" for tab-separated files).
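Combining this with the newvar idea, one possible sketch names the list after the files and then adds the column. The temporary directory, the file names, and the use of tools::file_path_sans_ext are assumptions for illustration and are not part of the original answer.

```r
# Write two small CSV files into a temporary folder (hypothetical data)
dir <- file.path(tempdir(), "csv_demo")
dir.create(dir, showWarnings = FALSE)
write.csv(data.frame(Product = c("A", "B")), file.path(dir, "data1.csv"), row.names = FALSE)
write.csv(data.frame(Product = c("Z", "Y")), file.path(dir, "data2.csv"), row.names = FALSE)

# Read every .csv file in that folder into a list
files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
lst <- lapply(files, read.csv, stringsAsFactors = FALSE)

# Name the list after the files (without extension), then add the column
nm <- tools::file_path_sans_ext(basename(files))
names(lst) <- nm
lst <- Map(cbind, lst, newvar = nm)

str(lst)
```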
Answer 1 (score: 2)
Here's a function to which you can pass an arbitrary number of named data.frames; it returns a named list of data.frames with the requested column added. Using the list2env function (as in @akrun's answer), you can then put these in whatever environment you want. (You could also modify the function to produce that side effect automatically.)
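f <- function(...) {
  # Capture the names of the objects passed in, e.g. "data1", "data2"
  objnames <- as.character(substitute(c(...)))[-1]
  obj <- list(...)
  # Add to each data.frame a column named after the object, filled with that name
  out <- mapply(function(x, col) {
    x[, col] <- col
    x
  }, obj, objnames, SIMPLIFY = FALSE)
  setNames(out, objnames)
}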
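The name-capturing line is the least obvious part of f, so here is that mechanism in isolation. The helper arg_names and the variables a and b are hypothetical names used only for this sketch.

```r
# substitute(c(...)) yields the unevaluated call c(a, b);
# as.character() turns it into c("c", "a", "b"), and [-1] drops the "c"
arg_names <- function(...) as.character(substitute(c(...)))[-1]

a <- 1
b <- 2
arg_names(a, b)
# [1] "a" "b"
```

This only gives clean names when plain object names are passed; an expression such as head(a) would be deparsed rather than returned as a single tidy name.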
Here's how to use f (note that the new column is itself named after the data frame, rather than 'newvar'):
list2env(f(data1,data2), .GlobalEnv)
# <environment: R_GlobalEnv>
str(data1)
# 'data.frame': 18 obs. of 3 variables:
# $ Product: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
# $ Day : Factor w/ 7 levels "Friday","Monday",..: 2 6 7 5 1 3 2 6 7 5 ...
# $ data1 : chr "data1" "data1" "data1" "data1" ...
str(data2)
# 'data.frame': 18 obs. of 3 variables:
# $ Product2: Factor w/ 3 levels "X","Y","Z": 3 3 3 3 3 3 3 3 3 3 ...
# $ Day2 : Factor w/ 7 levels "Friday","Monday",..: 2 6 7 5 1 3 2 6 7 5 ...
# $ data2 : chr "data2" "data2" "data2" "data2" ...
If you had a large number of named objects that you wanted to pass without listing them explicitly in f(), you could do something like:
list2env(do.call(f, sapply(ls(pattern = "data"), as.name)), .GlobalEnv)
which would have the same result.
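One caveat: ls(pattern = "data") matches every object whose name contains "data", whether or not it is a data frame. A possible guard, sketched here with Filter and is.data.frame instead of the answer's f(), keeps only data frames before adding the column. The environment env and its contents are assumptions for illustration, and the column is called newvar as in the question.

```r
# A private environment holding two data frames and one unrelated object (toy data)
env <- new.env()
env$data1 <- data.frame(Product = c("A", "B"))
env$data2 <- data.frame(Product2 = c("Z", "Y"))
env$data_note <- "not a data frame"

# Keep only the objects that really are data frames
objs <- Filter(is.data.frame, mget(ls(env), envir = env))

# Add a 'newvar' column holding each data frame's own name
out <- Map(function(x, nm) {
  x$newvar <- nm
  x
}, objs, names(objs))

names(out)
```

The result can be passed to list2env in the same way as before if the original objects need to be replaced.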