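I am working with a data frame whose columns are the company name, the division name, and then all_production_2017, bad_production_2017, and so on, going back many years.
I am writing a function that takes a company name and a year as arguments and summarizes that company's production for that year, sorted in descending order of all_production_<year>.
I have converted the year to a string and filtered the rows and columns I need. But how do I sort by that specific column? I don't know how to refer to the column name, because the year argument is its suffix.
Here is a sketch of my data frame's structure.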
structure(list(company = c("DLT", "DLT", "DLT", "MSF", "MSF", "MSF"), division = c("Marketing", "CHANG1", "CAHNG2", "MARKETING", "CHANG1M", "CHANG2M"), all_production_2000 = c(15, 25, 25, 10, 25, 18), good_production_2000 = c(10, 24, 10, 8, 10, 10), bad_production_2000 = c(2, 1, 2, 1, 3, 5)))
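The data runs from 2000 to 2017. I want to write a function that takes a company name and a year, keeps only that company and the columns for that year, and sorts the result by all_production for that year in descending order.
This is what I have so far.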
ExportCompanyYear <- function(company.name, year){
year.string <- toString(year)
x <- filter(company.data, company == company.name) %>%
select(company, division, contains(year.string))
}
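I just don't know how to sort in descending order, because I don't know how to refer to a column whose name contains the year argument.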
Answer 0 (score: 0)
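The OP seems to have provided very simple sample data, containing only the year 2000.
One possible solution:
1. Convert the list to a data.frame.
2. Use gather from tidyr to reshape the data frame into a long form on which a filter can be applied.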
ll <- structure(list(company = c("DLT", "DLT", "DLT", "MSF", "MSF", "MSF"),
division = c("Marketing", "CHANG1", "CAHNG2", "MARKETING", "CHANG1M",
"CHANG2M"), all_production_2000 = c(15, 25, 25, 10, 25, 18),
good_production_2000 = c(10, 24, 10, 8, 10, 10),
bad_production_2000 = c(2, 1, 2, 1, 3, 5)))
df <- as.data.frame(ll)
library(tidyr)
gather(df, key = "key", value = "value", -c("company", "division"))
#result:
#   company  division                  key value
#1      DLT Marketing  all_production_2000    15
#2      DLT    CHANG1  all_production_2000    25
#3      DLT    CAHNG2  all_production_2000    25
#4      MSF MARKETING  all_production_2000    10
#5      MSF   CHANG1M  all_production_2000    25
#6      MSF   CHANG2M  all_production_2000    18
#7      DLT Marketing good_production_2000    10
#8      DLT    CHANG1 good_production_2000    24
#9      DLT    CAHNG2 good_production_2000    10
#10     MSF MARKETING good_production_2000     8
#11     MSF   CHANG1M good_production_2000    10
#12     MSF   CHANG2M good_production_2000    10
#13     DLT Marketing  bad_production_2000     2
#14     DLT    CHANG1  bad_production_2000     1
#15     DLT    CAHNG2  bad_production_2000     2
#16     MSF MARKETING  bad_production_2000     1
#17     MSF   CHANG1M  bad_production_2000     3
#18     MSF   CHANG2M  bad_production_2000     5
Now a filter can easily be applied to the data.frame above.
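To make that last step concrete, here is a minimal sketch of filtering and sorting on the gathered data. The helper name export_company_year and the column names year and measure are made up for illustration; it assumes every value column ends in an underscore followed by the year, as in the sample data. gather and spread still exist in tidyr, though newer releases recommend pivot_longer and pivot_wider instead.

```r
library(dplyr)
library(tidyr)

# Sample data from the question (year 2000 only).
df <- data.frame(
  company = c("DLT", "DLT", "DLT", "MSF", "MSF", "MSF"),
  division = c("Marketing", "CHANG1", "CAHNG2", "MARKETING", "CHANG1M", "CHANG2M"),
  all_production_2000 = c(15, 25, 25, 10, 25, 18),
  good_production_2000 = c(10, 24, 10, 8, 10, 10),
  bad_production_2000 = c(2, 1, 2, 1, 3, 5),
  stringsAsFactors = FALSE
)

# Hypothetical helper: go long, filter, then go back to one column per measure.
export_company_year <- function(data, company.name, yearval) {
  data %>%
    gather(key = "key", value = "value", -company, -division) %>%
    mutate(
      year = as.integer(sub(".*_", "", key)),  # text after the last underscore
      measure = sub("_[0-9]+$", "", key)       # column name without the year suffix
    ) %>%
    filter(company == company.name, year == yearval) %>%
    select(-key) %>%
    spread(key = measure, value = value) %>%   # one column per measure again
    arrange(desc(all_production))
}

res <- export_company_year(df, "DLT", 2000)
print(res)
```

Because the year lives in its own column after reshaping, the sort column is always called all_production and no string manipulation of column names is needed inside arrange.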
Answer 1 (score: 0)
You definitely need to reshape your data so that the year can be passed as a parameter.
To create a reproducible example, I have added another year, 2001, to the data.
df = data.frame(company = c("DLT", "DLT", "DLT", "MSF", "MSF", "MSF"), division = c("Marketing", "CHANG1", "CAHNG2", "MARKETING", "CHANG1M", "CHANG2M"), all_production_2000 = c(15, 25, 25, 10, 25, 18), good_production_2000 = c(10, 24, 10, 8, 10, 10), bad_production_2000 = c(2, 1, 2, 1, 3, 5),all_production_2001 = 2*c(15, 25, 25, 10, 25, 18), good_production_2001 = 2*c(10, 24, 10, 8, 10, 10), bad_production_2001 = 2*c(2, 1, 2, 1, 3, 5))
Now you can reshape the data using the reshape function in R.
Here, the variables "all_production", "good_production" and "bad_production" vary with time, and the year is what distinguishes their columns.
So we specify v.names = c("all_production","good_production","bad_production") and pass varying as a list with one vector of column names per variable, in the same order as v.names. (With a plain vector such as names(df)[3:8], reshape groups the columns by the alphabetically sorted v.names, which would swap good_production and bad_production here.)
vnames = c("all_production","good_production","bad_production")
df2 = reshape(df,direction="long",
v.names = vnames,
varying = lapply(vnames, function(v) grep(paste0("^", v, "_"), names(df), value = TRUE)),
idvar = c("company","division"),
timevar = "year",times = c(2000,2001))
For your data.frame you only need to change times to 2000:2017. The varying list is built from the column names, so it picks up the extra years, provided the columns of each variable appear in year order.
>df2
                   company  division year all_production good_production bad_production
DLT.Marketing.2000     DLT Marketing 2000             15              10              2
DLT.CHANG1.2000        DLT    CHANG1 2000             25              24              1
DLT.CAHNG2.2000        DLT    CAHNG2 2000             25              10              2
MSF.MARKETING.2000     MSF MARKETING 2000             10               8              1
MSF.CHANG1M.2000       MSF   CHANG1M 2000             25              10              3
MSF.CHANG2M.2000       MSF   CHANG2M 2000             18              10              5
DLT.Marketing.2001     DLT Marketing 2001             30              20              4
DLT.CHANG1.2001        DLT    CHANG1 2001             50              48              2
DLT.CAHNG2.2001        DLT    CAHNG2 2001             50              20              4
MSF.MARKETING.2001     MSF MARKETING 2001             20              16              2
MSF.CHANG1M.2001       MSF   CHANG1M 2001             50              20              6
MSF.CHANG2M.2001       MSF   CHANG2M 2001             36              20             10
Now you can filter and sort like this:
library(dplyr)
somefunc<-function(company.name,yearval){
df2%>%filter(company==company.name,year==yearval)%>%arrange(desc(all_production))
}
>somefunc("DLT",2001)
  company  division year all_production good_production bad_production
1     DLT    CHANG1 2001             50              48              2
2     DLT    CAHNG2 2001             50              20              4
3     DLT Marketing 2001             30              20              4
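If reshaping is not an option, the sort asked about in the question can also be done on the wide data by building the column name as a string. The following is only a sketch: it assumes a dplyr version that provides the .data pronoun, it reuses the ExportCompanyYear name from the question, and the data argument and the sort.col variable are additions made for illustration.

```r
library(dplyr)

# Wide sample data with the years 2000 and 2001, as in the example above.
company.data <- data.frame(
  company = c("DLT", "DLT", "DLT", "MSF", "MSF", "MSF"),
  division = c("Marketing", "CHANG1", "CAHNG2", "MARKETING", "CHANG1M", "CHANG2M"),
  all_production_2000 = c(15, 25, 25, 10, 25, 18),
  good_production_2000 = c(10, 24, 10, 8, 10, 10),
  bad_production_2000 = c(2, 1, 2, 1, 3, 5),
  all_production_2001 = 2 * c(15, 25, 25, 10, 25, 18),
  good_production_2001 = 2 * c(10, 24, 10, 8, 10, 10),
  bad_production_2001 = 2 * c(2, 1, 2, 1, 3, 5),
  stringsAsFactors = FALSE
)

ExportCompanyYear <- function(company.name, year, data = company.data) {
  # Build the name of the sort column from the year argument.
  sort.col <- paste0("all_production_", year)
  data %>%
    filter(company == company.name) %>%
    # Keep the id columns plus every column ending in "_<year>".
    select(company, division, ends_with(paste0("_", year))) %>%
    # .data[[...]] looks a column up by a name held in a string.
    arrange(desc(.data[[sort.col]]))
}

res <- ExportCompanyYear("MSF", 2001)
print(res)
```

This keeps the original wide layout, at the cost of string handling each time a year-specific column is needed. The reshaped form above avoids that, because year becomes an ordinary column.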