我正在尝试创建以下公式:
Interest expense / (Total Debt(for all years)) / # number of years
数据如下:
GE2017 GE2016 GE2015 GE2014
Interest Expense -2753000 -2026000 -1706000 -1579000
Long Term Debt 108575000 105080000 144659000 186596000
Short/Current Long Term Debt 134591000 136211000 197602000 261424000
Total_Debt 243166000 241291000 342261000 448020000
GOOG2017 GOOG2016 GOOG2015 GOOG2014
Interest Expense -109000 -124000 -104000 -101000
Long Term Debt 3943000 3935000 1995000 2992000
Short/Current Long Term Debt 3969000 3935000 7648000 8015000
Total_Debt 7912000 7870000 9643000 11007000
NVDA2018 NVDA2017 NVDA2016 NVDA2015
Interest Expense -61000 -58000 -47000 -46000
Long Term Debt 1985000 1985000 7000 1384000
Short/Current Long Term Debt 2000000 2791000 1434000 1398000
Total_Debt 3985000 4776000 1441000 2782000
也就是说,对于GE
,我正在尝试将最近一年-2753000
的利息费用除以Total Debt
所有4年的平均值GE
所以;
-2753000 / AVERAGE(243166000 + 241291000 + 342261000 + 448020000) = 0.0086
但是,从group_by()
开始取平均值时,我遇到了GE
的问题,并且由于年份不同,其他公司的列名也不同。
cost_of_debt %>%
t() %>%
data.frame() %>%
rownames_to_column('rn') %>%
group_by(rn)
#Calcualtion here
第二;如果可能的话,我想进行与上述相同的计算,但只使用每个公司的最后两年。
-2753000 / AVERAGE(243166000 + 241291000) = 0.01136
也许grepl
函数可以在这里工作吗?
我有一个称为symbols
的向量。
symbols <- c("NVDA", "GOOG", "GE")
数据:
cost_of_debt <- structure(list(GE2017 = c(-2753000, 108575000, 134591000, 243166000
), GE2016 = c(-2026000, 105080000, 136211000, 241291000), GE2015 = c(-1706000,
144659000, 197602000, 342261000), GE2014 = c(-1579000, 186596000,
261424000, 448020000), GOOG2017 = c(-109000, 3943000, 3969000,
7912000), GOOG2016 = c(-124000, 3935000, 3935000, 7870000), GOOG2015 = c(-104000,
1995000, 7648000, 9643000), GOOG2014 = c(-101000, 2992000, 8015000,
11007000), NVDA2018 = c(-61000, 1985000, 2e+06, 3985000), NVDA2017 = c(-58000,
1985000, 2791000, 4776000), NVDA2016 = c(-47000, 7000, 1434000,
1441000), NVDA2015 = c(-46000, 1384000, 1398000, 2782000)), .Names = c("GE2017",
"GE2016", "GE2015", "GE2014", "GOOG2017", "GOOG2016", "GOOG2015",
"GOOG2014", "NVDA2018", "NVDA2017", "NVDA2016", "NVDA2015"), row.names = c("Interest Expense",
"Long Term Debt", "Short/Current Long Term Debt", "Total_Debt"
), class = "data.frame")
答案 0 :(得分:1)
对于第一种情况,在将行名称创建为列(rownames_to_column
-从tibble
开始)后,通过在行的开头之间的交界处将其分隔为“ firm”和“ year”。 “年份”和公司的末尾名称(按“公司”分组)通过使用“ Interest.Expense”的比例与{Total_Debt”的mean
值来创建“ New”列。然后,我们可以arrange
按“年”,为每个“公司”获取最后两个“总计债务”中的mean
,然后除以“ Interest.Expense”
library(dplyr)
cost_of_debt %>%
t() %>%
data.frame() %>%
rownames_to_column('rn') %>%
separate(rn, into = c("firm", "year"),
"(?<=[A-Z])(?=[0-9])", convert = TRUE) %>%
group_by(firm) %>%
mutate(New = Interest.Expense/mean(Total_Debt)) %>%
arrange(firm, year) %>%
mutate(NewLast = Interest.Expense/mean(tail(Total_Debt, 2)))
答案 1 :(得分:1)
我认为您需要首先清除数据,以便更容易理解什么是观察值和什么是变量。 Google整洁的数据:)这是我的解决方案。首先,我要使数据整洁,然后进行简单的计算。
library(tidyverse)
library(stringr)
), class = "data.frame")
# Clean and make the data tidy
cost_of_debt <- cost_of_debt %>%
as_tibble() %>%
rownames_to_column(var = "indicator") %>%
mutate(indicator = str_replace_all(indicator, regex("\\s|\\/"), "_")) %>%
gather(k, value, -indicator) %>%
separate(k, into = c("company", "year"), -4) %>%
spread(indicator, value) %>%
rename_all(tolower)
结果数据如下:
company year interest_expense long_term_debt short_current_long_term_debt total_debt
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 GE 2014 -1579000 186596000 261424000 448020000
2 GE 2015 -1706000 144659000 197602000 342261000
3 GE 2016 -2026000 105080000 136211000 241291000
4 GE 2017 -2753000 108575000 134591000 243166000
5 GOOG 2014 -101000 2992000 8015000 11007000
然后我们可以回答您的问题:
cost_of_debt <- cost_of_debt %>%
group_by(company) %>%
mutate(int_over_totdept4 = interest_expense / mean(total_debt),
int_over_totdept2 = interest_expense / mean(total_debt[year %in% c("2017", "2016")]))
哪个给出了数据框(新变量最右侧):
company year interest_expense long_term_debt short_current_long_term_debt total_debt int_over_totdept4 int_over_totdept2
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 GE 2014 -1579000 186596000 261424000 448020000 -0.00495 -0.00652
2 GE 2015 -1706000 144659000 197602000 342261000 -0.00535 -0.00704
3 GE 2016 -2026000 105080000 136211000 241291000 -0.00636 -0.00836
4 GE 2017 -2753000 108575000 134591000 243166000 -0.00864 -0.0114
5 GOOG 2014 -101000 2992000 8015000 11007000 -0.0111 -0.0128
并且如果您想总结问题的形式,
# First question:
cost_of_debt %>% filter(company == "GE", year == "2017") %>% select(company, year, int_over_totdept4)
# Second question:
cost_of_debt %>% filter(year == "2017") %>% select(company, year, int_over_totdept2)