I have the following dataset of investment returns:
df <- structure(list(Date = structure(c(1620604800, 1620604800, 1620604800,
1620604800, 1620604800, 1627257600, 1627257600), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), Client = c(56203, 56203, 56203, 56203,
56203, 56203, 56203), FundName = c("SAFRA KEPLER EQUITY HEDGE FI MULTIMERCADO",
"SAFRA KEPLER FI MULTIMERCADO", "SAFRA S&P SPECIAL FIC MULTIMERCADO",
"SAFRA AÇÕES LIVRE FIC AÇÕES", "SAFRA CONSUMO AMERICANO FIC AÇÕES BDR NÍVEL I",
"SAFRA S&P SPECIAL FIC MULTIMERCADO", "SAFRA CONSUMO AMERICANO FIC AÇÕES BDR NÍVEL I"
), Nature = c("Aplicação", "Aplicação", "Aplicação", "Aplicação",
"Aplicação", "Resgate", "Resgate"), Quantity = c(145.927569,
62.684383, 16.964545, 57.852278, 14.900635, 0, 0), Value = c(21240,
12120, 8760, 9120, 8760, -9157.45, -1125), Saldo = c(21240, 12120,
8760, 9120, 8760, 55.60576570794, 9198.89221970512), FundCode = c(25079578000106,
21144577000147, 19107923000175, 32666326000149, 19436818000180,
19107923000175, 19436818000180), NAVInitialDate = c(145.551660000114,
193.349593000021, 521.980031999759, 157.489282999886, 639.441026999615,
543.077092000283, 692.849144999869), NAVYesterday = c(147.526190999895,
196.017907999922, 531.332205999643, 166.330892999889, 675.641246000305,
531.332205999643, 675.641246000305)), row.names = c(NA, -7L), class = c("tbl_df",
"tbl", "data.frame"))
df$NAVInitialDate = df$Value / df$Quantity
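The "NAVInitialDate" column is the result of Value / Quantity for each row. What I want to tell R is: "if a row has Quantity = 0, then 'NAVInitialDate' = ABS(Value) divided by the sum of Quantity over all rows that have the same FundName as the row where Quantity equals 0".
The result should be: for row 6, NAVInitialDate = 539,793; for row 7, NAVInitialDate = 75,50 (both written with a decimal comma).
How can I do this? I imagine it may need 3 conditions.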
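Answer 0: (score: 2)
We can use dplyr: group_by FundName, then inside mutate use case_when so that, where Quantity == 0, NAVInitialDate is abs(Value) divided by the sum of all Quantity within the group.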
library(dplyr)
# Rows with Quantity != 0 match no case_when() branch, so they come out as NA
df %>%
  group_by(FundName) %>%
  mutate(NAVInitialDate = case_when(Quantity == 0 ~ abs(Value) / sum(Quantity, na.rm = TRUE)))
# A tibble: 7 x 10
# Groups: FundName [5]
Date Client FundName Nature Quantity Value Saldo FundCode NAVInitialDate
<dttm> <dbl> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 2021-05-10 00:00:00 56203 SAFRA KEPLER … Aplic… 146. 21240 2.12e4 2.51e13 NA
2 2021-05-10 00:00:00 56203 SAFRA KEPLER … Aplic… 62.7 12120 1.21e4 2.11e13 NA
3 2021-05-10 00:00:00 56203 SAFRA S&P SPE… Aplic… 17.0 8760 8.76e3 1.91e13 NA
4 2021-05-10 00:00:00 56203 SAFRA AÇÕES L… Aplic… 57.9 9120 9.12e3 3.27e13 NA
5 2021-05-10 00:00:00 56203 SAFRA CONSUMO… Aplic… 14.9 8760 8.76e3 1.94e13 NA
6 2021-07-26 00:00:00 56203 SAFRA S&P SPE… Resga… 0 -9157. 5.56e1 1.91e13 540.
7 2021-07-26 00:00:00 56203 SAFRA CONSUMO… Resga… 0 -1125 9.20e3 1.94e13 75.5
# … with 1 more variable: NAVYesterday <dbl>
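The output above has NA in every row where Quantity is non-zero, because case_when has no fallback branch. Below is a minimal sketch of one way to keep Value / Quantity for those rows, assuming that is what is wanted. The data frame toy and its values are hypothetical and only serve to keep the example self-contained.

```r
library(dplyr)

# Hypothetical toy data: fund "A" has one purchase and one zero-quantity redemption
toy <- data.frame(
  FundName = c("A", "B", "A"),
  Quantity = c(10, 5, 0),
  Value    = c(1000, 250, -300)
)

toy_nav <- toy %>%
  group_by(FundName) %>%
  mutate(NAVInitialDate = case_when(
    # zero-quantity rows: abs(Value) over the fund's total Quantity
    Quantity == 0 ~ abs(Value) / sum(Quantity, na.rm = TRUE),
    # every other row keeps the plain ratio
    TRUE          ~ Value / Quantity
  )) %>%
  ungroup()

print(toy_nav)
```

The TRUE branch acts as the "otherwise" case; ungroup() is added so that later operations on the result are not silently done per fund.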
Answer 1: (score: 2)
Try:
# Per-row total of Quantity within the same FundName (a single global sum would mix funds)
sum_0 <- ave(df$Quantity, df$FundName, FUN = sum)
df$NAVInitialDate <- ifelse(df$Quantity == 0, abs(df$Value) / sum_0, df$Value / df$Quantity)
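For a base R approach the divisor has to be a per-fund total, since more than one fund can have a zero-quantity row. The sketch below shows one way to get such a total with ave(), on hypothetical toy data (the names toy and qty_by_fund are illustrative and not from the original post).

```r
# Hypothetical toy data (not the original dataset): two funds, each with a redemption row
toy <- data.frame(
  FundName = c("A", "B", "A", "B"),
  Quantity = c(10, 5, 0, 0),
  Value    = c(1000, 250, -300, -100)
)

# ave() returns, for each row, the sum of Quantity within that row's FundName
qty_by_fund <- ave(toy$Quantity, toy$FundName, FUN = sum)

# Zero-quantity rows use the fund total; all other rows keep Value / Quantity
toy$NAVInitialDate <- ifelse(toy$Quantity == 0,
                             abs(toy$Value) / qty_by_fund,
                             toy$Value / toy$Quantity)

print(toy)
```

Because ave() returns a vector aligned with the rows of the data frame, each redemption row is divided by the total of its own fund only.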
Answer 2: (score: 1)
Try
library(tidyverse)
df <- df %>%
  group_by(FundName) %>%
  mutate(sum_Quantity = sum(Quantity),
         mycol = ifelse(Quantity == 0, abs(Value) / sum_Quantity, Value / Quantity))
Here you group by FundName and apply a simple mutate command.
To check, let's look at the rows/columns of interest:
df %>% filter(Quantity==0) %>% select(FundName, Quantity, NAVInitialDate, mycol)
# A tibble: 2 x 4
# Groups: FundName [2]
FundName Quantity NAVInitialDate mycol
<chr> <dbl> <dbl> <dbl>
1 SAFRA S&P SPECIAL FIC MULTIMERCADO 0 -Inf 540.
2 SAFRA CONSUMO AMERICANO FIC AÇÕES BDR NÍVEL I 0 -Inf 75.5
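As a quick arithmetic cross-check, the two values can be recomputed by hand from the figures in the dataset. This is only a sketch; the names nav_row6 and nav_row7 are illustrative and not part of the original answers.

```r
# Figures copied from the dataset: each redemption Value and the Quantity
# purchased earlier in the same fund
nav_row6 <- abs(-9157.45) / 16.964545  # row 6
nav_row7 <- abs(-1125) / 14.900635     # row 7

print(round(c(nav_row6, nav_row7), 1))
```

These round to 539.8 and 75.5, in line with the 540. and 75.5 that the tibble output displays at three significant digits.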