I have a dataset named "yield" with the columns fruits, year, and count.
I want to determine which fruit had the largest rate of change between 2008 and 2010. The closest I have come is:
diff(yield$count)/yield[-nrow(yield),] * 100
but this not only affects my fruits and year columns, the results are also incorrect.
Answer 0 (score: 1)
Based on your formula, I think this dplyr solution works. You need to group by fruit and then sort by year so that lag works correctly:
library(dplyr)
yield %>%
group_by(fruits) %>%
arrange(fruits, year) %>%
mutate(rate = 100 * (count - lag(count))/lag(count)) %>%
ungroup()
# A tibble: 9 x 4
  fruits   year count   rate
  <fct>   <int> <dbl>  <dbl>
1 apples   2008 10.0    NA
2 apples   2009 13.0    30.0
3 apples   2010  7.00  -46.2
4 oranges  2008  5.00   NA
5 oranges  2009 12.0   140
6 oranges  2010 14.0    16.7
7 pears    2008 16.0    NA
8 pears    2009 18.0    12.5
9 pears    2010 20.0    11.1
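The rates above are year over year. To answer which fruit changed most across the whole 2008 to 2010 span, one option is to compare the first and last count per fruit. This is only a minimal base R sketch, assuming the sample data from the data.table answer; the names first_last_change, overall, and largest are made up for illustration and are not part of the original answer.

```r
# Sample data, copied from the data frame in the data.table answer.
yield <- data.frame(fruits = rep(c("apples", "oranges", "pears"), each = 3),
                    year   = rep(2008:2010, 3),
                    count  = c(10, 13, 7, 5, 12, 14, 16, 18, 20))

# Sort so the first/last row within each fruit is the earliest/latest year.
yield <- yield[order(yield$fruits, yield$year), ]

# Hypothetical helper: percent change from the first to the last value.
first_last_change <- function(x) 100 * (x[length(x)] - x[1]) / x[1]

# One value per fruit: overall change from 2008 to 2010.
overall <- tapply(yield$count, yield$fruits, first_last_change)

# Fruit with the largest overall change in absolute terms.
largest <- names(which.max(abs(overall)))
print(overall)
print(largest)
```

With this sample data the overall changes are -30 for apples, 180 for oranges, and 25 for pears, so oranges changed most. Using abs() is a design choice: drop it if only increases should count.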
Answer 1 (score: 1)
For completeness, here is the same thing as a data.table one-liner.
R> library(data.table)
R> df <- data.frame(fruits=rep(c("apples", "oranges", "pears"), each=3),
+                  year=rep(2008:2010, 3),
+                  count=c(10,13,7,5,12,14,16,18,20))
R> dt <- as.data.table(df)
R> dt
    fruits year count
1:  apples 2008    10
2:  apples 2009    13
3:  apples 2010     7
4: oranges 2008     5
5: oranges 2009    12
6: oranges 2010    14
7:   pears 2008    16
8:   pears 2009    18
9:   pears 2010    20
R>
R> dt[ , .(year, change=100*(count-shift(count,1))/shift(count,1)), by=fruits]
    fruits year   change
1:  apples 2008       NA
2:  apples 2009  30.0000
3:  apples 2010 -46.1538
4: oranges 2008       NA
5: oranges 2009 140.0000
6: oranges 2010  16.6667
7:   pears 2008       NA
8:   pears 2009  12.5000
9:   pears 2010  11.1111
R>
We group with by=fruits, show year within each block, and compute the desired rate of change as 100*(current-prev)/prev. We use shift() to lag the count series.
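To make the lagging step concrete, here is a rough base R sketch of what a lag of one does to a single fruit's counts. The helper lag_one is hypothetical and only mimics the effect of shift(count, 1) for a plain numeric vector; it is not how data.table implements it.

```r
# Hypothetical helper: lag by one position, i.e. drop the last value
# and pad the front with NA so the length stays the same.
lag_one <- function(x) c(NA, head(x, -1))

count <- c(10, 13, 7)    # apples, 2008 to 2010
prev  <- lag_one(count)  # NA 10 13

# Same formula as in the answer: 100 * (current - prev) / prev.
change <- 100 * (count - prev) / prev
print(change)
```

The first element is NA because 2008 has no previous year to compare against, which matches the NA rows in the output above.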
Answer 2 (score: 0)
Base R one-liner:
yield$roc <- with(yield, ave(count, fruits, FUN = function(x){c(0, c(diff(x), 0)/x)[1:length(x)]}))
Also in base R, if you want NA instead of 0 and actual percentages (i.e. * 100):
yield$roc <- with(yield, ave(count, fruits,
FUN = function(x){c(NA_real_, c(diff(x), 0)/x)[1:length(x)] * 100}))