Time series plot of weighted frequencies of a character variable in R

Time: 2018-12-03 09:26:00

Tags: r plot time-series

Let's assume this dataset:

answer <- c("a", "b", "b", NA, "a", "b", "a", "b", "a", NA, "a", "b")
weights <- c(0.1, 0.3, 0.2, 1.1, 0.3, 0.8, 0.9, 1.5, 0.9, 0.2, 0.15, 0.13)
year <- c(2001, 2005, 2010)
data <- cbind(answer,weights,year)

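I would like a time series plot that shows the weighted frequencies of the possible answers (a, b). NAs should be omitted.
Any idea how to do this? Thanks in advance!

If I should rephrase my question, please let me know. I'm new to the community...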

1 Answer:

Answer 0 (score: 0)

Welcome! You can tidy the data and manage it with dplyr, then plot it with ggplot2, all wrapped in a magrittr chain using the %>% (pipe) operator. Read up on these packages; they are very useful.

library(dplyr)
library(ggplot2)

data %>%
  # remove NAs
  filter(!is.na(answer)) %>%
  # group by answer and year
  group_by(answer, year) %>%
  # add a column with the sums per year/answer: you can use other functions
  summarise(weights = sum(weights)) %>%
  # now the plot
  ggplot(., aes(x = sprintf("%.0f", year),  # sprintf to drop the decimals from the years
                y = weights, colour = answer, group = answer)) +
  geom_line() +  # add lines
  labs(x = "year", y = "summed weights")  # rename the axes

PS

I used data.frame instead of cbind to store the data, for example:

data <- data.frame(answer, weights, year)
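The difference between cbind and data.frame mentioned in the PS can be checked directly. The following is a minimal base-R sketch that assumes the vectors from the question; the names `m`, `df`, `df_ok`, and `sums` are introduced here only for illustration. It also recomputes the per-year sums without any packages and, on the assumption that "weighted frequency" may mean a share rather than a sum, adds each answer's share of the total weight within a year.

```r
answer  <- c("a", "b", "b", NA, "a", "b", "a", "b", "a", NA, "a", "b")
weights <- c(0.1, 0.3, 0.2, 1.1, 0.3, 0.8, 0.9, 1.5, 0.9, 0.2, 0.15, 0.13)
year    <- c(2001, 2005, 2010)  # recycled to length 12 in both calls below

# cbind() on mixed vectors coerces everything to a character matrix,
# so the weights are no longer numeric
m <- cbind(answer, weights, year)
is.character(m)         # TRUE

# data.frame() keeps each column's type
df <- data.frame(answer, weights, year)
is.numeric(df$weights)  # TRUE

# Drop rows with a missing answer, then sum the weights per answer and year
df_ok <- df[!is.na(df$answer), ]
sums  <- aggregate(weights ~ answer + year, data = df_ok, FUN = sum)

# Assumption: "weighted frequency" as each answer's share within its year
sums$share <- sums$weights / ave(sums$weights, sums$year, FUN = sum)
sums
```

Note that with the recycled years, answer "b" has no observations in 2001, so that combination is absent from `sums` and the corresponding line in a plot would start at 2005.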