I want to plot segmented data in R. That is, I have data in the following format:

| Product | Date | Origination | Rate | Num | Balance |
|-----------------------|--------|-------------|------|-----|-----------|
| DEMAND DEPOSITS | 200505 | 198209 | 0 | 1 | 2586.25 |
| DEMAND DEPOSITS | 200505 | 198304 | 0 | 1 | 3557.73 |
| DEMAND DEPOSITS | 200505 | 198308 | 0 | 1 | 14923.72 |
| DEMAND DEPOSITS | 200505 | 198401 | 0 | 1 | 4431.67 |
| DEMAND DEPOSITS | 200505 | 198410 | 0 | 1 | 44555.23 |
| MONEY MARKET ACCOUNTS | 200505 | 198209 | 0.25 | 2 | 65710.01 |
| MONEY MARKET ACCOUNTS | 200505 | 198211 | 0.25 | 2 | 41218.41 |
| MONEY MARKET ACCOUNTS | 200505 | 198304 | 0.25 | 1 | 61421.2 |
| MONEY MARKET ACCOUNTS | 200505 | 198402 | 0.25 | 1 | 13620.17 |
| MONEY MARKET ACCOUNTS | 200505 | 198408 | 0.75 | 1 | 281897.74 |
| MONEY MARKET ACCOUNTS | 200505 | 198410 | 0.25 | 1 | 5131.33 |
| NOW ACCOUNTS | 200505 | 198209 | 0 | 1 | 142744.35 |
| NOW ACCOUNTS | 200505 | 198303 | 0 | 1 | 12191.6 |
| SAVING ACCOUNTS | 200505 | 198301 | 0.25 | 1 | 96936.24 |
| SAVING ACCOUNTS | 200505 | 198302 | 0.25 | 2 | 21764 |
| SAVING ACCOUNTS | 200505 | 198304 | 0.25 | 1 | 14646.55 |
| SAVING ACCOUNTS | 200505 | 198305 | 0.25 | 1 | 20909.7 |
| SAVING ACCOUNTS | 200505 | 198306 | 0.25 | 1 | 66434.56 |
| SAVING ACCOUNTS | 200505 | 198309 | 0.25 | 1 | 20005.56 |
| SAVING ACCOUNTS | 200505 | 198404 | 0.25 | 2 | 16766.56 |
| SAVING ACCOUNTS | 200505 | 198407 | 0.25 | 1 | 47721.97 |
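
I want to plot one line per 'Product' type, with 'Balance' on the Y axis and 'Origination' on the X axis. Ideally I would also like to set colors to distinguish the lines. The data is not currently in data.frame form, so let me know if I need to change it back.
I have not been able to find an informative solution online, even though I am sure one exists.
Thanks,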
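Answer 0 (score: 1)
As @zx8754 said, you should provide reproducible data. Without having tested the code (since there is no reproducible data), I suggest the following, assuming the data is in a data.frame named 'data':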
all_products <- unique(data$Product)
colors_use <- rainbow(length(all_products))  # one colour per product
# Draw the first product, with axis limits that cover all products
plot(y = data[data$Product == all_products[1], "Balance"],
     x = data[data$Product == all_products[1], "Origination"],
     type = "l",
     col = colors_use[1],
     xlab = "Origination", ylab = "Balance",
     ylim = range(data$Balance, na.rm = TRUE),
     xlim = range(data$Origination, na.rm = TRUE))
# Add the remaining products; seq_along(...)[-1] is empty if there is only one product
for (i_product in seq_along(all_products)[-1]) {
  lines(y = data[data$Product == all_products[i_product], "Balance"],
        x = data[data$Product == all_products[i_product], "Origination"],
        col = colors_use[i_product])
}
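Since the question has no reproducible data, here is a minimal sketch that builds a small data.frame from a few rows of the table in the question and draws one line per product. The name sample_data is made up for illustration, only a subset of the rows is included, and the empty-plot-then-lines layout is just one possible way to arrange the calls.

```r
# Hypothetical sample: a few rows copied from the table in the question
sample_data <- data.frame(
  Product = c(rep("DEMAND DEPOSITS", 3), rep("NOW ACCOUNTS", 2),
              rep("SAVING ACCOUNTS", 3)),
  Origination = c(198209, 198304, 198308, 198209, 198303,
                  198301, 198302, 198304),
  Balance = c(2586.25, 3557.73, 14923.72, 142744.35, 12191.60,
              96936.24, 21764.00, 14646.55),
  stringsAsFactors = FALSE
)

# Sort so that each line is drawn from left to right
sample_data <- sample_data[order(sample_data$Product, sample_data$Origination), ]

# One data.frame per product, one colour per product
by_product <- split(sample_data, sample_data$Product)
line_colors <- rainbow(length(by_product))

# Empty plot sized to all of the data, then one line per product
plot(Balance ~ Origination, data = sample_data, type = "n")
for (i in seq_along(by_product)) {
  lines(Balance ~ Origination, data = by_product[[i]], col = line_colors[i])
}
legend("topright", legend = names(by_product), col = line_colors, lty = 1)
```

If the real data is not yet a data.frame, as.data.frame() on it is usually enough before running code like this.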
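Answer 1 (score: 1)
I don't have enough reputation to comment, so I am writing this as an answer. To shorten @tobiasegli_te's answer, the first plot call can be plot(Balance ~ Origination, data = data, type = 'n'), followed by lines calls for i_product in 1:length(all_products). That way you don't have to worry about ylim. Here is an example using the Grunfeld data.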
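z <- read.csv('http://statmath.wu-wien.ac.at/~zeileis/grunfeld/Grunfeld.csv')
# Integer code per firm; works whether firm is read as character (R >= 4.0) or factor
firm_id <- as.integer(factor(z$firm))
# Empty plot whose axes cover all firms, then one line per firm
plot(invest ~ year, data = z, type = 'n')
for (i in unique(firm_id)) lines(invest ~ year, data = z[firm_id == i, ], col = i)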
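Also note that Origination, treated as a number, is not evenly spaced: 198212 and 198301 are one month apart but differ by 89. You should convert it to Date or similar.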
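One way to do that conversion, as a minimal sketch: assuming Origination holds year-month values of the form YYYYMM (as the sample rows suggest), append a day of 01 and parse the result with as.Date. The helper name yyyymm_to_date is made up for this example.

```r
# Hypothetical helper: convert YYYYMM values (e.g. 198209) to the first day of that month
yyyymm_to_date <- function(x) {
  # sprintf avoids scientific notation when turning the number into text
  as.Date(sprintf("%06d01", as.integer(x)), format = "%Y%m%d")
}

orig <- c(198212, 198301)
diff(orig)                         # numeric gap between two consecutive months: 89
orig_date <- yyyymm_to_date(orig)
diff(orig_date)                    # gap in days: 31
```

The converted column can then be used on the x-axis in place of the raw Origination values.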
Answer 2 (score: 0)
I think you want something like the following:
df <- as.data.frame(df[c('Product', 'Balance', 'Origination')])
head(df)
                Product  Balance Origination
1       DEMAND DEPOSITS  2586.25      198209
2       DEMAND DEPOSITS  3557.73      198304
3       DEMAND DEPOSITS 14923.72      198308
4       DEMAND DEPOSITS  4431.67      198401
5       DEMAND DEPOSITS 44555.23      198410
6 MONEY MARKET ACCOUNTS 65710.01      198209
library(ggplot2)
library(scales)
ggplot(df, aes(Origination, Balance, group=Product, col=Product)) +
  geom_line(lwd=1.2) + scale_y_continuous(labels = comma)
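Answer 3 (score: 0)
I'm not sure exactly what you want; is this what you are looking for?
Assuming you put your data in data.txt, with the pipes removed and the spaces in the names replaced by '_':
d = read.table("data.txt", header=TRUE)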
prod.col = c("red", "blue", "green", "black" )  # one colour per product
prod = unique(d$Product)
par(mai = c(0.8, 1.8, 0.8, 0.8))                # wide left margin for the product labels
# Empty plot: Origination on the x axis, one row per record on the y axis
plot(1, yaxt = 'n', type = "n", axes = TRUE, xlab = "Origination", ylab = "", xlim = c(min(d$Origination), max(d$Origination)), ylim=c(0, nrow(d)+5) )
axis(2, at=seq_len(nrow(d)), labels=d$Product, las = 2, cex.axis=0.5)
mtext(side=2, line=7, "Products")
# One horizontal segment per record, coloured by product
for( i in 1:nrow(d) ){
  myProd = d$Product[i]
  myCol = prod.col[which(prod == myProd)]
  myOrig = d$Origination[i]
  segments( x0 = 0, x1 = myOrig, y0 = i, y1 = i, col = myCol, lwd = 5 )
}
legend( "topright", col=prod.col, legend=prod, cex=0.3, lty=c(1,1), bg="white" )