I have the following example data frame:
jan feb mar apr may jun jul aug sep oct nov dec someValue
0 0 0 0 0 0 0 1 0 0 0 0 109.24673
0 0 0 0 0 1 1 1 1 0 0 0 108.24444
0 0 0 0 0 0 1 1 1 1 0 0 247.25433
0 0 0 0 0 0 1 1 1 1 0 0 192.22873
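I now want to create a scatter plot with one position per month on the x-axis. The "someValue" column should be the y-axis. For every "1" in a month column, a point should be drawn at the corresponding position in the scatter plot. Every "0" should be ignored and not be visible in the plot.
How can I do this in R? Thanks!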
Answer 0 (score: 0)
The trick is to get the data into the right shape, i.e. to convert it to long format with gather from tidyr. Try this:
library(dplyr)
library(tidyr)
library(ggplot2)
df <- tribble(
~jan, ~feb, ~mar, ~apr, ~may, ~jun, ~jul, ~aug, ~sep, ~oct, ~nov, ~dec, ~someValue,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 109.24673,
0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 108.24444,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 247.25433,
0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 192.22873
)
months <- names(df)[grepl("^\\w{3}$", names(df))]
df_gather <- df %>%
  gather(month, value, -someValue) %>%
  mutate(
    # Convert to factor and set order of months
    month = factor(month, levels = months),
    # Set someValue to missing wherever the month flag is 0
    someValue = ifelse(value == 0, NA, someValue))
ggplot(df_gather, aes(month, someValue)) +
  geom_point()
#> Warning: Removed 35 rows containing missing values (geom_point).
Created on 2020-03-14 by the reprex package (v0.3.0)
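As a variant of the same idea, here is a minimal sketch using pivot_longer(), which supersedes gather() in tidyr 1.0.0 and later. It assumes the same df as above and drops the 0 rows with filter() instead of setting someValue to NA, so ggplot2 has no missing values to warn about. The names month_cols and df_long are only illustrative.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

df <- tribble(
  ~jan, ~feb, ~mar, ~apr, ~may, ~jun, ~jul, ~aug, ~sep, ~oct, ~nov, ~dec, ~someValue,
  0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 109.24673,
  0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 108.24444,
  0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 247.25433,
  0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 192.22873
)

# Every column except someValue is a month flag
month_cols <- setdiff(names(df), "someValue")

df_long <- df %>%
  # One row per (original row, month) pair
  pivot_longer(all_of(month_cols), names_to = "month", values_to = "value") %>%
  # Keep only the months flagged with 1
  filter(value == 1) %>%
  # Fix the month order so the x-axis runs jan..dec
  mutate(month = factor(month, levels = month_cols))

ggplot(df_long, aes(month, someValue)) +
  geom_point() +
  # Keep months without any points on the axis
  scale_x_discrete(drop = FALSE)
```

Without scale_x_discrete(drop = FALSE), ggplot2 would only show the months that actually have points (jun to oct here).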
Answer 1 (score: 0)
Assuming your data is called df:
df = structure(list(jan = c(0L, 0L, 0L, 0L), feb = c(0L, 0L, 0L, 0L
), mar = c(0L, 0L, 0L, 0L), apr = c(0L, 0L, 0L, 0L), may = c(0L,
0L, 0L, 0L), jun = c(0L, 1L, 0L, 0L), jul = c(0L, 1L, 1L, 1L),
aug = c(1L, 1L, 1L, 1L), sep = c(0L, 1L, 1L, 1L), oct = c(0L,
0L, 1L, 1L), nov = c(0L, 0L, 0L, 0L), dec = c(0L, 0L, 0L,
0L), someValue = c(109.24673, 108.24444, 247.25433, 192.22873
)), class = "data.frame", row.names = c(NA, -4L))
You can do this in base R (the good old way). First, find the row and column positions of every 1:
ind = which(df[,1:12]==1,arr.ind=TRUE)
row col
[1,] 2 6
[2,] 2 7
[3,] 3 7
[4,] 4 7
[5,] 1 8
[6,] 2 8
[7,] 3 8
[8,] 4 8
[9,] 2 9
[10,] 3 9
[11,] 4 9
[12,] 3 10
[13,] 4 10
So what you want to plot is the column number on x against the someValue of the corresponding row on y. Put these into a data.frame:
plotdf = data.frame(x=ind[,"col"],
y=df$someValue[ind[,"row"]])
x y
1 6 108.2444
2 7 108.2444
3 7 247.2543
4 7 192.2287
5 8 109.2467
6 8 108.2444
7 8 247.2543
8 8 192.2287
9 9 108.2444
10 9 247.2543
11 9 192.2287
12 10 247.2543
13 10 192.2287
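The x column above holds column positions rather than month names. If labels are more convenient, they can be looked up from the column names. The following is a small sketch under the assumption that the first 12 columns of df are the months in calendar order; the month column is an illustrative addition, and df is rebuilt here in a compact way only to keep the snippet self-contained.

```r
# Rebuild the example data compactly (same values as df above)
m <- matrix(0L, nrow = 4, ncol = 12,
            dimnames = list(NULL, tolower(month.abb)))
m[1, "aug"] <- 1L
m[2, c("jun", "jul", "aug", "sep")] <- 1L
m[3:4, c("jul", "aug", "sep", "oct")] <- 1L
df <- data.frame(m, someValue = c(109.24673, 108.24444, 247.25433, 192.22873))

# Row/column positions of every 1, as in the answer
ind <- which(df[, 1:12] == 1, arr.ind = TRUE)
plotdf <- data.frame(x = ind[, "col"],
                     y = df$someValue[ind[, "row"]])

# Map column positions to month names, keeping calendar order
month_names <- colnames(df)[1:12]
plotdf$month <- factor(month_names[plotdf$x], levels = month_names)

# Number of points per month
table(plotdf$month)
```

The factor keeps all 12 levels, so months without any points still appear (with a count of 0) in the table.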
You can then plot this with just base R:
plot(plotdf,xlim=c(1,12),xaxt="n",ylab="somevalue")
months = colnames(df)[1:12]
axis(1,at=1:12,labels=months)
Or, if you like fancier graphics, you can use plotly, together with the plotdf and months we defined earlier:
library(plotly)
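# mode="markers" draws points only and avoids plotly's "No scatter mode specified" message
plot_ly(x=plotdf$x,y=plotdf$y,type="scatter",mode="markers") %>%
  layout(xaxis=list(range=c(0,13),tickvals=1:12,
                    dtick=1,ticktext = months))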