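My df is a database of individuals (rows) and the amount each of them spent on an activity (column). I want to draw a scatterplot in R with the following characteristics:
x-axis: log(amount spent); y-axis: log(number of people who spent that amount)
This is how far I have got: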
plot(log(df$Amount), log(df$???))
How can I do this? Thanks!
My df looks like this:
df
Name Surname Amount
John Smith 223
Mary Osborne 127
Mark Bloke 45
This is what I have in mind (taken from the paper by Chen (2012)).
Answer 0 (score: 1)
Try this:
library(dplyr)
library(scales) # To let you make plotted points transparent
# Make some toy data that matches your df's structure
set.seed(1)
df <- data.frame(Name = rep(letters, 4), Surname = rep(LETTERS, 4), Amount = rnorm(4 * length(LETTERS), 200, 50))
# Use dplyr to count the amounts falling in each 5-dollar bin, then attach that
# count to every row of the data so it can be used as the y value in the plot to come.
dfsum <- df %>%
  mutate(Bins = cut(Amount, breaks = seq(round(min(Amount), -1) - 5, round(max(Amount) + 5, -1), by = 5))) %>% # Per AkhilNair's comment
  group_by(Bins) %>%
  mutate(n = n()) %>% # n = number of rows sharing this bin
  ungroup()
# Make the plot with the new df with the x-axis on a log scale
with(dfsum, plot(x = log(Amount), y = n, ylab="Number around this amount", pch=20, col = alpha("black", 0.5)))
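The plot above puts the raw count on the y-axis. If you want both axes on a log scale, as in the question, here is a minimal base-R sketch. It assumes the amounts are whole numbers, so that several people can share exactly the same amount; the toy data and the `counts` name are made up for illustration. With continuous amounts you would bin first, as above.

```r
# Toy data with the same columns as the question's df (values are made up)
set.seed(1)
df <- data.frame(Name = rep(letters, 4),
                 Surname = rep(LETTERS, 4),
                 Amount = round(rlnorm(4 * length(LETTERS), meanlog = 5, sdlog = 0.5)))

# Keep positive amounts only, since log(0) is -Inf
df <- df[df$Amount > 0, ]

# Count how many people spent each distinct amount
counts <- as.data.frame(table(Amount = df$Amount), stringsAsFactors = FALSE)
counts$Amount <- as.numeric(counts$Amount)

# Scatterplot with log(amount) on x and log(number of people) on y
plot(log(counts$Amount), log(counts$Freq),
     xlab = "log(Amount spent)",
     ylab = "log(Number of people who spent this amount)",
     pch = 20)
```

Amounts that occur only once land at y = 0, because log(1) = 0. With small data most points will sit there, so the pattern only shows up once many people share amounts.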