Scatter plot of selected columns of a data table

Time: 2019-07-02 11:21:08

Tags: r datatable scatter-plot

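In a data table with 20 rows and 6 columns, I want to draw a scatter plot of column 2 against column 4. Column 1 is the row ID, from 0 to 19. According to the table's structure, columns 1 and 2 are factors with 20 levels, and column 4 is numeric.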

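I tried converting the individual columns with as.factor into separate objects, then merging them and plotting the result with ggplot. This did not work for me.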

BasinSize <- as.factor(Table_Barrow20$`Lake Size`) #column2 of table
Basinheight <- as.factor(Table_Barrow20$`Lake Mean`) #column 4 of table
scatterdata <- merge(Basinheight, BasinSize)

plot(scatterdata)
ggplot(scatterdata, aes(x=Basinheight, y=BasinSize), col=c("33FF00")) + 
  geom_point(shape=18) 

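The problem is that the two columns are merged the wrong way: every one of the 20 values is combined with every one of the other 20 values, instead of the values being matched by ID.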
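To illustrate why the merge goes wrong, here is a minimal sketch with three made-up values per column (the vectors `size` and `height` are hypothetical, not the asker's data). When `merge()` is given two objects that share no column name, it returns their Cartesian product, so n values on each side give n × n rows. Building the data frame directly keeps the values paired row by row.

```r
# Illustrative values only (hypothetical, not the real table)
size   <- as.factor(c("2419723", "737345", "1904419"))
height <- c(6.85, 1.17, 6.30)

# No shared column names, so merge() pairs every row with every row
crossed <- merge(height, size)
nrow(crossed)   # 3 * 3 rows

# data.frame() keeps the values aligned row by row
paired <- data.frame(height = height, size = size)
nrow(paired)    # 3 rows
```

With the 20-row table from the question, the same call would produce 400 rows, which matches the behaviour described above.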

Here is the table, copied from the .txt file:

"Name" "Lake Size" "Max" "Mean" "Median" "Min"
"1" "0" "2419723" 9.37238597869873 6.85431201700351 6.79038763046265 5.5892276763916
"2" "1" "737345" 2.20990252494812 1.17229168051113 1.16918420791626 0.532729208469391
"3" "2" "1904419" 6.97486448287964 6.29653060932372 6.29239559173584 5.74258995056152
"4" "3" "633220" 2.94963598251343 0.693283292837505 0.566755801439285 -1.04891955852509
"5" "4" "3417157" 2.02893280982971 1.04370415649172 1.16990214586258 -0.615132451057434
"6" "5" "3046643" 2.39258670806885 0.612889545533382 0.621234953403473 -2.27862739562988
"7" "6" "3868608" 16.8747043609619 15.986930145805 15.9581031799316 14.7309837341309
"8" "7" "11952064" 4.12359857559204 3.50676135545307 3.50302672386169 2.70154309272766
"9" "8" "2431961" 6.02156400680542 4.79594737052494 4.82516670227051 3.39997673034668
"10" "9" "5624563" 7.80270195007324 6.76836155530465 6.72958827018738 5.68962478637695
"11" "10" "2430490" 4.87959337234497 3.43340588038286 3.3837513923645 2.91182518005371
"12" "12" "1436097" 3.67803716659546 2.49129957226396 2.47576546669006 1.17649579048157
"13" "13" "791941" 5.25690269470215 4.07207433426663 4.07166481018066 3.61373019218445
"14" "14" "3013737" 1.69542956352234 0.756966933677959 0.755697637796402 -2.0527184009552
"15" "15" "2594511" 5.87903642654419 2.43693244171563 2.44506788253784 0.725884079933167
"16" "16" "3105136" 12.6303310394287 9.71669491262446 9.67505931854248 8.92571830749512
"17" "17" "1985544" 9.32382488250732 8.25899538392204 8.30398368835449 6.08988952636719
"18" "18" "1800122" 12.424147605896 8.48729049871582 8.50036954879761 7.7384238243103
"19" "19" "2753803" 16.724292755127 15.7803085039918 15.7673816680908 14.8390283584595
"20" "11" "765907" 3.45813465118408 2.61115002320832 2.59490370750427 2.17101335525513

2 Answers:

Answer 0: (score: 0)

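Since both fields come from the same data table, you probably don't need to extract and merge them at all; you can refer to them directly in the ggplot call, i.e.: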

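library(ggplot2)

# Random example data standing in for the real table
Table_Barrow20 <- data.frame(LakeSize = rnorm(50, 2),
                             LakeMean = rnorm(50, 3))

# A constant colour goes in the geom, and a hex colour needs a leading "#"
ggplot(Table_Barrow20, aes(x = LakeMean, y = LakeSize)) +
  geom_point(shape = 18, colour = "#33FF00")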

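You may also want to coerce the values with as.numeric(), since a scatter plot is not the best way to display factor-type data, and the data you show looks continuous.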
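One caveat when converting: calling `as.numeric()` directly on a factor returns the internal level codes rather than the numbers that are printed. The short sketch below uses a hypothetical factor `f`, built from three Lake Size values in the question's table, to show the difference and the usual workaround of going through `as.character()` first.

```r
# Hypothetical factor holding three Lake Size values from the question
f <- factor(c("2419723", "737345", "1904419"))

# Level codes: positions in the sorted list of level labels
codes <- as.numeric(f)

# The numbers the factor actually displays
values <- as.numeric(as.character(f))

print(codes)
print(values)
```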

Answer 1: (score: 0)

I don't think you should use factors, because geom_point needs numeric values for x and y.

This should work:

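# as.character() first: as.numeric() on a factor alone returns the level codes
ggplot(Table_Barrow20, aes(x = as.numeric(as.character(`Lake Size`)), y = as.numeric(`Lake Mean`))) +
  geom_point(shape = 18, colour = "#33FF00")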
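As a variation, the conversion can be done once in the data frame rather than inside `aes()`. The sketch below uses a hypothetical three-row stand-in for `Table_Barrow20`, with values taken from the first rows of the table in the question. The column names are assumed to match the asker's code, and `Lake Size` is assumed to be stored as a factor.

```r
library(ggplot2)

# Hypothetical stand-in for the real table (first three rows only)
Table_Barrow20 <- data.frame(
  `Lake Size` = factor(c("2419723", "737345", "1904419")),
  `Lake Mean` = c(6.85431201700351, 1.17229168051113, 6.29653060932372),
  check.names = FALSE  # keep the space in the column names
)

# Convert the factor to the numbers it displays, not its level codes
Table_Barrow20$`Lake Size` <-
  as.numeric(as.character(Table_Barrow20$`Lake Size`))

# Both columns are now numeric, so geom_point can use them directly
p <- ggplot(Table_Barrow20, aes(x = `Lake Size`, y = `Lake Mean`)) +
  geom_point(shape = 18, colour = "#33FF00")
print(p)
```

Converting in the data frame keeps the plotting call short and means later plots of the same table do not need to repeat the coercion.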

Update: if you have several sets of points to plot, you can reshape the data frame into tidy format:

library(tidyverse)
Table_Barrow20  <- data.frame(Buffer.Mean=c(1,2,4),Buffer.Size=c(10,30,20),Lake.Mean=c(3,1,4),Lake.Size=c(15,25,12))
# select columns about buffer and add type variable
df1 <- Table_Barrow20 %>% 
          select(Buffer.Mean,Buffer.Size) %>% 
          rename(Mean=Buffer.Mean,Size=Buffer.Size) %>% 
          mutate(type="Buffer") %>% 
          mutate(Size=as.numeric(Size), Mean=as.numeric(Mean))
# select columns about lake and add type variable
df2 <- Table_Barrow20 %>% 
          select(Lake.Mean,Lake.Size) %>% 
          rename(Mean=Lake.Mean,Size=Lake.Size) %>% 
          mutate(type="Lake") %>% 
          mutate(Size=as.numeric(Size), Mean=as.numeric(Mean))
# bind the two dataframe to make one with all lines
df_tot <- rbind(df1,df2) 
# add "type" column as color so the colors will be different for lake and buffer
ggplot(df_tot, aes(x=Size, y=Mean, col=type)) + 
  geom_point(shape=18) 
