Question

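I'm new to R and need help extracting some values from my data set. The data are dollar amounts per year for a list of cities. I'm trying to set up my values so that I can run a linear regression model on the data set, which is named estimate.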

estimate <- read.csv("estimate.csv", check.names = FALSE) #Import
estimate

location  2010  2011  2012  2013  2014
city1     200   250   300   500   600
city2     300   300   400   650   780
city3     500   600   700   800   900

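I'm only interested in the city3 data across the years.

I know I can use the code years <- c(2010,2011,2012,2013,2014) to create my years variable, but I know that only works for small tables.

For my linear model I want to first plot(years, values), where the years are columns 2:6 and the corresponding values come only from row 3. When I run values <- estimate[3, c(3,2:6)] I get the values, but when I try the same thing for the years with years <- estimate[0, c(0,2:6)] I get a data frame with 0 obs. of 5 variables. Trying to plot gives me: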

Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In min(x) : no non-missing arguments to min; returning Inf
2: In max(x) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf

Ideally, I would like the data set up like this:

years        values
2010         500
2011         600
2012         700
2013         800
2014         900

Then I can run the lm function. Thanks in advance. I'm really new to R and to Stack, so please forgive my newbie mistakes.

Answer 1

1) Extract. Assuming the data shown reproducibly in the Note at the end, we can perform the regression like this:

year <- as.numeric(names(estimate)[-1])
city3 <- unlist(estimate[3, -1])
lm(city3 ~ year)
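
As a minimal sketch of how the extracted vectors could be assembled into the two-column years/values layout asked for in the question, assuming the sample data from the Note at the end (the names `dat` and `fit` are illustrative and not part of the original answer):

```r
# Rebuild the sample data from the Note so the example is self-contained
Lines <- "Location,2010,2011,2012,2013,2014
city1,200,250,300,500,600
city2,300,300,400,650,780
city3,500,600,700,800,900"
estimate <- read.csv(text = Lines, check.names = FALSE)

# Years come from the column names, values from the city3 row
year <- as.numeric(names(estimate)[-1])
city3 <- unlist(estimate[estimate$Location == "city3", -1])

# Two-column data frame in the years/values layout (illustrative name: dat)
dat <- data.frame(years = year, values = city3, row.names = NULL)

# Scatter plot, then fit the linear model and overlay the fitted line
plot(values ~ years, data = dat)
fit <- lm(values ~ years, data = dat)
abline(fit)
```

Selecting the row by `Location == "city3"` instead of by position is a design choice here: it keeps working if the rows are reordered.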

2) Melt. Alternatively, we can convert estimate to long form, here 15x3, then fix up the names and make the year numeric, and then perform the regression:

library(reshape2)

long <- melt(estimate, id = "Location")
names(long) <- c("Location", "Year", "Estimate")
long$Year <- as.numeric(as.character(long$Year))

lm(Estimate ~ Year, long, subset = Location == "city3")

2a) Reshape. The conversion from wide to long form can also be done without any such packages:

yrs <- names(estimate)[-1]
long <- reshape(estimate, direction = "long", idvar = "Location",
  varying = list(yrs), times = as.numeric(yrs), timevar = "Year", v.names = "Estimate")

lm(Estimate ~ Year, long, subset = Location == "city3")
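
One way to sanity-check the reshaped data is sketched below, assuming the sample data from the Note: the long table should have 15 rows and 3 columns, and the city3 subset should give the same coefficients as a fit on the extracted row. The names `fit_long` and `fit_direct` are illustrative only.

```r
# Rebuild the sample data from the Note so the example is self-contained
Lines <- "Location,2010,2011,2012,2013,2014
city1,200,250,300,500,600
city2,300,300,400,650,780
city3,500,600,700,800,900"
estimate <- read.csv(text = Lines, check.names = FALSE)

# Wide-to-long conversion with base R only
yrs <- names(estimate)[-1]
long <- reshape(estimate, direction = "long", idvar = "Location",
  varying = list(yrs), times = as.numeric(yrs), timevar = "Year",
  v.names = "Estimate")

dim(long)  # 3 cities x 5 years = 15 rows, 3 columns

# Fit on the long data, restricted to city3
fit_long <- lm(Estimate ~ Year, long, subset = Location == "city3")

# Fit on the values taken directly from row 3 of the wide data
fit_direct <- lm(unlist(estimate[3, -1]) ~ as.numeric(yrs))

# The two fits should agree on intercept and slope
all.equal(unname(coef(fit_long)), unname(coef(fit_direct)))
```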

Note:

Lines <- "Location,2010,2011,2012,2013,2014
city1,200,250,300,500,600
city2,300,300,400,650,780
city3,500,600,700,800,900"
estimate <- read.csv(text = Lines, check.names = FALSE)

Answer 2

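When you read the csv file with read.csv, the first row becomes the column names of the data frame. Try

names = colnames(estimate)

You will see that names is a character vector:

c("location", "2010", "2011", ...)

You can turn it into the years by dropping the first item and converting the rest to numeric.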
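
A minimal sketch of dropping the first column name and converting the rest to numeric, assuming the sample data from the Note in Answer 1; the names `years`, `values`, and `fit` are illustrative rather than taken from the original answer:

```r
# Sample data (assumed, taken from the Note in Answer 1)
Lines <- "Location,2010,2011,2012,2013,2014
city1,200,250,300,500,600
city2,300,300,400,650,780
city3,500,600,700,800,900"
estimate <- read.csv(text = Lines, check.names = FALSE)

# Column names are character strings: the id column followed by the years
names <- colnames(estimate)

# Drop the first item and convert the remaining names to numbers
years <- as.numeric(names[-1])

# Values for city3 (row 3), as a plain numeric vector
values <- unname(unlist(estimate[3, -1]))

# Both vectors now have length 5, so the regression can be fitted
fit <- lm(values ~ years)
coef(fit)
```

This scales to any number of year columns, since nothing is typed in by hand.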

R linear regression: trouble setting up values
