This is my data.frame, beef:
> head(beef)
YEAR....PBE CBE PPO CPO PFO DINC CFO RDINC RFP
1 1925 59.7 58.6 60.5 65.8 65.8 51.4 90.9 68.5 877
2 1926 59.7 59.4 63.3 63.3 68.0 52.6 92.1 69.6 899
3 1927 63 53.7 59.9 66.8 65.5 52.1 90.9 70.2 883
4 1928 71 48.1 56.3 69.9 64.8 52.7 90.9 71.9 884
5 1929 71 49.0 55.0 68.7 65.6 55.1 91.1 75.2 895
6 1930 74.2 48.2 59.6 66.1 62.4 48.8 90.7 68.3 874
and
dput(head(beef))
structure(list(YEAR....PBE = structure(1:6, .Label = c("1925 59.7",
"1926 59.7", "1927 63", "1928 71", "1929 71", "1930 74.2",
"1931 72.1", "1932 79", "1933 73.1", "1934 70.2",
"1935 82.2", "1936 68.4", "1937 73", "1938 70.2",
"1939 67.8", "1940 63.4", "1941 56"), class = "factor"),
CBE = c(58.6, 59.4, 53.7, 48.1, 49, 48.2), PPO = c(60.5,
63.3, 59.9, 56.3, 55, 59.6), CPO = c(65.8, 63.3, 66.8, 69.9,
68.7, 66.1), PFO = c(65.8, 68, 65.5, 64.8, 65.6, 62.4), DINC = c(51.4,
52.6, 52.1, 52.7, 55.1, 48.8), CFO = c(90.9, 92.1, 90.9,
90.9, 91.1, 90.7), RDINC = c(68.5, 69.6, 70.2, 71.9, 75.2,
68.3), RFP = c(877L, 899L, 883L, 884L, 895L, 874L)), .Names = c("YEAR....PBE",
"CBE", "PPO", "CPO", "PFO", "DINC", "CFO", "RDINC", "RFP"), row.names = c(NA,
6L), class = "data.frame")
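I want to build a multiple linear regression model for PBE based on the other variables. Following the tutorial at this link, I think I should run the following code: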
> lm(formula = PBE ~ CBE + PBO + CPO + PFO +
+ DINC + CFO+RDINC+RFP+YEAR, data = beef)
Error in eval(expr, envir, enclos) : object 'PBE' not found
So I tried the following, but each attempt also fails with an error:
> lm(formula=PBE~YEAR,data=beef)
Error in eval(expr, envir, enclos) : object 'PBE' not found
> lm(formula=beef$PBE~beef$YEAR)
Error in model.frame.default(formula = beef$PBE ~ beef$YEAR, drop.unused.levels = TRUE) :
invalid type (NULL) for variable 'beef$PBE'
Could you give me some insight into where the typo or mistake is?
P.S.: I read the file with beef=read.table("beef.txt", header = TRUE, sep = "\t", comment.char="%"), and the file looks like this:
% http://lib.stat.cmu.edu/DASL/Datafiles/agecondat.html
%
% Datafile Name: Agricultural Economics Studies
% Datafile Subjects: Agriculture , Economics , Consumer
% Story Names: Agricultural Economics Studies
% Reference: F.B. Waugh, Graphic Analysis in Agricultural Economics,
% Agricultural Handbook No. 128, U.S. Department of Agriculture, 1957.
% Authorization: free use
% Description: Price and consumption per capita of beef and pork
% annually from 1925 to 1941 together with other variables relevant to
% an economic analysis of price and/or consumption of beef and pork
% over the period.
% Number of cases: 17
% Variable Names:
%
% PBE = Price of beef (cents/lb)
% CBE = Consumption of beef per capita (lbs)
% PPO = Price of pork (cents/lb)
% CPO = Consumption of pork per capita (lbs)
% PFO = Retail food price index (1947-1949 = 100)
% DINC = Disposable income per capita index (1947-1949 = 100)
% CFO = Food consumption per capita index (1947-1949 = 100)
% RDINC = Index of real disposable income per capita (1947-1949 = 100)
% RFP = Retail food price index adjusted by the CPI (1947-1949 = 100)
%
% The Data:
YEAR PBE CBE PPO CPO PFO DINC CFO RDINC RFP
1925 59.7 58.6 60.5 65.8 65.8 51.4 90.9 68.5 877
1926 59.7 59.4 63.3 63.3 68 52.6 92.1 69.6 899
1927 63 53.7 59.9 66.8 65.5 52.1 90.9 70.2 883
1928 71 48.1 56.3 69.9 64.8 52.7 90.9 71.9 884
1929 71 49 55 68.7 65.6 55.1 91.1 75.2 895
1930 74.2 48.2 59.6 66.1 62.4 48.8 90.7 68.3 874
1931 72.1 47.9 57 67.4 51.4 41.5 90 64 791
Here is the result of View(beef), as Patrick suggested:
Answer 0 (score: 5)
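You need to go back and look at the file you loaded these data from. The output of head() shows that the first variable is named YEAR....PBE and that the PBE data have been merged with the YEAR variable, probably because of a problem with the separator used in the file you read in. Go back and check the file carefully.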
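One way to do this from within R is count.fields(), which you can run on the file by passing its name. Read ?count.fields, as you may need to set the sep and quote arguments to match the file you are reading the data from. The function tells you how many fields (variables) it found on each line; compare that with the known number of variables.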
From your edit it is clear that what I described above has happened:
> names(beef)
[1] "YEAR....PBE" "CBE" "PPO" "CPO" "PFO"
[6] "DINC" "CFO" "RDINC" "RFP"
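The file does not appear to be entirely tab-delimited. With sep = "", read.table() splits on any run of whitespace (spaces or tabs), and with that setting I was able to read the data you included: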
beef <- read.table("file.name", header = TRUE, sep = "", comment.char = "%")
> head(beef)
YEAR PBE CBE PPO CPO PFO DINC CFO RDINC RFP
1 1925 59.7 58.6 60.5 65.8 65.8 51.4 90.9 68.5 877
2 1926 59.7 59.4 63.3 63.3 68.0 52.6 92.1 69.6 899
3 1927 63.0 53.7 59.9 66.8 65.5 52.1 90.9 70.2 883
4 1928 71.0 48.1 56.3 69.9 64.8 52.7 90.9 71.9 884
5 1929 71.0 49.0 55.0 68.7 65.6 55.1 91.1 75.2 895
6 1930 74.2 48.2 59.6 66.1 62.4 48.8 90.7 68.3 874
> str(beef)
'data.frame': 7 obs. of 10 variables:
$ YEAR : int 1925 1926 1927 1928 1929 1930 1931
$ PBE : num 59.7 59.7 63 71 71 74.2 72.1
$ CBE : num 58.6 59.4 53.7 48.1 49 48.2 47.9
$ PPO : num 60.5 63.3 59.9 56.3 55 59.6 57
$ CPO : num 65.8 63.3 66.8 69.9 68.7 66.1 67.4
$ PFO : num 65.8 68 65.5 64.8 65.6 62.4 51.4
$ DINC : num 51.4 52.6 52.1 52.7 55.1 48.8 41.5
$ CFO : num 90.9 92.1 90.9 90.9 91.1 90.7 90
$ RDINC: num 68.5 69.6 70.2 71.9 75.2 68.3 64
$ RFP : int 877 899 883 884 895 874 791
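Once PBE is a column of its own, the formula interface of lm() can find it. The following is a minimal sketch under a few assumptions: the seven rows shown in the question are pasted inline through the text argument instead of being read from beef.txt, they are separated by single spaces here, and only two predictors are used because seven rows cannot support the full model.

```r
# The seven data rows shown in the question, pasted inline for illustration.
txt <- "YEAR PBE CBE PPO CPO PFO DINC CFO RDINC RFP
1925 59.7 58.6 60.5 65.8 65.8 51.4 90.9 68.5 877
1926 59.7 59.4 63.3 63.3 68 52.6 92.1 69.6 899
1927 63 53.7 59.9 66.8 65.5 52.1 90.9 70.2 883
1928 71 48.1 56.3 69.9 64.8 52.7 90.9 71.9 884
1929 71 49 55 68.7 65.6 55.1 91.1 75.2 895
1930 74.2 48.2 59.6 66.1 62.4 48.8 90.7 68.3 874
1931 72.1 47.9 57 67.4 51.4 41.5 90 64 791"

# sep = "" splits on any whitespace, so YEAR and PBE become separate columns.
beef <- read.table(text = txt, header = TRUE, sep = "", comment.char = "%")

# Reduced model: two predictors only, chosen here purely for illustration.
fit <- lm(PBE ~ CBE + PPO, data = beef)
print(coef(fit))
```

With all 17 cases from the real file, the full formula from the question should be estimable in principle. Note that the question's formula refers to PBO, which is not a column in the data; presumably PPO was intended.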