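I'm trying to use two columns to look up a value in a table and then write the result to a third column. This is the function I wrote for the lookup: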
getami <- function(bedroom, year){
ami <- hhsplus[bedroom + 1, year - 1997]
return(ami)
}
This is how I call the function:
df$ami <- getami(df$beds, df$year)
beds and year are just two vectors of integers.
Here is an excerpt of what hhsplus looks like:
       1998    1999    2000    2001    2002
-------------------------------------------
1     54050   57800   60900   61100   67200
2     61750   66100   69600   69850   76800
3     69500   74350   78300   78550   86400
4     77200   82600   87000   87300   96000
5     83400   89200   93950   94300  103700
6     89550   95800  100900  101250  111350
7     95750  102400  107900  108250  119050
8    101900  109050  114850  115250  126700
When I store the result in df$ami, it shows up in descending order. I'd like to know how to fill ami based on the two columns.
Edit: This is what df$beds and df$year (actually df$dc) looks like
Edit 2: Here is an excerpt of df in CSV format:
"","Date Listed","Price Listed","Date Closed","Price Closed","Days on Market","Age","Price/SF","SF","Beds","Baths","dc","ami"
"1",2013-05-30,1538000,2013-08-08,1480000,18,0,332,4460,7,6,2013,NA
"2",2014-05-15,2799000,2014-10-08,2300000,124,3,265,8691,7,8,2014,NA
"3",2014-03-14,1199888,2014-09-19,1200000,145,9,215,5586,7,6,2014,NA
"4",2016-03-28,3195000,2016-10-07,2800000,112,14,427,6562,7,6,2016,NA
"5",2010-05-25,2350000,2011-04-01,1925000,245,33,241,8000,6,12,2011,NA
"6",2013-11-15,2295000,2014-12-19,2183000,285,8,299,7300,6,8,2014,NA
"7",2015-05-05,1550000,2015-08-04,1550000,57,11,310,4993,6,6,2015,NA
"8",2014-02-21,2595000,2014-04-23,2520000,37,11,329,7651,6,7,2014,NA
"9",2013-08-12,3750000,2015-07-15,2640000,548,12,376,7030,6,5,2015,NA
"10",2009-09-16,2750000,2009-12-10,2525000,527,9,334,7550,6,6,2009,NA
"11",2013-05-27,1299000,2014-02-07,1350000,201,21,320,4217,6,5,2014,NA
"12",2015-02-07,2299000,2015-06-23,2240000,10,28,288,7783,6,8,2015,NA
"13",2014-05-16,1760000,2015-06-02,1700000,311,28,256,6650,6,5,2015,NA
"14",2012-02-24,749950,2012-04-27,740000,29,32,183,4045,6,3,2012,NA
"15",2013-01-25,1650000,2013-03-25,1600000,11,28,511,3133,6,6,2013,NA
"16",2014-02-16,1198000,2014-04-16,1150000,11,36,388,2964,6,5,2014,NA
"17",2014-04-04,1349950,2014-08-11,1340000,59,36,273,4904,6,4,2014,NA
"18",2017-06-04,1425000,2017-06-05,1425000,1,40,450,3166,6,4,2017,NA
"19",2009-05-08,1850000,2009-12-01,1500000,188,32,250,6000,6,4,2009,NA
"20",2014-03-14,1650000,2015-03-17,1480000,335,37,318,4660,6,4,2015,NA
"21",2013-06-12,2348000,2013-10-24,2025000,300,11,397,5100,6,5,2013,NA
"22",2016-01-25,1249000,2016-02-29,1125000,14,44,403,2792,6,4,2016,NA
"23",2011-08-22,580000,2011-11-08,575000,241,40,158,3636,6,5,2011,NA
"24",2011-07-25,599000,2011-09-14,570000,4,52,221,2576,6,4,2011,NA
"25",2010-06-26,1349000,2010-09-30,1300000,56,72,260,5000,6,4,2010,NA
"26",2016-09-09,1399000,2016-11-16,1410000,4,12,357,3948,6,5,2016,NA
Edit 3: dput(head(df,10))
structure(list(`Date Listed` = structure(c(1369872000, 1400112000,
1394755200, 1459123200, 1274745600, 1384473600, 1430784000, 1392940800,
1376265600, 1253059200), class = c("POSIXct", "POSIXt"), tzone = "UTC"),
`Price Listed` = c(1538000, 2799000, 1199888, 3195000, 2350000,
2295000, 1550000, 2595000, 3750000, 2750000), `Date Closed` = structure(c(1375920000,
1412726400, 1411084800, 1475798400, 1301616000, 1418947200,
1438646400, 1398211200, 1436918400, 1260403200), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), `Price Closed` = c(1480000, 2300000,
1200000, 2800000, 1925000, 2183000, 1550000, 2520000, 2640000,
2525000), `Days on Market` = c(18, 124, 145, 112, 245, 285,
57, 37, 548, 527), Age = c(0, 3, 9, 14, 33, 8, 11, 11, 12,
9), `Price/SF` = c(332, 265, 215, 427, 241, 299, 310, 329,
376, 334), SF = c(4460, 8691, 5586, 6562, 8000, 7300, 4993,
7651, 7030, 7550), Beds = c(7, 7, 7, 7, 6, 6, 6, 6, 6, 6),
Baths = c(6, 8, 6, 6, 12, 8, 6, 7, 5, 6), dc = c(2013, 2014,
2014, 2016, 2011, 2014, 2015, 2014, 2015, 2009), ami = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("Date Listed",
"Price Listed", "Date Closed", "Price Closed", "Days on Market",
"Age", "Price/SF", "SF", "Beds", "Baths", "dc", "ami"), row.names = c(NA,
-10L), class = c("tbl_df", "tbl", "data.frame"))
Answer 0 (score: 0)
If you just want to convert the data to a flat (long) table, you can use gather from the tidyr package:
library(tidyr)
df = read.table(text=" bedroom 1998 1999 2000 2001 2002
1 54050 57800 60900 61100 67200
2 61750 66100 69600 69850 76800
3 69500 74350 78300 78550 86400
4 77200 82600 87000 87300 96000
5 83400 89200 93950 94300 103700
6 89550 95800 100900 101250 111350
7 95750 102400 107900 108250 119050
8 101900 109050 114850 115250 126700", header = TRUE)
answer = gather(data = df, key = "year", value = "hhsplus", X1998:X2002)
Note that, because of the way I created the dataset from your sample data, all the year columns now have an "X" in front. Here is how to fix that:
answer$year = as.numeric(gsub("X", "", answer$year))
Result:
bedroom year hhsplus
1 1998 54050
2 1998 61750
3 1998 69500
4 1998 77200
5 1998 83400
6 1998 89550
7 1998 95750
8 1998 101900
1 1999 57800
...
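As a side note on the "X" prefix: it comes from read.table's check.names argument, which defaults to TRUE and rewrites column names into syntactically valid R names. A minimal sketch, using a cut-down copy of the sample data (the names txt, with_x and as_is are only for illustration):

```r
# Same kind of text input as above, cut down to three year columns.
txt <- " bedroom 1998 1999 2000
1 54050 57800 60900
2 61750 66100 69600"

# Default (check.names = TRUE): "1998" is not a valid R name, so it becomes "X1998".
with_x <- read.table(text = txt, header = TRUE)

# check.names = FALSE keeps the header exactly as written.
as_is <- read.table(text = txt, header = TRUE, check.names = FALSE)

names(with_x)
names(as_is)
```

With check.names = FALSE the gsub step is not needed, but the year columns then have to be quoted with backticks when referred to by name, e.g. as_is$`1998`.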
Answer 1 (score: 0)
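I would solve this by merging the two data frames. You can do that by first converting hhsplus to long format. See the code below.
However, it isn't clear to me how you want to merge the two data frames. In your function you have hhsplus[bedroom + 1, year - 1997]; why do you add 1 to bedroom and subtract 1997 from year?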
require("tidyr")
# From lebelinoz's answer, read in hhsplus:
hhsplus = read.table(text=" bedroom 1998 1999 2000 2001 2002
1 54050 57800 60900 61100 67200
2 61750 66100 69600 69850 76800
3 69500 74350 78300 78550 86400
4 77200 82600 87000 87300 96000
5 83400 89200 93950 94300 103700
6 89550 95800 100900 101250 111350
7 95750 102400 107900 108250 119050
8 101900 109050 114850 115250 126700", header = TRUE)
# convert hhsplus to long format (one row per bedroom/year pair):
hhsplus_long = gather(data = hhsplus, year, hhsplus_ami, -1)
# drop the "X" that read.table prepends and make year numeric, like df$dc
hhsplus_long$year = as.numeric(gsub("X", "", hhsplus_long$year))
# shift bedroom to match the `bedroom + 1` row offset used in the question
hhsplus_long$bedroom = hhsplus_long$bedroom - 1
# merge two data frames, keeping all records from df (all.x=TRUE)
merge(df, hhsplus_long, by.x = c("Beds", "dc"), by.y=c("bedroom", "year"), all.x=TRUE)
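For comparison, the lookup in the question can probably also be done without a merge. Indexing with two vectors, as in hhsplus[bedroom + 1, year - 1997], returns every row/column combination rather than one value per pair, which would explain the unexpected output. A minimal sketch, assuming the table can be treated as a matrix with rows in bedroom order and columns in year order starting at 1998; hhs, beds and years below are toy stand-ins, and the + 1 and - 1997 offsets are simply copied from the question:

```r
# Toy stand-in for hhsplus (values copied from the excerpt in the question).
hhs <- matrix(c(54050, 57800, 60900,
                61750, 66100, 69600,
                69500, 74350, 78300),
              nrow = 3, byrow = TRUE,
              dimnames = list(1:3, 1998:2000))

beds  <- c(0, 2, 1)           # hypothetical bedroom counts
years <- c(1998, 2000, 1999)  # hypothetical closing years

# Two index vectors select all combinations: a 3 x 3 block, not 3 values.
block <- hhs[beds + 1, years - 1997]

# A two-column index matrix selects exactly one cell per (row, column) pair.
ami <- hhs[cbind(beds + 1, years - 1997)]
```

If hhsplus is a data frame or tibble, wrapping it in as.matrix() first should make this form of indexing behave as shown. Any year outside the columns of the table gives a "subscript out of bounds" error, whereas the merge above returns NA for those rows.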