I am working with a data frame similar to this one; it runs to 635 rows.
The other dataset I want to compare it with can be found here.
Here is what it looks like:
shape id   day       hour   week id   footfall   category           area           name
22496      22/3/14   3      12        634        Work cluster       CBD area       1
22670      22/3/14   3      12        220        Shopping cluster   Orchard Road   1
23287      22/3/14   3      12        723        Airport            Changi Airport 2
16430      22/3/14   4      12        947        Work cluster       CBD area       2
4697       22/3/14   3      12        220        Residential area   Ang Mo Kio     2
4911       22/3/14   3      12        1001       Shopping cluster   Orchard Rd     3
11126      22/3/14   3      12        220        Residential area   Ang Mo Kio     2
And this is the third dataset I want to compare it with:
category           Foreigners   Locals
Work cluster       1600000      3623900
Shopping cluster   1800000      3646666.667
Airport            15095152     8902705
Residential area   527700       280000
The first and second datasets share the same attribute, previousHour, and the first and third datasets share the attribute category.
previousHour is derived from hour, separately for each category. For example, for the workcluster category, previousHour should look as follows:
previousHour
0
3
4
4
4
5
and so on up to 144 rows, for each category.
Click here for the workcluster category.
For example, for shopping, it should look as follows:
previousHour
0
3
3
4
4
5
and so on up to 144 rows...
Click here for the shopping category.
Click here for the airport category.
Click here for the residential category.
All of them return 144 rows...
SumHour dataset:
  category           sumHour
1 Airport            2208
2 Residential area   1656
3 Shopping cluster   1656
4 Work cluster       1656
This is what I would ideally like to do in R:
#for n in 1: number of rows{
# calculate sumHours(in SumHours dataset) - previousHour = newHourSum and store it as newHourSum
# calculate hour/(newHourSum-previousHour) * Foreigners and store it as footfallHour
# add to the empty dataframe }
I don't know how to do it; this is my attempt:
mergetbl <- function(tbl1, tbl2)
{
  newtbl = data.frame(hour=numeric(),forgHour=numeric())
  ntbl1rows<-nrow(tbl1) # get the number of rows
  for(n in 1:ntbl1rows)
  {
    #for n in 1: number of rows{
    # check the previous hour from IDA dataset !!!!
    # calculate sumDate - previousHour = newHourSum and store it as newHourSum
    # calculate hour/(newHourSum-previousHour) * Foreigners and store it as footfallHour
    # add to the empty dataframe }
    newHourSum <- 3588 - tbl1
    footfallHour <- (tbl1$hour/(newHourSum-previousHour)) * tbl2$Foreigners
    newtbl <- rbind(newtbl, footfallHour)
  }
}
But nothing happens, and I do not get the newtbl I am ideally after.
Answer 0 (score: 2)
Thinking in terms of vectors gets you there.
Try this:
### This brings Foreigners/Locals to the same length as tbl1.
### It assumes the rows of tbl2 are ordered: Work cluster, Shopping cluster, Airport, Residential area.
Foreigners = ifelse(tbl1$category == "Work cluster", tbl2$Foreigners[1],
             ifelse(tbl1$category == "Shopping cluster", tbl2$Foreigners[2],
             ifelse(tbl1$category == "Airport", tbl2$Foreigners[3], tbl2$Foreigners[4])))
Locals = ifelse(tbl1$category == "Work cluster", tbl2$Locals[1],
         ifelse(tbl1$category == "Shopping cluster", tbl2$Locals[2],
         ifelse(tbl1$category == "Airport", tbl2$Locals[3], tbl2$Locals[4])))
Now, the function:
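If the row order of tbl2 ever changes, the positional indices [1] to [4] above silently pick the wrong values. A minimal sketch of a lookup by name with match(), assuming tbl2 has a category column as in the question; the small tbl1 and tbl2 built here are stand-ins copied from the sample rows, not the full data:

```r
# Stand-in data copied from the sample rows in the question (assumption:
# the real tbl1/tbl2 have at least these columns).
tbl1 <- data.frame(
  hour = c(3, 3, 3, 4, 3, 3),
  category = c("Work cluster", "Shopping cluster", "Airport",
               "Work cluster", "Residential area", "Shopping cluster"),
  stringsAsFactors = FALSE
)
tbl2 <- data.frame(
  category = c("Work cluster", "Shopping cluster", "Airport", "Residential area"),
  Foreigners = c(1600000, 1800000, 15095152, 527700),
  Locals = c(3623900, 3646666.667, 8902705, 280000),
  stringsAsFactors = FALSE
)

# For every row of tbl1, find the row of tbl2 with the same category.
idx <- match(tbl1$category, tbl2$category)

# Index tbl2 by that position: both vectors now have nrow(tbl1) elements.
Foreigners <- tbl2$Foreigners[idx]
Locals <- tbl2$Locals[idx]
```

A category in tbl1 that is missing from tbl2 shows up as NA instead of falling through to the last branch of the nested ifelse.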
### tbl2 is not used inside the function; ForeOrLoca is the Foreigners or Locals vector built above
resultHour = function(tbl1, tbl2, ForeOrLoca)
{
  ### previousHour[i] is the hour of the row before; the first row gets 0
  previousHour = rep(0, nrow(tbl1))
  for (i in 2:nrow(tbl1))
  {
    previousHour[i] = tbl1$hour[i-1]
  }
  ### The conditional sum matching the category from tbl1
  newHourSum = ifelse(tbl1$category == "Work cluster", sum(with(tbl1, hour * I(category == "Work cluster"))),
               ifelse(tbl1$category == "Shopping cluster", sum(with(tbl1, hour * I(category == "Shopping cluster"))),
               ifelse(tbl1$category == "Airport", sum(with(tbl1, hour * I(category == "Airport"))),
                      sum(with(tbl1, hour * I(category == "Residential area"))))))
  ### and finally, this
  hour = as.vector(tbl1$hour)
  footfallHour <- (hour / (newHourSum - previousHour)) * ForeOrLoca
  newtbl <- cbind(hour, footfallHour)
  return (newtbl)
}
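The loop and the nested ifelse inside the function can also be written without a loop. This is only a sketch of the same two steps, using the sample rows from the question as stand-in data: previousHour is the hour column shifted down by one row, and the per-category sum comes from base R's ave().

```r
# Stand-in data copied from the sample rows in the question.
tbl1 <- data.frame(
  hour = c(3, 3, 3, 4, 3, 3),
  category = c("Work cluster", "Shopping cluster", "Airport",
               "Work cluster", "Residential area", "Shopping cluster"),
  stringsAsFactors = FALSE
)

# Shift hour down by one row; the first row has no predecessor and gets 0.
# Unlike 2:nrow(tbl1), this also behaves for a one-row data frame.
previousHour <- c(0, head(tbl1$hour, -1))

# ave() applies sum within each category and returns one value per row.
newHourSum <- ave(tbl1$hour, tbl1$category, FUN = sum)
```

Because ave() groups by whatever categories are present, nothing needs to change if a fifth category is added later.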
Using the function (Foreigners is the vector built above):
TheResultIWant = resultHour(tbl1, tbl2, Foreigners)
This is the output I get:
> head(newtbl)
     hour footfallHour
[1,]    3    1337.7926
[2,]    3    1506.2762
[3,]    3   12631.9264
[4,]    4    1785.2162
[5,]    3     441.7132
[6,]    3    1506.2762
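As a sanity check on the formula, the six printed footfallHour values can be recomputed by hand. The sketch below assumes that newHourSum is 3588 for every one of those rows (the constant that appears in the question's attempt; the answer does not state the value it actually used) and that the rows are the first six sample rows of the question, with the Foreigners figures from the third dataset.

```r
# First six sample rows of the question (hour and category only).
hour <- c(3, 3, 3, 4, 3, 3)
category <- c("Work cluster", "Shopping cluster", "Airport",
              "Work cluster", "Residential area", "Shopping cluster")

# Foreigners per category, from the third dataset in the question.
foreigners_by_category <- c("Work cluster" = 1600000,
                            "Shopping cluster" = 1800000,
                            "Airport" = 15095152,
                            "Residential area" = 527700)
Foreigners <- unname(foreigners_by_category[category])

# Hour of the previous row, 0 for the first row.
previousHour <- c(0, head(hour, -1))

# ASSUMPTION: 3588 is borrowed from the question's code, not stated in the answer.
newHourSum <- 3588

footfallHour <- hour / (newHourSum - previousHour) * Foreigners
round(footfallHour, 4)
```

Under that assumption the rounded values agree with the printed column, for example 3 / (3588 - 0) * 1600000 for the first row.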
Answer 1 (score: 0)
For your new question.
If you cut your data frame into data frames that each contain only one category, you can use this function:
new_function_hour_result = function (tbl1_categ, vec_categ, prevHour_Categ, sumHour_Categ)
{
  hour = as.vector(tbl1_categ$hour)
  ### use the prevHour_Categ argument here, not a global previousHour
  footfallHour <- (hour / (sumHour_Categ - prevHour_Categ)) * vec_categ
  newtbl <- cbind(hour, footfallHour)
  return (newtbl)
}
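Here tbl1_categ is your data frame for a given category, vec_categ is your Foreigners or Locals data for that category, prevHour_Categ is the previousHour for that category, and finally sumHour_Categ is the sumHour for that category.
To get your vectors to the same length as the df, build them like this.
For example, vec_categ for Locals in the Airport category:
locals_airport = rep(category[3,3], times = nrow(tbl1_airport))
For Foreigners in the Airport category:
foreig_airport = rep(category[3,2], times = nrow(tbl1_airport))
This repeats the value contained in category[3,2], nrow(tbl1_airport) times.
For Locals in the Work cluster category:
locals_workcluster = rep(category[1,3], times = nrow(tbl1_workcluster))
And so on for every vector of every category (for example prevHour_Categ and sumHour_Categ).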
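The per-category steps above can be chained with split() and lapply() so that no vector has to be built by hand. This is a sketch only: it uses the sample rows from the question as stand-in data, takes Foreigners as the vector of interest, and computes previousHour within each category (first row of each category gets 0), which is one possible reading of the question.

```r
# The per-category function, with an opening brace and using the
# prevHour_Categ argument.
new_function_hour_result <- function(tbl1_categ, vec_categ, prevHour_Categ, sumHour_Categ) {
  hour <- as.vector(tbl1_categ$hour)
  footfallHour <- (hour / (sumHour_Categ - prevHour_Categ)) * vec_categ
  cbind(hour, footfallHour)
}

# Stand-in data copied from the sample rows in the question.
tbl1 <- data.frame(
  hour = c(3, 3, 3, 4, 3, 3),
  category = c("Work cluster", "Shopping cluster", "Airport",
               "Work cluster", "Residential area", "Shopping cluster"),
  stringsAsFactors = FALSE
)
tbl2 <- data.frame(
  category = c("Work cluster", "Shopping cluster", "Airport", "Residential area"),
  Foreigners = c(1600000, 1800000, 15095152, 527700),
  stringsAsFactors = FALSE
)

# One data frame per category.
by_categ <- split(tbl1, tbl1$category)

results <- lapply(names(by_categ), function(categ) {
  piece <- by_categ[[categ]]
  # Repeat the category's Foreigners value once per row of the piece.
  vec <- rep(tbl2$Foreigners[tbl2$category == categ], times = nrow(piece))
  # Previous hour within this category; 0 for its first row (assumption).
  prev <- c(0, head(piece$hour, -1))
  new_function_hour_result(piece, vec, prev, sum(piece$hour))
})
names(results) <- names(by_categ)
```

results is then a named list with one hour/footfallHour matrix per category; do.call(rbind, results) stacks them back into a single matrix if that is more convenient.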