I have a data frame, say payroll, like this:
payroll <- read.table(text="
AgencyName Rate PayBasis Status NumRate
HousingAuthority $26,843.00 Annual Full-Time 26843.00
HousingAuthority $14,970.00 ProratedAnnual Part-Time 14970.00
HousingAuthority $26,843.00 Annual Full-Time 26843.00
HousingAuthority $14,970.00 ProratedAnnual Part-Time 14970.00
HousingAuthority $13.50 Hourly Part-Time 13.50
HousingAuthority $14,970.00 ProratedAnnual Part-Time 14970.00
HousingAuthority $26,843.00 Annual Full-Time 26843.00", header = TRUE)
NumRate is actually numeric:
payroll$NumRate <- as.numeric(payroll$NumRate)
I want to find the maximum, minimum, and average pay for each PayBasis. I hoped this would work:
ddply(payroll, "PayBasis", summarize)
But I get an error: Error: length(rows) == 1 is not TRUE
What am I missing here?
Answer 0 (score: 4)
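Probably because you mistook summarize for summary (which would not work the way you expect in this case either). summarize takes named expressions, such as mx = max(NumRate), that say what to compute for each group. You probably want: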
library(plyr)
ddply(payroll, "PayBasis", summarize, mx = max(NumRate), mn = min(NumRate), avg = mean(NumRate))
        PayBasis      mx      mn     avg
1         Annual 26843.0 26843.0 26843.0
2         Hourly    13.5    13.5    13.5
3 ProratedAnnual 14970.0 14970.0 14970.0
Be sure to look carefully at the examples in ?summarize and ?ddply.
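As a cross-check that does not depend on plyr, the same split-apply-combine idea can be sketched in base R. This is only an illustration, not what ddply does internally: the names by_basis and stats are made up for the example, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the data frame from the question (values copied verbatim).
payroll <- read.table(text = "
AgencyName Rate PayBasis Status NumRate
HousingAuthority $26,843.00 Annual Full-Time 26843.00
HousingAuthority $14,970.00 ProratedAnnual Part-Time 14970.00
HousingAuthority $26,843.00 Annual Full-Time 26843.00
HousingAuthority $14,970.00 ProratedAnnual Part-Time 14970.00
HousingAuthority $13.50 Hourly Part-Time 13.50
HousingAuthority $14,970.00 ProratedAnnual Part-Time 14970.00
HousingAuthority $26,843.00 Annual Full-Time 26843.00", header = TRUE)

# Split: one numeric vector of NumRate values per PayBasis level.
# (by_basis is a hypothetical name used only in this sketch.)
by_basis <- split(payroll$NumRate, payroll$PayBasis)

# Apply and combine: one summary value per group, gathered into a data frame.
stats <- data.frame(
  PayBasis = names(by_basis),
  mx  = vapply(by_basis, max,  numeric(1), USE.NAMES = FALSE),
  mn  = vapply(by_basis, min,  numeric(1), USE.NAMES = FALSE),
  avg = vapply(by_basis, mean, numeric(1), USE.NAMES = FALSE)
)

print(stats)
```

If the plyr call and this sketch disagree, that usually points to NumRate not being numeric (for example, read in as text or a factor), which is worth checking with str(payroll).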