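I have a CSV file of appointments that is 2284x11. I wrote a script that takes a list of users, builds a data frame of all the appointments each user has, and then sends it to an .Rmd file to be exported as a PDF.
When I try to do this for all users at once, I get the following error: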
Error in knitr::knit_meta_add(old_knit_meta, attr(old_knit_meta, "knit_meta_id")) :
long vectors not supported yet: ../../../../R-3.3.2/src/main/memory.c:1668
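If I run my script for only a few users at a time until I have covered the whole list, I do not get the error. This leads me to believe that nothing is wrong with the individual tables I create for each user (the largest is ~70x7), and that my script is somehow caching all of the results in an inefficient way. I followed the advice in: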
"long vectors not supported yet" error in Rmd but not in R Script
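But turning off those cache settings did not help.
Does anyone know of common ways to accidentally create a long vector? Again, the original CSV file is only 2248x11 and under 250 KB, and most of the transformations I apply are just data cleaning, subsetting, and some aggregation.
Is there a way to see what kind of data my script is storing in the background that could be causing this error?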
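To make the last question concrete, here is a minimal sketch of the kind of check I have in mind, using only base R. The helper name `inspect_workspace` is hypothetical and not part of my script. It lists every object in an environment with its top-level length and approximate size, and flags anything whose length exceeds `.Machine$integer.max` (2^31 - 1), which is the point at which R treats a vector as "long". It only sees objects in the environment it is given, so state that a package keeps internally would not show up here.

```r
# Hypothetical helper (not part of the original script): summarise every
# object in an environment by top-level length and approximate size in bytes.
inspect_workspace <- function(env = globalenv()) {
  object_names <- ls(envir = env)
  if (length(object_names) == 0) {
    return(data.frame(name = character(0), length = numeric(0),
                      bytes = numeric(0), is_long = logical(0)))
  }
  # length() of a data frame is its number of columns, not rows
  lengths <- vapply(object_names,
                    function(nm) as.numeric(length(get(nm, envir = env))),
                    numeric(1))
  bytes <- vapply(object_names,
                  function(nm) as.numeric(utils::object.size(get(nm, envir = env))),
                  numeric(1))
  report <- data.frame(
    name = object_names,
    length = lengths,
    bytes = bytes,
    # a vector is "long" once its length exceeds 2^31 - 1
    is_long = lengths > .Machine$integer.max,
    stringsAsFactors = FALSE,
    row.names = NULL
  )
  # largest objects first
  report[order(report$bytes, decreasing = TRUE), ]
}

# Usage on a throwaway environment with two made-up objects
demo_env <- new.env()
assign("small", 1:10, envir = demo_env)
assign("bigger", numeric(1e5), envir = demo_env)
print(inspect_workspace(demo_env))
```

Calling `inspect_workspace()` with no arguments between iterations of the loop would show whether anything in the global environment keeps growing.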
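Edit: here is the code I think is relevant. I already have a data frame of `sessions` (including the payment to the host), a data frame of `hosts` with some personal information, and a data frame of `tax` values based on location (which affects the host's payment).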
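The R code below filters the sessions under each host's name and sums the values in the payment column. It sends the resulting table to the .Rmd file to be exported as a PDF.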

    for (i in 1:nrow(hosts)) {
      #our hosts are paid on one of two monthly cycles
      if (hosts[i,2] == cycle) {
        #identify sessions under a host's name and that have payment associated
        hostmatch = which(sessions$Host == hosts$Name[i] & sessions$Payment != "")
        #only continue if host has sessions under their name with associated payments
        if (length(hostmatch) > 0) {
          hostsessions = sessions[hostmatch,] #filtering for matched sessions
          #just renaming the columns to look better for exported PDF
          colnames(hostsessions) <- c("Host","User","Appointment Date", "Appointment Time", "Appointment Type", "Appointment Status", "Payment to Host")
          if (all(hostsessions[,7] == "$0.00")) {
            #if all the host's recorded payments were for $0.00, skip to checking the next host
            next
          }
          else {
            #Payment is recorded as string starting with '$'. This adds up those values for a given host
            #(after the rename, $Payment reaches "Payment to Host" through partial matching)
            Addup = sum(as.numeric((sub("$","", hostsessions$Payment, fixed = TRUE))))
            if (hosts$PayTax[i] == "no") {
              payout = paste("$", format(round(Addup, 2), nsmall = 2), sep="")
              #This appends a final row that shows the host's total payout. Columns 1:5 are NAs and appear as blank cells in the PDF
              hostsessions[nrow(hostsessions)+1, c(6,7)] <- c('Total:', payout)
            }
            else {
              #modified version of the previous branch that accounts for tax, which is listed by Province in a separate table
              #Creates three rows at the end that consist largely of NAs
              hosttax <- merge(hosts[i,], tax, by = 'Province') #Only one row, so subsetting with $ returns a single vector
              hostsessions[nrow(hostsessions)+1, c(6,7)] <- c('Subtotal:', paste("$", format(round(Addup, 2), nsmall = 2), sep=""))
              #Convoluted way to get an output in the format: 'HST (13%): $10.90'
              hostsessions[nrow(hostsessions)+1, c(6,7)] <- c(paste(unlist(hosttax$TaxType), ' (', as.character((hosttax$TaxAmount - 1) * 100), '%)', sep = ''), paste("$", format(round(Addup*(hosttax$TaxAmount - 1), 2), nsmall = 2), sep=""))
              payout = paste("$", format(round(Addup*hosttax$TaxAmount, 2), nsmall = 2), sep="")
              hostsessions[nrow(hostsessions)+1, c(6,7)] <- c("Total:", payout)
            }
            #Send to Payments.Rmd to create pdf
            rmarkdown::render(input = "Payments.Rmd",
                              output_format = "pdf_document",
                              output_file = paste(hosts$Name[i]," Statement ", date, ".pdf", sep=''),
                              output_dir = "~/")
          }
        }
      }
    }

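In case the summary-row logic above is hard to follow in the middle of the loop, here is a minimal self-contained sketch of the same arithmetic in base R. The helper names (`parse_dollars`, `format_dollars`, `summary_rows`) and the sample values are hypothetical and do not appear in my actual script. It assumes, as in the loop, that the tax value is stored as a multiplier such as 1.13 for a 13% tax. It uses `sprintf()` rather than `format()` so that each amount is formatted on its own without padding.

```r
# Hypothetical helper: turn strings such as "$12.50" into numbers.
parse_dollars <- function(x) as.numeric(sub("$", "", x, fixed = TRUE))

# Hypothetical helper: format numbers as "$12.50" (two decimals, no padding).
format_dollars <- function(x) sprintf("$%.2f", round(x, 2))

# Build the summary rows for one host.
# payments:       character vector such as c("$50.00", "$25.00")
# tax_type:       label such as "HST"; ignored when no tax applies
# tax_multiplier: e.g. 1.13 for a 13% tax; NULL means no tax (assumption
#                 mirroring how TaxAmount is used in the loop)
summary_rows <- function(payments, tax_type = NULL, tax_multiplier = NULL) {
  subtotal <- sum(parse_dollars(payments))
  if (is.null(tax_multiplier)) {
    return(data.frame(label = "Total:",
                      amount = format_dollars(subtotal),
                      stringsAsFactors = FALSE))
  }
  # round the percentage so floating-point noise does not leak into the label
  tax_percent <- round((tax_multiplier - 1) * 100, 2)
  tax_label <- paste0(tax_type, " (", tax_percent, "%)")
  data.frame(
    label = c("Subtotal:", tax_label, "Total:"),
    amount = format_dollars(c(subtotal,
                              subtotal * (tax_multiplier - 1),
                              subtotal * tax_multiplier)),
    stringsAsFactors = FALSE
  )
}

# Usage with made-up payments
print(summary_rows(c("$50.00", "$50.00"), tax_type = "HST", tax_multiplier = 1.13))
print(summary_rows(c("$10.50", "$0.00")))
```

Keeping this in a function means the rows can be checked separately from the rendering step.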
The .Rmd file is below. I inserted some \usepackage statements to get past other errors I ran into, but I do not fully understand why they are needed.
```{r, include = FALSE}
payment <- paste(hosts$Name[i], " Statement: ", date)
```
---
title: "`r payment`"
output: pdf_document
classoption: landscape
header-includes:
- \usepackage{float}
- \usepackage[table]{xcolor}
- \usepackage{graphicx}
- \usepackage{booktabs}
- \usepackage{longtable}
---
```{r, echo = FALSE}
library(knitr)
library(markdown)
library(rmarkdown)
#I experimented with different ways of setting cache to false. None of them seemed to work
options(cache = FALSE, warning = FALSE,
message = FALSE, cache.lazy = FALSE, knitr.kable.NA = '')
#formats the table previously created that showed all the appointments a host had and the payment associated
kable(hostsessions, format = "latex", booktabs = T, longtable = T) %>%
  kable_styling(latex_options = c("striped", "repeat_header"), font_size = 9) %>%
  row_spec(nrow(hostsessions), bold = T)
```
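A note on the cache settings in the chunk above: `options()` sets base R options, and of the names used there only `knitr.kable.NA` is one that knitr reads from `options()`. Chunk options such as `cache`, `cache.lazy`, `warning`, and `message` are normally set through `knitr::opts_chunk$set()`. A minimal sketch of how that could look in a setup chunk, assuming knitr is installed; I have not confirmed that this changes the error.

```r
# Sketch of a setup chunk body (assumes the knitr package is installed).
# Chunk-level options go through opts_chunk$set(), not base options().
knitr::opts_chunk$set(
  cache = FALSE,       # do not cache chunk results
  cache.lazy = FALSE,  # do not lazy-load cached objects
  warning = FALSE,     # hide warnings in the output
  message = FALSE      # hide messages in the output
)

# knitr.kable.NA is a genuine global option, so options() is the right place for it
options(knitr.kable.NA = '')
```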