Question

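I am trying to subset a file of 2,075,259 rows by running grep through pipe() inside read.table(). In this case the search query is two consecutive dates. However, there seems to be something wrong with my grep syntax: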

download.file("https://d396qusza40orc.cloudfront.net/exdata%2Fdata%2Fhousehold_power_consumption.zip",
 "file.zip")
unzip("file.zip")

data <- read.table(pipe('grep "^[1-2]/2/2007" "household_power_consumption.txt"'))

Error in read.table(pipe("grep \"^[1-2]/2/2007\" \"household_power_consumption.txt\"")) : 
  no lines available in input

I have been able to do this with findstr (see below). That works fine, except that I want the code to work outside Windows, and I believe findstr is Windows-specific.

data <- read.table(pipe("findstr /B /R ^[1-2]/2/2007 household_power_consumption.txt"))  ## this works

Any ideas? (FYI, I know I can also use arguments such as sep = ";", na.strings = "?", and colClasses to speed up the data load.)
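For comparison, here is a minimal sketch of a platform-independent fallback that does the filtering in R itself, so it does not depend on grep or findstr being available to the shell that pipe() invokes. It does not diagnose the grep error above. The helper name read_matching is hypothetical, not part of base R. The sketch assumes the whole file fits in memory, and the header line is dropped by the filter, so columns get default names (V1, V2, ...).

```r
# Hypothetical helper (not a base R function): filter lines inside R
# instead of shelling out to grep/findstr.
read_matching <- function(path, pattern, ...) {
  lines <- readLines(path)            # reads the entire file into memory
  keep <- grepl(pattern, lines)       # TRUE for lines matching the regex
  read.table(text = lines[keep], ...) # parse only the matching lines
}

# Usage, guarded so the snippet runs even if the file is absent.
if (file.exists("household_power_consumption.txt")) {
  data <- read_matching("household_power_consumption.txt",
                        "^[1-2]/2/2007",
                        sep = ";", na.strings = "?")
}
```

This trades speed and memory for portability: every line is read before filtering, whereas the pipe() approach only hands the matching lines to R.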

Subsetting error using grep, pipe(), and read.table() - initial data load

0 answers: