Question

I am using:

R version 3.0.0 (2013-04-03) -- "Masked Marvel"
Platform: x86_64-pc-linux-gnu (64-bit)

I am trying to use read.csv to enter a CSV data snippet, including its header, directly from the terminal.

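The problem I'm running into may be related to "R skips lines from /dev/stdin" and "read.csv, header on first line, skip second line", but it is different enough (the answers there don't explain what I'm seeing here) to warrant a separate question.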

R seems to skip the header line and treat the second (data) line as the header:

R> d <- read.csv(file='/dev/stdin', header=TRUE) 
a,b
1,2
3,4
# hit CTRL-D twice here to end the input
# (this is also unexpected:
#  when reading a few lines interactively in bash, one CTRL-D suffices.
#  Why is doing it twice necessary in R?)

R> d
  X1 X2
1  3  4

R> colnames(d)
[1] "X1" "X2"

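I found a workaround: since read.csv has blank.lines.skip = TRUE by default, I prefix the input with some blank lines. Five empty lines before the start of the input seem to be the minimum needed for this to work as expected. BTW: a single line containing 5 spaces works just as well, which hints that about 5 bytes (or more) of whitespace padding are needed: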

R> d <- read.csv(file='/dev/stdin', header=TRUE)





a,b
1,2
3,4
# Enter CTRL-D twice here to mark the end of terminal input

R> d
  a b
1 1 2
2 3 4

R> colnames(d)
[1] "a" "b"

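Questions:

Why does the first example not work as expected?
Why are 5 blank lines or spaces needed (even 4 are not enough) to make it work?
Is there a better way to read short CSV snippets directly from the terminal? (I know about scan and readLines, but my data is already in CSV format, so I want to keep reading/parsing/assigning it as simple as possible.)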

Answer 1

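I think the answer in the first link you posted may actually apply here. R seems to create a 4-byte buffer on /dev/stdin. Also, as mentioned in the comments, you can use stdin instead, and it appears to work fine. (Although I still don't understand why you have to press Ctrl+D twice.)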

d <- read.csv(file='stdin', header=TRUE)
a,b
1,2
3,4
# Hit Control+D twice.
> d
  a b
1 1 2
2 3 4
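
For very short snippets, one possible alternative (not part of the answer above, just a sketch) is to skip the terminal input entirely and hand the CSV to read.csv as a string through its text argument. The variable name csv_snippet is made up for illustration; the data is the same three lines used in the question.

```r
# Hypothetical variable holding the CSV snippet as a single string;
# the embedded newlines separate the header and the data rows.
csv_snippet <- "a,b
1,2
3,4"

# read.csv() parses the string via its `text` argument instead of a file,
# so no stdin buffering or Ctrl+D handling is involved.
d <- read.csv(text = csv_snippet, header = TRUE)

print(d)
print(colnames(d))
```

This should give the same two-row data frame with columns a and b as the stdin version, at the cost of having to quote the snippet inside the R session.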

How to read input from the terminal using /dev/stdin and read.csv()?
