I am using sqldf to subset a huge file. The following command gives me a data.frame with 100 rows and 42 columns.
first <- read.csv.sql("first.txt", sep = " ", header = TRUE, row.names = FALSE,
sql = "SELECT * FROM file WHERE n = '\"n63\"' AND ratio = 1 AND r_name = '\"r1\"' AND method = '\"nearest\"' AND variables = 10")
The structure of the object is
'data.frame': 100 obs. of 42 variables:
$ test_before : chr "TRUE" "TRUE" "TRUE" "TRUE" ...
$ test_after : chr "TRUE" "TRUE" "TRUE" "TRUE" ...
$ meanPSmatchRATIO : chr "1.54845330373635" "1.16857102212364" "1.25330045961256" "1.8011651466717" ...
snipped intervening normally printed columns
$ PSdiff_DIFF : chr "-0.0103938442562762" "-0.00935228868105753" "-0.00947571480267878"
snipped intervening normally printed columns
$ nUNMATCHt : chr "0" "0" "0" "0" ...
$ caliper : chr "\"no\"" "\"no\"" "\"no\"" "\"no\"" ...
$ method : chr "\"nearest\"" "\"nearest\"" "\"nearest\"" "\"nearest\"" ...
$ r_name : chr "\"r1\"" "\"r1\"" "\"r1\"" "\"r1\"" ...
$ ratio : int 1 1 1 1 1 1 1 1 1 1 ...
$ n : chr "\"n63\"" "\"n63\"" "\"n63\"" "\"n63\"" ...
$ variables : int 10 10 10 10 10 10 10 10 10 10 ...
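Now, based on this, you would expect that when I print the data.frame, all columns (except the int ones) would be shown as character (enclosed in ""). But you'd be wrong!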
test_before test_after meanPSmatchRATIO del- nUNMATCHt caliper method r_name ratio n variables
1 TRUE TRUE 1.54845330373635 eted 0 "no" "nearest" "r1" 1 "n63" 10
2 TRUE TRUE 1.16857102212364 ... 0 "no" "nearest" "r1" 1 "n63" 10
3 TRUE TRUE 1.25330045961256 ... 0 "no" "nearest" "r1" 1 "n63" 10
4 TRUE TRUE 1.8011651466717 ...t 0 "no" "nearest" "r1" 1 "n63" 10
Notice that only the last few columns are printed as "character". I'm a bit lost as to what is going on. Can someone explain?
Answer 0: (score: 5)
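Looks fine to me. print.data.frame does not normally print quotes around character columns, but the values in the last few columns contain embedded quote characters, which is why they appear "quoted" in the default output.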
Data <- data.frame(x=1:5,y=as.character(1:5),
z=letters[1:5], q=paste("\"",letters[1:5],"\"",sep=""))
print(Data) # default print
# x y z q
# 1 1 1 a "a"
# 2 2 2 b "b"
# 3 3 3 c "c"
# 4 4 4 d "d"
# 5 5 5 e "e"
print(Data, quote=TRUE) # show embedded quotes
# x y z q
# 1 "1" "1" "a" "\"a\""
# 2 "2" "2" "b" "\"b\""
# 3 "3" "3" "c" "\"c\""
# 4 "4" "4" "d" "\"d\""
# 5 "5" "5" "e" "\"e\""
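If the embedded quotes are unwanted, one possible cleanup is to strip them from every character column after the import. This is only a minimal sketch under assumptions: the small Data frame below is a hypothetical stand-in for the real query result, and it assumes that no column contains a double quote that should be kept.

```r
# Hypothetical stand-in for the imported data: q carries embedded quotes
Data <- data.frame(x = 1:5,
                   q = paste0("\"", letters[1:5], "\""),
                   stringsAsFactors = FALSE)

# Find the character columns and remove every literal double quote from them
is_chr <- vapply(Data, is.character, logical(1))
Data[is_chr] <- lapply(Data[is_chr],
                       function(col) gsub("\"", "", col, fixed = TRUE))

print(Data)  # q is now shown without quotes under the default print method
```

Columns such as test_before or meanPSmatchRATIO would still be character after this step; converting them to logical or numeric is a separate decision.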
Answer 1: (score: 2)
You are seeing the default behaviour of the print method for data frame objects. See ?print.data.frame, which has:
quote: logical, indicating whether or not entries should be printed
with surrounding quotes.
So if you want the object printed with quotes, use quote = TRUE. E.g.:
> dat <- data.frame(X = c("A","B"), Y = c("1","2"), stringsAsFactors = FALSE)
> dat
X Y
1 A 1
2 B 2
> dat[,1] ## not using the data frame print method...
[1] "A" "B"
> print(dat, quote = TRUE)
X Y
1 "A" "1"
2 "B" "2"
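Edit: Regarding @Roman's comment, the columns that are printed with quotes contain embedded quotes in the data itself. For example, the first element of caliper is "\"no\"", so what is being printed are the embedded quote characters, which is entirely consistent with the default behaviour of print.data.frame().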
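A quick way to check that the quotes are part of the data rather than added by printing is to count characters. The vectors below are hypothetical stand-ins for the caliper column and for a plain character column, not the original data.

```r
# Hypothetical stand-ins: one value with embedded quotes, one without
caliper <- c("\"no\"", "\"no\"")
plain   <- c("no", "no")

nchar(caliper)  # 4 4: the two quote characters are stored in each string
nchar(plain)    # 2 2

# Flag the elements that contain a literal double quote
grepl("\"", caliper, fixed = TRUE)  # TRUE TRUE
grepl("\"", plain, fixed = TRUE)    # FALSE FALSE
```

Applied to the real data, a check such as grepl("\"", first$caliper, fixed = TRUE) should show which values carry embedded quotes.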