Question

I have a data frame that I imported from a text file:

                      dateTime contract bidPrice askPrice bidSize askSize
1 Tue Dec 14 05:12:20 PST 2010     CLF1        0        0       0       0
2 Tue Dec 14 05:12:20 PST 2010     CLG1        0        0       0       0
3 Tue Dec 14 05:12:20 PST 2010     CLH1        0        0       0       0
4 Tue Dec 14 05:12:20 PST 2010     CLJ1        0        0       0       0
5 Tue Dec 14 05:12:20 PST 2010     NGF1        0        0       0       0
6 Tue Dec 14 05:12:20 PST 2010     NGG1        0        0       0       0
  lastPrice lastSize volume
1         0        0      0
2         0        0      0
3         0        0      0
4         0        0      0
5         0        0      0
6         0        0      0

I am trying to create a subset of all rows where contract is CLF1, but I get the following error:

> clf1 <- data.frame(subset(train2, as.character(contract="CLF1")))
Error in as.character(contract = "CLF1") : 
  supplied argument name 'contract' does not match 'x'

I tried to find out how many characters are in the cell:

> f <-as.character(train2[1,2])
> nchar(f)
[1] 5

CLF1 is only four characters, so I assumed there was a leading or trailing space and tried the following:

> clf1 <- data.frame(subset(train2, as.character(contract=" CLF1")))
Error in as.character(contract = " CLF1") : 
  supplied argument name 'contract' does not match 'x'
> clf1 <- data.frame(subset(train2, as.character(contract="CLF1 ")))
Error in as.character(contract = "CLF1 ") : 
  supplied argument name 'contract' does not match 'x'

Again no luck, so here I am. Any suggestions would be great. Thanks.

Edit:

> clf1 <- subset(train2, contract == "CLF1")
> head(clf1)
[1] dateTime  contract  bidPrice  askPrice  bidSize   askSize   lastPrice lastSize 
[9] volume   
<0 rows> (or 0-length row.names)

Answer 1

I believe your problem is that you are using the wrong syntax in subset. Try this:

subset(train2, contract == "CLF1")

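That is, you should not coerce the subset expression to character, and you need the == equality operator rather than =. Inside a function call, = names an argument, which is why as.character(contract = "CLF1") complains that 'contract' does not match 'x'. Read ?subset and compare the examples there with your code.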
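A minimal sketch of the comparison, using a small made-up data frame (the name df and its values are hypothetical, not the asker's data):

```r
# Hypothetical stand-in for train2 with clean contract values
df <- data.frame(
  contract = c("CLF1", "CLG1", "CLF1"),
  volume = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# == compares element by element and returns a logical vector
is_clf1 <- df$contract == "CLF1"

# subset() keeps the rows for which the condition is TRUE
clf1 <- subset(df, contract == "CLF1")
nrow(clf1)
```

The condition is evaluated inside the data frame, so the column can be referred to by its bare name.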

That said, your field may really contain a leading space (nchar returned 5 for a four-character code), in which case try:

subset(train2, contract == " CLF1")

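Alternatively, when you read the file with read.table, you can use the strip.white argument to remove the whitespace. Note that strip.white only takes effect when sep is specified.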
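Both whitespace fixes can be sketched as follows. The data here is invented for illustration (padded and raw are hypothetical names, and the comma separator is an assumption about the file format); trimws is available in base R from version 3.2.0.

```r
# Hypothetical data with a leading space in the contract column
padded <- data.frame(
  contract = c(" CLF1", " CLG1", " CLF1"),
  volume = c(0, 0, 0),
  stringsAsFactors = FALSE
)

# The comparison fails while the space is still there
n_before <- nrow(subset(padded, contract == "CLF1"))

# Option 1: trim the column after import
padded$contract <- trimws(padded$contract)
n_after <- nrow(subset(padded, contract == "CLF1"))

# Option 2: strip whitespace at import time (requires sep to be set)
raw <- "contract,volume\n CLF1,0\n CLG1,0\n CLF1,0"
stripped <- read.table(text = raw, header = TRUE, sep = ",",
                       strip.white = TRUE, stringsAsFactors = FALSE)
n_stripped <- nrow(subset(stripped, contract == "CLF1"))
```

Trimming at import keeps the rest of the script free of whitespace handling, while trimws is handy when the data frame is already loaded.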
