Question

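I can't figure out how to use the startsWith and endsWith functions in SparkR 2.3.0. I assumed I could use them like dplyr's starts_with, as shown below, but I get an error. I would appreciate it if someone could show me how.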

> df <- read.df("/hadoop/tmp/iris.csv", "csv", header = "true")
> showDF(select(df, startsWith(columns(df), "Sepal")))
Error in (function (classes, fdef, mtable)  :
  unable to find an inherited method for function 'select' for signature '"SparkDataFrame", "logical"'

Answer 1

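The startsWith and endsWith functions operate on columns, not on data frames. In your call, columns(df) is a character vector of column names, so startsWith returns a logical vector, and select has no method that accepts a logical vector. That is what the error message is telling you.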

To do the selection you were attempting, you can use:

df <- as.DataFrame(iris)
df_sepal <- select(df, names(df)[grepl("Sepal", names(df))])
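Note that grepl("Sepal", ...) matches "Sepal" anywhere in the name. If you want a strict prefix match, closer to dplyr's starts_with, one option is to apply startsWith to the character vector of names and use the result as an index. A minimal sketch in plain R follows; the column names are hard-coded under the assumption that as.DataFrame(iris) replaces the dots in the original names with underscores, and the final select call is left as a comment because it needs a running Spark session.

```r
# Column names as SparkR is expected to report them for as.DataFrame(iris)
# (assumption: dots in the original names are replaced with underscores).
cols <- c("Sepal_Length", "Sepal_Width", "Petal_Length", "Petal_Width", "Species")

# startsWith() on a character vector returns one logical value per name.
is_sepal <- startsWith(cols, "Sepal")

# Index the names with that logical vector to get the matching names.
sepal_cols <- cols[is_sepal]
print(sepal_cols)

# With a live SparkDataFrame `df`, pass the names (not the logicals) to select():
# df_sepal <- select(df, sepal_cols)
```

The point is that select wants column names or Column objects, so the logical vector has to be turned back into names before it is passed in.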

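To use startsWith(), pass the column as the first argument and the string to check for as the second. For example,

df_v <- filter(df, startsWith(df$Species, "v"))

keeps only the rows where Species starts with 'v' (versicolor, virginica), and

df_a <- filter(df, endsWith(df$Species, "a"))

keeps only the rows where Species ends with 'a' (setosa, virginica). The comparison with TRUE is not needed, because startsWith and endsWith already return a boolean column that filter can use directly.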
