SparkR df read as one column

Time: 2016-02-12 19:31:57

Tags: csv apache-spark sparkr

I have a text file with 4 columns separated by \t.

When I read it in this way:

A<-read.df(sqlContext,"/home/daniele/Tnt3.txt", "com.databricks.spark.csv")

SparkR reads it all as one column:

 a\tb\tc\td

How can I change the \t to , in SparkR?

(I know that I can change it manually with sed -i 's/\t/,/g' file, but that is a little slow.)

2 answers:

Answer 0 (score: 1)

a <- read.df(sqlContext, "/home/daniele/Tnt3.txt", "com.databricks.spark.csv", delimiter="\t")
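To check that the delimiter option takes effect, the call can be tried on a small throwaway file. This is only a sketch under assumptions: it assumes SparkR 1.x running in local mode, the spark-csv package coordinates and version are illustrative, and the temporary file stands in for the real Tnt3.txt.

```r
library(SparkR)

# Assumption: SparkR 1.x; the spark-csv coordinates/version below are illustrative.
sc <- sparkR.init(sparkPackages = "com.databricks:spark-csv_2.10:1.3.0")
sqlContext <- sparkRSQL.init(sc)

# Hypothetical sample file: a single tab-separated row with four fields.
path <- tempfile(fileext = ".txt")
writeLines("a\tb\tc\td", path)

# Options such as delimiter and header are passed as named arguments.
df <- read.df(sqlContext, path, source = "com.databricks.spark.csv",
              delimiter = "\t", header = "false")

printSchema(df)                       # shows the parsed columns
stopifnot(length(columns(df)) == 4)   # four columns instead of one

sparkR.stop()
```

If the file has a header row, passing header = "true" should make spark-csv take the column names from it.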

Answer 1 (score: 0)

You should specify the delimiter.

I'm new to R, but I think it is something like this:

A <- read.df(sqlContext, "/home/daniele/Tnt3.txt", source = "com.databricks.spark.csv", delimiter = "\t")

For more information, see the spark-csv page:

https://github.com/databricks/spark-csv