Question

I have a txt file with 4 columns separated by tabs (\t).

When I read it this way:

A<-read.df(sqlContext,"/home/daniele/Tnt3.txt", "com.databricks.spark.csv")

SparkR reads each row as a single column:

 a\tb\tc\td

How can I change the \t to , in SparkR?

(I know I can change it manually with sed -i 's/\t/,/g' file, but that is a little slow.)

Answer 1

a <- read.df(sqlContext, "/home/daniele/Tnt3.txt", "com.databricks.spark.csv", delimiter="\t")
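For context, here is a slightly fuller sketch of the same call. Extra named arguments to read.df are passed through to the data source as options, which is how delimiter reaches spark-csv. The header setting and the printSchema/head checks are my additions, not part of the question: they assume the file has no header row and that sqlContext was created with the spark-csv package available.

```r
# Sketch only: assumes SparkR 1.x, an existing sqlContext, and the
# spark-csv package on the classpath.
df <- read.df(sqlContext, "/home/daniele/Tnt3.txt",
              source = "com.databricks.spark.csv",
              delimiter = "\t",   # split fields on tabs instead of commas
              header = "false")   # assumption: the file has no header row

printSchema(df)  # check that four columns were parsed rather than one
head(df)         # look at the first few rows
```

This reads the file in place, so the sed pass over the file is not needed.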

Answer 2

You should specify the delimiter.

I'm new to R, but I think it is something like this:

A <- read.df(sqlContext, "/home/daniele/Tnt3.txt", "com.databricks.spark.csv", delimiter = "\t")

For more info, see the spark-csv project page.