How to remove white space from a data frame in R, when importing from SPSS

Time: 2015-06-15 14:31:31

Tags: r dataframe spss

I'm using read.spss in the "foreign" package to read in a .sav file to R.

This is data from an online survey. However, the results (via the SPSS file) contain large areas of white space in some fields (I assume from free-text entry fields on the online form). These appear when I use write.csv.

For reference, this is the code I'm using:

dataset <- read.spss(file.choose(), to.data.frame=TRUE)

csv <- write.csv(dataset, file=file.choose(), append=FALSE, na="NA", row.names=FALSE, fileEncoding="UTF-8") 

Can I adjust this to replace the whitespace in the data frame with NA for my final csv output?

3 Answers:

Answer 0: (score: 0)

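Solved: I found that using the memisc package and replacing the original read.spss call with

library(memisc)
dataset <- as.data.set(spss.system.file(file.choose()))

or

dataset <- as.data.set(spss.portable.file(file.choose()))

avoids importing the long strings of whitespace. More info: Read SPSS file into R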

Answer 1: (score: 0)

# if your data.frame object is `x`
library(stringr)

# convert all factor columns to character
facs <- sapply( x , is.factor )
x[ facs ] <- sapply( x[ facs ] , as.character )

# trim all character columns,
# removing leading and trailing whitespace
chars <- sapply( x , is.character )
x[ chars ] <- sapply( x[ chars ] , str_trim )
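
To also get NA in the final CSV, as the question asks, the trimmed strings that end up empty can be blanked out. This is a minimal base-R sketch, assuming R 3.2.0 or later for trimws(); the data frame `x` and its columns are made up for illustration and stand in for the imported data.

```r
# Toy data frame standing in for the imported SPSS data (hypothetical columns)
x <- data.frame(
  id      = 1:3,
  comment = c("good   ", "      ", "  fine"),
  stringsAsFactors = TRUE
)

# Convert factor columns to character, keeping the data frame structure
facs <- vapply(x, is.factor, logical(1))
x[facs] <- lapply(x[facs], as.character)

# Trim leading/trailing whitespace, then turn empty strings into NA
chars <- vapply(x, is.character, logical(1))
x[chars] <- lapply(x[chars], function(col) {
  col <- trimws(col)
  col[!is.na(col) & col == ""] <- NA_character_
  col
})

# write.csv() writes missing values as NA by default (the `na` argument)
write.csv(x, file = tempfile(fileext = ".csv"), row.names = FALSE)
```

Fields that contained only whitespace become NA, while fields with real text keep that text minus the padding.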

Answer 2: (score: -1)

I think there is a small mistake:

x[ facs ] <- sapply( x[ facs ] , as.character )

should be:

x[ facs ] <- lapply( x[ facs ] , as.character )

that is, lapply instead of sapply.

(I don't know why; I have only been learning the apply functions for a few days.)
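
The difference between the two calls is easy to see on a toy data frame. This is only an illustrative sketch with made-up data: sapply() simplifies its result to a character matrix (or to a plain named vector when there is a single row), while lapply() always returns a list with one element per column, which matches how a data frame stores its columns.

```r
# Made-up data frame with two factor columns
x <- data.frame(a = c("u ", "v "), b = c(" p", " q"), stringsAsFactors = TRUE)
facs <- vapply(x, is.factor, logical(1))

s <- sapply(x[facs], as.character)  # simplified to a 2 x 2 character matrix
l <- lapply(x[facs], as.character)  # list of two character vectors

is.matrix(s)
is.list(l)

# With a single row, sapply() collapses further to a named character vector
s1 <- sapply(x[1, facs], as.character)
is.matrix(s1)
```

Because the shape of the sapply() result depends on the number of rows, lapply() is the more predictable choice when assigning back into data frame columns.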