I have a txt file like:
"cd_solicitud""nu_cuit""cd_provincia""tx_provincia"
"9531""203128827"18"Salta"
"9541""272477419"9"Entre Ríos"
"9571""273065780"2"Buenos Aires"
"6331""233703594"7"Córdoba"
"6351""272442465"5"Chaco"
I am trying to read it with:
prov_nos<-read.table("C:/.../prov_demo.txt",
header=T, quote = "\"")
But I get the following error:
Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : line 1 doesn't have 4 elements
Answer 0 (score: 4)
As I sketched out in my comment, some variation on this:
l <- readLines("~/Desktop/scratch/no_delim.txt")
foo <- function(line){
line <- strsplit(line,"\"")[[1]]
line <- line[nchar(line) > 0]
line
}
l <- lapply(l,foo)
> setNames(as.data.frame(do.call(rbind,l[-1])),l[[1]])
cd_solicitud nu_cuit cd_provincia tx_provincia
1 9531 203128827 18 Salta
2 9541 272477419 9 Entre Ríos
3 9571 273065780 2 Buenos Aires
4 6331 233703594 7 Córdoba
5 6351 272442465 5 Chaco
I say "some variation" because if your file contains other odd characters, odd quoting, or other gotchas, you may need to adjust the splitting and cleanup to handle them.
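One such adjustment, as a rough sketch rather than a definitive fix: check that every line splits into the same number of fields as the header before calling rbind, and convert the numeric-looking columns afterwards. The names split_line, fields, bad and prov are made up for this illustration, and the two data rows are copied from the question in place of a readLines() call on the real file.

```r
# Sample lines standing in for readLines() on the real file (assumption).
l <- c('"cd_solicitud""nu_cuit""cd_provincia""tx_provincia"',
       '"9531""203128827"18"Salta"',
       '"6351""272442465"5"Chaco"')

# Split on the quote character and drop the empty pieces, as foo() does above.
split_line <- function(line) {
  parts <- strsplit(line, "\"", fixed = TRUE)[[1]]
  parts[nchar(parts) > 0]
}

fields <- lapply(l, split_line)

# Every line should yield as many fields as the header line.
n_fields <- lengths(fields)
bad <- which(n_fields != n_fields[1])
if (length(bad) > 0) {
  stop("unexpected field count on line(s): ", paste(bad, collapse = ", "))
}

# Bind the data rows and use the header fields as column names.
prov <- setNames(as.data.frame(do.call(rbind, fields[-1]),
                               stringsAsFactors = FALSE),
                 fields[[1]])

# Convert columns that look numeric; text columns stay character.
prov[] <- lapply(prov, type.convert, as.is = TRUE)
str(prov)
```

Stopping on a mismatched line is a deliberate choice here: rbind would otherwise recycle a short row and hide the problem. Note that a genuinely empty field would be dropped by the nchar() filter, so it would show up as a field-count error rather than as a missing value.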
Answer 1 (score: 2)
You can hack it together if you read it in with readLines and then use strsplit to separate the elements of each row. It's not pretty, but then neither is the data's format:
the_text <- '"cd_solicitud""nu_cuit""cd_provincia""tx_provincia"
"9531""203128827"18"Salta"
"9541""272477419"9"Entre Ríos"
"9571""273065780"2"Buenos Aires"
"6331""233703594"7"Córdoba"
"6351""272442465"5"Chaco"'
the_text <- readLines(textConnection(the_text))
df <- data.frame(do.call(rbind, strsplit(the_text[-1], '"+')))
names(df) <- strsplit(the_text[1], '"+')[[1]]
df[,1] <- NULL
df
# cd_solicitud nu_cuit cd_provincia tx_provincia
# 1 9531 203128827 18 Salta
# 2 9541 272477419 9 Entre Ríos
# 3 9571 273065780 2 Buenos Aires
# 4 6331 233703594 7 Córdoba
# 5 6351 272442465 5 Chaco
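A variation on the same idea, sketched under the assumption that no field contains a tab and no field is empty: rewrite each run of quotes as a tab and hand the result to read.table, which then also guesses the column types. The names trimmed, tabbed and df2 are made up for this example, and only two of the data rows are repeated to keep it short.

```r
the_text <- c('"cd_solicitud""nu_cuit""cd_provincia""tx_provincia"',
              '"9531""203128827"18"Salta"',
              '"6351""272442465"5"Chaco"')

# Drop the quotes at the start and end of each line.
trimmed <- gsub('^"+|"+$', '', the_text)

# Every remaining run of quotes marks a field boundary; make it a tab.
tabbed <- gsub('"+', '\t', trimmed)

# Quoting is switched off because the quotes are already gone.
df2 <- read.table(text = tabbed, sep = "\t", header = TRUE,
                  quote = "", stringsAsFactors = FALSE)
str(df2)
```

Because the separator is a tab, values containing spaces such as "Buenos Aires" stay in one column.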