Question

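I have a file.txt containing long lines (of differing lengths) of integers separated by spaces. Each line represents an array. I need to read them into R vectors, compute their medians, collect those medians in another R vector, then plot it and return the minimum.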

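I am having trouble reading the integers from a line into a vector. Or should I use some other structure here? Do I have to specify the number of lines, or can I use a loop that runs until EOF?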

Can anyone give me an example of how to do this? Many thanks.

Answer 1

Try this:

file<-read.table(file.choose(),dec=".",sep=" ",header=TRUE);
apply(file,1,median)

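file.choose() opens a file dialog and lets you pick the file (it may cause some problems on Macintosh). dec is the decimal symbol (usually . or ,). sep is the field separator ("," for CSV, " " for a space, as in your case).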

apply is a function that applies the same function to every row or every column; you pass 1 for rows and 2 for columns.
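As a small illustration of the two margins, here is a sketch on a made-up matrix (the values and the names m, row_medians and col_medians are assumptions, not part of the original data):

```r
# Toy matrix standing in for data read with read.table (values are made up)
m <- matrix(c(1, 2, 3,
              4, 5, 60), nrow = 2, byrow = TRUE)

row_medians <- apply(m, 1, median)  # MARGIN = 1: one result per row
col_medians <- apply(m, 2, median)  # MARGIN = 2: one result per column

print(row_medians)  # 2 5
print(col_medians)  # 2.5 3.5 31.5
```

For the question as asked, each line of the file is one array, so the row margin (1) is the one that applies.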

header: set it to TRUE if your file has a header row.

Your case:

file<-read.table('https://pastebin.com/raw/rXaEXAtv')
medians<-apply(file,1,median)
plot(medians)
min(medians)

As Axeman pointed out, this only works if every line has the same number of columns, i.e. if the lines can be arranged as a data frame.
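One way to check this up front is to count the fields on each line before choosing an approach. This is only a sketch: the temporary file and its contents are hypothetical stand-ins for the real file.txt.

```r
# Hypothetical sample file: the second line is shorter than the others
path <- tempfile(fileext = ".txt")
writeLines(c("1 2 3 4", "5 6", "7 8 9 10"), path)

n_fields <- count.fields(path, sep = " ")     # number of fields on each line
same_width <- length(unique(n_fields)) == 1   # TRUE only if all lines match

print(n_fields)
print(same_width)
```

If same_width is FALSE, the lines do not fit a rectangular data frame and a list of vectors is the more natural structure.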

Edit: the case where the number of columns differs between lines

file<-file('https://pastebin.com/raw/rXaEXAtv',open="rt") # open a connection to the file
nFields <- count.fields(file) # number of fields on each row
n=length(nFields) # number of rows
close(file) # close the connection so we can start again from the top of the file
#(seek is broken on Windows)

file<-file('https://pastebin.com/raw/rXaEXAtv',open="rt") # reopen the connection; the pointer is at the first row again
data<-list() # initialise the list that will hold one vector per row
scan(file,what=1,nlines=1,sep=" ") # skip the first, blank row

for(i in 1:(n) ){
  data[[i]]=scan(file,what=1,nlines=1,sep=" ") # read one line per iteration
}
close(file)


medians<-unlist(lapply(data,median))
plot(medians)
min(medians)

Answer 2

If the lines all contain the same number of elements, read.table() works well and is the quickest way. If not, this is probably the simplest approach:

a<-paste(readLines("asdf.txt"),collapse=" ") # read the data and join it into one long character string
b<-strsplit(a,split=" ") # split the string on single spaces
b<-as.integer(b[[1]]) # convert the pieces to integers
str(b)
# int [1:522] -3 -5 -2 3 6 3 -1 -2 -2 -2 ...
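Note that the snippet above merges every line into a single vector. If one vector per line is wanted, as in the question, a possible variation is to parse each line separately. This is a sketch under assumptions: the temporary file, its contents, and the names lines and vectors are made up for illustration.

```r
# Hypothetical sample file with lines of different lengths
path <- tempfile(fileext = ".txt")
writeLines(c("-3 -5 -2 3", "6 3 -1", "-2 -2 -2 0 4"), path)

lines <- readLines(path)

# Parse each line on its own; scan(text = ...) splits on any whitespace
vectors <- lapply(lines, function(l) scan(text = l, what = integer(), quiet = TRUE))

print(lengths(vectors))  # 4 3 5
```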

Answer 3

Read all the lines (make sure to add an empty line at the end of the file):

allLines <- readLines(con = 'file.txt', n = -1)

Tokenize each line on whitespace:

tokenize <- strsplit(allLines,split = ' ')

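If you want the result as a matrix (sapply only returns a matrix when every line has the same number of elements, with one column per line; otherwise it returns a list):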

# Matrix
as_matrix <- sapply(tokenize, FUN = function(x) {as.integer(unlist(x))} )

If you want the result as a list, with one integer vector per line:

# as a list of integer vectors
a <- lapply(tokenize, FUN = function(x) {as.integer(unlist(x))} )
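From a list like this, the remaining steps in the question (medians, plot, minimum) could look roughly as follows. The list contents here are hypothetical, and the plot is only drawn in an interactive session.

```r
# Hypothetical parsed data: one integer vector per line of the file
a <- list(c(3L, 1L, 2L), c(10L, 20L), c(-5L, 0L, 5L, 7L))

# One median per line, collected into a numeric vector
medians <- vapply(a, median, numeric(1))

if (interactive()) plot(medians)  # draw the medians when a display is available

print(medians)       # 2.0 15.0 2.5
print(min(medians))  # 2
```

vapply is used instead of sapply so that the result is guaranteed to be a numeric vector, even when the list is empty.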

R: reading lines from a .txt file into int vectors
