Skip some lines with fread

Time: 2015-06-22 05:29:16

Tags: r fread

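I want to skip some rows of my data that come before the header line. How can I do this by skipping all rows before ID_REF or, if ID_REF does not exist, by checking for the pattern ILMN_ and removing all rows that do not contain {{1 }}?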


1 answer:

Answer 0 (score: 3)

On Linux, you can use awk together with fread, or pipe its output into read.table. Here, I use awk to change the delimiter to a comma:

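pth <- '/home/akrun/file.txt' # change it to your path
# Start printing at the first line that begins with ID_REF or ILMN;
# $1=$1 makes awk rebuild each record with OFS (a comma) as the separator
v1 <- sprintf("awk '/^(ID_REF|ILMN)/{ matched = 1} matched {$1=$1; print}' OFS=\",\" %s", pth)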

and read the result with fread:

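library(data.table)
# Recent data.table versions prefer cmd= for shell commands;
# older releases accepted fread(v1) directly
fread(cmd = v1)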
#         ID_REF 1688628068_A.AVG_Signal 1688628068_A.Avg_NBEADS
#1: ILMN_1343291               62821.840                     135
#2: ILMN_1343292                3255.167                     131
#3: ILMN_1343293               42924.910                     152
#4: ILMN_1343294               55255.210                     100
#   1688628068_A.BEAD_STDERR 1688628068_A.Detection_Pval
#1:                413.93990                           0
#2:                 47.76587                           0
#3:                539.30260                           0
#4:                746.14570                           0
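Where awk is not available, the same skipping can be sketched in R alone. This is a minimal example rather than the answer's method: it assumes the header line starts with ID_REF, and the file contents below are made up for illustration. It finds the header's line number with grep and passes it to fread through the skip argument.

```r
library(data.table)

# Made-up sample file: two preamble lines, then the ID_REF header and data
tmp <- tempfile(fileext = ".txt")
writeLines(c("Illumina Inc. report",
             "Normalization = none",
             "ID_REF\tAVG_Signal\tAvg_NBEADS",
             "ILMN_1343291\t62821.84\t135",
             "ILMN_1343292\t3255.167\t131"), tmp)

# Line number of the first line that starts with ID_REF
header_line <- grep("^ID_REF", readLines(tmp, warn = FALSE))[1]

# Skip everything above the header line and let fread parse the rest
dt <- fread(tmp, skip = header_line - 1L, header = TRUE)
print(dt)
```

Reading the file twice (once with readLines, once with fread) is wasteful for very large files, where the awk pipeline is likely the better choice.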

Or use read.table:

read.table(pipe(v1), header=TRUE, sep=',', check.names=FALSE)
#       ID_REF 1688628068_A.AVG_Signal 1688628068_A.Avg_NBEADS
#1 ILMN_1343291               62821.840                     135
#2 ILMN_1343292                3255.167                     131
#3 ILMN_1343293               42924.910                     152
#4 ILMN_1343294               55255.210                     100
#  1688628068_A.BEAD_STDERR 1688628068_A.Detection_Pval
#1                413.93990                           0
#2                 47.76587                           0
#3                539.30260                           0
#4                746.14570                           0
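One case from the question deserves care: when the file has no ID_REF line, the first line kept is an ILMN_ data row, and header=TRUE would turn that row into column names. The base R sketch below handles this by setting header only when the first kept line starts with ID_REF. The helper read_from_marker and the sample file are hypothetical and not part of the original answer.

```r
# Hypothetical helper, not part of the original answer
read_from_marker <- function(path, pattern = "^(ID_REF|ILMN_)") {
  lines <- readLines(path, warn = FALSE)
  start <- grep(pattern, lines)[1]      # NA when nothing matches
  if (is.na(start)) stop("no line starts with ID_REF or ILMN_")
  kept <- lines[start:length(lines)]
  # Only an ID_REF line is a header; an ILMN_ line is already data
  read.table(text = kept, header = grepl("^ID_REF", kept[1]),
             check.names = FALSE, stringsAsFactors = FALSE)
}

# Made-up file with a preamble but no ID_REF header line
tmp2 <- tempfile(fileext = ".txt")
writeLines(c("some preamble",
             "ILMN_1343291 62821.84 135",
             "ILMN_1343292 3255.167 131"), tmp2)
df <- read_from_marker(tmp2)
print(df)
```

Without a header line the columns get read.table's default names, so they would need to be renamed afterwards.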

Note: I changed the column name from 1688628068_A.Detection Pval to 1688628068_A.Detection_Pval

For some reason, the extra space causes problems with fread. With read.table it is not a problem, so the following also works with read.table:

 v2 <- sprintf("awk '/^(ID_REF|ILMN)/{ matched = 1} matched { print}' %s", pth)

 read.table(pipe(v2), header=TRUE, check.names=FALSE)
 #       ID_REF 1688628068_A.AVG_Signal 1688628068_A.Avg_NBEADS
 #1 ILMN_1343291               62821.840                     135
 #2 ILMN_1343292                3255.167                     131
 #3 ILMN_1343293               42924.910                     152
 #4 ILMN_1343294               55255.210                     100
 #  1688628068_A.BEAD_STDERR 1688628068_A.Detection_Pval
 #1                413.93990                           0
 #2                 47.76587                           0
 #3                539.30260                           0
 #4                746.14570                           0
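A possible reason for the trouble with the space, offered as an assumption that the answer does not confirm: awk splits fields on any run of blanks by default, so in v1 the space inside Detection Pval is rewritten as a comma just like a real delimiter, and the header ends up with one more field than the data rows. The sketch below mimics that re-splitting in plain R on made-up lines; that the file is tab-delimited is also an assumption.

```r
# Made-up lines, assumed tab-delimited, with a space inside the last column name
header <- "ID_REF\t1688628068_A.AVG_Signal\t1688628068_A.Detection Pval"
row    <- "ILMN_1343291\t62821.84\t0"

# Mimic awk's default field splitting plus OFS=",":
# split on runs of spaces/tabs, then rejoin with commas
resplit <- function(x) paste(strsplit(x, "[ \t]+")[[1]], collapse = ",")

# Count the comma-separated fields that result
n_fields <- function(x) length(strsplit(resplit(x), ",", fixed = TRUE)[[1]])

cat(resplit(header), "\n")
cat(resplit(row), "\n")
cat("header fields:", n_fields(header), " data fields:", n_fields(row), "\n")
```

If this is what happens, renaming the column (as the note above does) or keeping the original delimiter (as v2 does) both avoid the mismatch.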