R: read.table separator every XX characters

Time: 2018-11-10 18:17:06

Tags: r separator read.table

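I can't find a solution to my problem.

I'm trying to read a text file with R. The file contains a single line, and the values are delimited by character count rather than by a separator character.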

000341656.0000000000000000004.6000000000000000009.0000000000000000050.9566787004000000052.0000000000000000072.8621215573000000007.0000000000000000050.0361010830000000047.2490974729000000054.5560183531000000006.0000000000000000049.9711191336000000047.0397111913000000043.1488475260000000023.0000000000000000046.6281588448000000040.1516245487000000038.4653540241000000002.0000000000000000046.2129963899000000041.9963898917000000037.3850068798000000030.0000000000000000046.0144404332000000040.0324909747000000027.0930952140000000003.0000000000000000043.3971119134000000032.4801444043000000010.4757238771

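The first value is a 20-character float: 9 digits, a decimal point, and 10 decimal digits.

The file contains 22 to 30 values, each 20 characters long (the decimal separator is ".").

I can't figure out how to get rid of these extra zeros.

Any hints are highly appreciated.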

2 answers:

Answer 0: (score: 0)

You can use read.fwf to read the data in fixed-width format:

> read.fwf("./d.txt", widths=rep(20,30))
      V1  V2 V3       V4 V5       V6 V7      V8      V9      V10 V11      V12
1 341656 4.6  9 50.95668 52 72.86212  7 50.0361 47.2491 54.55602   6 49.97112
2 341656 4.6  9 50.95668 52 72.86212  7 50.0361 47.2491 54.55602   6 49.97112
       V13      V14 V15      V16      V17      V18 V19    V20      V21      V22
1 47.03971 43.14885  23 46.62816 40.15162 38.46535   2 46.213 41.99639 37.38501
2 47.03971 43.14885  23 46.62816 40.15162 38.46535   2 46.213 41.99639 37.38501
  V23      V24      V25     V26 V27      V28      V29      V30
1  30 46.01444 40.03249 27.0931   3 43.39711 32.48014 10.47572
2  30 46.01444 40.03249 27.0931   3 43.39711 32.48014 10.47572

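You need to know how many fields there are and how wide each one is. You didn't say how many lines the file has, so I copied your line twice (hence the duplicate rows).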
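Since the question says the number of values varies between 22 and 30, the field count could also be derived from the length of the line instead of being hard-coded. The following is only a sketch under that assumption: the temporary file and the shortened three-value sample line are made up for illustration, and the names `path`, `width`, and `n_fields` are not from the original answer.

```r
# Write a shortened sample (first three 20-character values of the
# question's data) to a temporary file so the sketch is self-contained.
path <- tempfile(fileext = ".txt")
writeLines("000341656.0000000000000000004.6000000000000000009.0000000000", path)

width <- 20                               # every field is 20 characters wide
first_line <- readLines(path, n = 1)      # inspect the first line only
n_fields <- nchar(first_line) %/% width   # number of complete fields on that line

# Pass the derived widths to read.fwf instead of a fixed rep(20, 30).
values <- read.fwf(path, widths = rep(width, n_fields))
print(values)
```

This assumes every line in the file has the same number of fields as the first one; a file with lines of differing lengths would need to be handled line by line.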

Answer 1: (score: 0)

library(stringi)
library(magrittr)  # provides the %>% pipe used below

stri_match_all_regex(
  "000341656.0000000000000000004.6000000000000000009.0000000000000000050.9566787004000000052.0000000000000000072.8621215573000000007.0000000000000000050.0361010830000000047.2490974729000000054.5560183531000000006.0000000000000000049.9711191336000000047.0397111913000000043.1488475260000000023.0000000000000000046.6281588448000000040.1516245487000000038.4653540241000000002.0000000000000000046.2129963899000000041.9963898917000000037.3850068798000000030.0000000000000000046.0144404332000000040.0324909747000000027.0930952140000000003.0000000000000000043.3971119134000000032.4801444043000000010.4757238771",
  ".{20}"
) %>% 
  unlist() %>% 
  as.numeric()
##  [1] 341656.00000      4.60000      9.00000     50.95668     52.00000
##  [6]     72.86212      7.00000     50.03610     47.24910     54.55602
## [11]      6.00000     49.97112     47.03971     43.14885     23.00000
## [16]     46.62816     40.15162     38.46535      2.00000     46.21300
## [21]     41.99639     37.38501     30.00000     46.01444     40.03249
## [26]     27.09310      3.00000     43.39711     32.48014     10.47572

Also:

as.numeric(readChar("~/Data/20.txt", rep(20, file.size("~/Data/20.txt")/20)))
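For comparison, the same 20-character split can be sketched in base R with `substring`, without any extra packages. This is not part of the original answers; it assumes the line is already available as a string (for example from `readLines`), and it uses a shortened three-value sample of the question's data.

```r
# Shortened sample: the first three 20-character values from the question.
line <- "000341656.0000000000000000004.6000000000000000009.0000000000"

width <- 20
# Start position of each field: 1, 21, 41, ...
starts <- seq(1, nchar(line), by = width)

# Cut the string into fixed-width pieces and convert them to numbers;
# as.numeric drops the leading and trailing zeros.
values <- as.numeric(substring(line, starts, starts + width - 1))
print(values)
```

Because the start positions are computed from `nchar(line)`, this works for any number of values per line as long as each one is exactly 20 characters wide.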