Reading a dataset into a well-formatted table with a pre-specified number of columns

Time: 2018-04-07 11:34:44

Tags: r read.table

I have a txt file like the one shown below.

I want to generate a table in R with the same structure as the txt file, with its evident 16 columns.

The contents of the file:

0003    MPARTNER  SALZ          S                           150112 22:30:45  160304 08:38:13  2      BUY                          2  BUY                  12380    165426  150109 08:00:00
0003    SPROTTSE  HUGHES        S                           140407 02:30:50  141120 13:55:06  2      BUY                          2  BUY                   3764     57379  140401 10:05:00
0003    SPROTTSE  HUGHES        S                           141223 09:06:13  160715 08:42:56  3      MARKETPERFORM                3  HOLD                  3764     57379  141223 08:02:00
001V    MPARTNER  PEARLSTEIN    D                           140821 02:44:05  150312 09:17:13  2      BUY                          2  BUY                  12380    163717  140820 08:16:00
001V    MPARTNER  PEARLSTEIN    D                           151016 15:07:40  160411 08:40:35  2      BUY                          2  BUY                  12380    163717  151009 08:12:00
001W    CANACCOR                K                           140321 04:06:40  140609 23:06:44         SPECULATIVE BUY              1  STRONG BUY             406    150412  140319 23:19:00
001W    CANACCOR  WRIGHT        K                           140714 12:47:31  160228 22:57:45         BUY                          1  STRONG BUY             406    150412  140714 12:38:00
001W    CLARUS    OFIR          E                           140515 11:40:00  150515 09:27:09         SPECULATIVE BUY              1  STRONG BUY             202    115944  140515 11:40:00
001W    CLARUS    MACKAY        D                           150813 09:40:45  160812 09:40:02         BUY                          1  STRONG BUY             202     73763  150813 09:23:00
001W    DEACON    OFIR          E                           150119 22:03:46  170328 06:45:14  1      BUY                          1  STRONG BUY             704    115944  150112 07:24:00
001W    DEACON    OFIR          E                           171115 06:48:47  171115 06:48:47  1      BUY                          1  STRONG BUY             704    115944  171115 06:42:00
@70L    MORGAN    MARTINEZ      J                           100226 07:12:51  100708 04:51:16  8      EQUALWT/NO RATING            3  HOLD                  1595     56947  100226 07:12:00
@70L    MORGAN    MARTINEZ DE O J                           100708 05:09:02  100910 00:48:28  6      EQUALWT/IN-LINE              3  HOLD                  1595     56947  100708 03:14:00
@70L    MORGAN    MARTINEZ DE O J                           100910 21:16:07  101110 21:55:52  2      OVERWT/IN-LINE               2  BUY                   1595     56947  100910 19:18:00
@70L    MORGAN    OLCOZ CERDAN  J                           101112 01:32:41  120618 21:04:56  2      OVERWT/IN-LINE               2  BUY                   1595     56947  101111 20:03:00
@70L    MORGAN    OLCOZ CERDAN  J                           120712 03:19:26  131216 19:49:59  6      EQUALWT/IN-LINE              3  HOLD                  1595     56947  120711 19:20:00
@70L    MORGAN    OLCOZ CERDAN  J                           140226 22:20:19  150417 13:07:31  2      OVERWT/IN-LINE               2  BUY                   1595     56947  140226 22:20:00
@70L    MORGAN                  J                           150608 01:25:35  171106 00:16:05  1      OVERWT/ATTRACTIVE            2  BUY                   1595     56947  150608 01:25:00

I tried to read it in R with code, but I got a table with a strange structure.

As mentioned above, I want to get a table with 16 columns, structured as in the txt file. Even empty fields (e.g. in row 6) should be retained.

For example, for row 6: [image]

Can you help me solve this problem? Thank you very much.

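1 Answer:

Answer 0: (score: 1)

One option is to use read.fwf, which splits each line at fixed character widths rather than at whitespace, so blank fields keep their column: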

df <- read.fwf("tst.txt", widths = c(8, 10, 14, 28, 7, 10, 7, 10, 7, 29, 3,
     21, 9, 8, 7, 8), header = FALSE)

# Next, remove the leading/trailing whitespace from the text fields.
# They are factors in R < 4.0 and character vectors in R >= 4.0, so handle both.
library(dplyr)
df <- df %>% mutate_if(function(x) is.factor(x) || is.character(x),
                       function(x) trimws(as.character(x)))
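As a side note, the widths above can be sanity-checked before reading the file. The following is only a sketch: it rebuilds row 6 of the sample by padding each field to its width (the helper `pad` and the variable names are made up for illustration), then cuts the line again at the positions implied by the widths.

```r
# Widths as used in the read.fwf call above.
widths <- c(8, 10, 14, 28, 7, 10, 7, 10, 7, 29, 3, 21, 9, 8, 7, 8)

# Start and end character positions of each of the 16 fields.
ends   <- cumsum(widths)
starts <- ends - widths + 1

# Row 6 of the sample, field by field; the two empty strings are the blank fields.
fields <- c("001W", "CANACCOR", "", "K", "140321", "04:06:40", "140609",
            "23:06:44", "", "SPECULATIVE BUY", "1", "STRONG BUY", "406",
            "150412", "140319", "23:19:00")

# Hypothetical helper: right-pad each field with spaces to its width.
pad  <- function(x, w) paste0(x, strrep(" ", w - nchar(x)))
line <- paste0(pad(fields, widths), collapse = "")

# Cutting at the fixed positions recovers all 16 fields, blanks included.
pieces <- trimws(substring(line, starts, ends))
length(pieces)            # 16
nchar(line) == sum(widths)
```

If `nchar(readLines("tst.txt"))` reports line lengths that differ from `sum(widths)`, the widths probably need adjusting for that file.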

After the read.fwf call and the trimming step, the data frame looks like:

df
#      V1       V2            V3 V4     V5       V6     V7       V8 V9               V10 V11        V12   V13    V14    V15      V16
# 1  0003 MPARTNER          SALZ  S 150112 22:30:45 160304 08:38:13  2               BUY   2        BUY 12380 165426 150109 08:00:00
# 2  0003 SPROTTSE        HUGHES  S 140407 02:30:50 141120 13:55:06  2               BUY   2        BUY  3764  57379 140401 10:05:00
# 3  0003 SPROTTSE        HUGHES  S 141223 09:06:13 160715 08:42:56  3     MARKETPERFORM   3       HOLD  3764  57379 141223 08:02:00
# 4  001V MPARTNER    PEARLSTEIN  D 140821 02:44:05 150312 09:17:13  2               BUY   2        BUY 12380 163717 140820 08:16:00
# 5  001V MPARTNER    PEARLSTEIN  D 151016 15:07:40 160411 08:40:35  2               BUY   2        BUY 12380 163717 151009 08:12:00
# 6  001W CANACCOR                K 140321 04:06:40 140609 23:06:44 NA   SPECULATIVE BUY   1 STRONG BUY   406 150412 140319 23:19:00
# 7  001W CANACCOR        WRIGHT  K 140714 12:47:31 160228 22:57:45 NA               BUY   1 STRONG BUY   406 150412 140714 12:38:00
# 8  001W   CLARUS          OFIR  E 140515 11:40:00 150515 09:27:09 NA   SPECULATIVE BUY   1 STRONG BUY   202 115944 140515 11:40:00
# 9  001W   CLARUS        MACKAY  D 150813 09:40:45 160812 09:40:02 NA               BUY   1 STRONG BUY   202  73763 150813 09:23:00
# 10 001W   DEACON          OFIR  E 150119 22:03:46 170328 06:45:14  1               BUY   1 STRONG BUY   704 115944 150112 07:24:00
# 11 001W   DEACON          OFIR  E 171115 06:48:47 171115 06:48:47  1               BUY   1 STRONG BUY   704 115944 171115 06:42:00
# 12 @70L   MORGAN      MARTINEZ  J 100226 07:12:51 100708 04:51:16  8 EQUALWT/NO RATING   3       HOLD  1595  56947 100226 07:12:00
# 13 @70L   MORGAN MARTINEZ DE O  J 100708 05:09:02 100910 00:48:28  6   EQUALWT/IN-LINE   3       HOLD  1595  56947 100708 03:14:00
# 14 @70L   MORGAN MARTINEZ DE O  J 100910 21:16:07 101110 21:55:52  2    OVERWT/IN-LINE   2        BUY  1595  56947 100910 19:18:00
# 15 @70L   MORGAN  OLCOZ CERDAN  J 101112 01:32:41 120618 21:04:56  2    OVERWT/IN-LINE   2        BUY  1595  56947 101111 20:03:00
# 16 @70L   MORGAN  OLCOZ CERDAN  J 120712 03:19:26 131216 19:49:59  6   EQUALWT/IN-LINE   3       HOLD  1595  56947 120711 19:20:00
# 17 @70L   MORGAN  OLCOZ CERDAN  J 140226 22:20:19 150417 13:07:31  2    OVERWT/IN-LINE   2        BUY  1595  56947 140226 22:20:00
# 18 @70L   MORGAN                J 150608 01:25:35 171106 00:16:05  1 OVERWT/ATTRACTIVE   2        BUY  1595  56947 150608 01:25:00

The data.frame above has 16 columns and 18 rows.
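To illustrate why whitespace-delimited reading can produce a strangely structured table on this kind of file, here is a small self-contained sketch. It uses a made-up 4-column excerpt of the layout (widths 8, 10, 14, 1), written to a temporary file; the excerpt and the helper `pad` are assumptions for the demo, not part of the original data set.

```r
# Hypothetical 4-column excerpt; rows are padded so the spacing is exact.
widths <- c(8, 10, 14, 1)
rows <- list(
  c("0003", "MPARTNER", "SALZ",          "S"),
  c("001W", "CANACCOR", "",              "K"),   # blank third field
  c("@70L", "MORGAN",   "MARTINEZ DE O", "J")    # spaces inside a field
)
pad   <- function(x, w) paste0(x, strrep(" ", w - nchar(x)))
lines <- vapply(rows, function(r) paste0(pad(r, widths), collapse = ""),
                character(1))

tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Splitting on whitespace finds a different number of fields on each line.
n_fields <- count.fields(tmp)
n_fields   # 4 3 6

# Splitting at fixed widths keeps every row at 4 columns.
demo <- read.fwf(tmp, widths = widths, header = FALSE,
                 colClasses = "character")
demo[] <- lapply(demo, trimws)   # drop the padding around each value
dim(demo)  # 3 4
```

Reading every column as character (`colClasses = "character"`) is a choice made for the demo so that values such as `0003` keep their leading zeros; the answer above lets R guess the column types instead.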