Data with two delimiters in R

Date: 2016-01-20 22:58:26

Tags: r dataframe

I have a text file that uses more than one delimiter. Here is a sample of the data:

12 ->3 4 5
14->2 1
1->3 5 6

I would like to know whether there is a simple way to get the data into the following format:

12 3
12 4
12 5
14 2
14 1
 1 3
 1 5
 1 6

2 Answers:

Answer 0: (score: 8)

I tried to reproduce your situation with cat, and I hope it matches what you actually have. Let's say this is your file:

cat("12 ->3 4 5
     14->2 1
     1->3 5 6", 
    file = "test.txt")

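Using data.table, I read it in quickly by specifying a separator that does not occur in the file (a comma), so the result is a single-column data set: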

library(data.table)
dt <- fread("test.txt", 
            sep = ",", 
            header = FALSE)

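The next step is a double split: first separate the numbers on either side of the arrow (->), then split the right-hand side on whitespace, grouped by the left-hand value: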

dt[, tstrsplit(V1, "\\s*->\\s*", type.convert = TRUE)
   ][, strsplit(V2, "\\s+"), by = .(indx = V1)]
#    indx V1
# 1:   12  3
# 2:   12  4
# 3:   12  5
# 4:   14  2
# 5:   14  1
# 6:    1  3
# 7:    1  5
# 8:    1  6
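
To make the two steps easier to follow, here is a minimal sketch that runs them separately and converts the split values to integers (in the chained call above, the second column comes back as character). It assumes the data.table package is installed; the names step1, step2 and value are illustrative, and the sample is built in memory rather than read from test.txt.

```r
library(data.table)

# Same sample as in the question, one string per row
dt <- data.table(V1 = c("12 ->3 4 5", "14->2 1", "1->3 5 6"))

# Step 1: split at the arrow, allowing optional whitespace around it.
# type.convert = TRUE turns the left-hand side into integers;
# the right-hand side ("3 4 5") stays character.
step1 <- dt[, tstrsplit(V1, "\\s*->\\s*", type.convert = TRUE)]

# Step 2: within each left-hand value, split the right-hand side on
# whitespace and convert the pieces to integers.
step2 <- step1[, .(value = as.integer(unlist(strsplit(V2, "\\s+")))),
               by = .(indx = V1)]
step2
```

Printing step1 before running the second step is a quick way to confirm that the arrow pattern matched every line.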

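Answer 1: (score: 4)

The textConnection function simulates reading from a file, so the sample lines can be parsed without writing them to disk.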

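
A minimal base R sketch of the textConnection idea follows. It is not the answerer's own code, only one way the approach might look: the variable names (txt, parts, key, vals, result) are illustrative, and the sample lines are assumed to match the question exactly.

```r
# Sample lines from the question, held in memory instead of a file
txt <- c("12 ->3 4 5", "14->2 1", "1->3 5 6")

# textConnection lets readLines treat the character vector like a file
con <- textConnection(txt)
lines <- readLines(con)
close(con)

# Split each line at the arrow, allowing optional surrounding whitespace
parts <- strsplit(trimws(lines), "\\s*->\\s*")
key <- as.integer(sapply(parts, `[`, 1))
vals <- strsplit(sapply(parts, `[`, 2), "\\s+")

# Repeat each key once per value found on its right-hand side
result <- data.frame(
  key = rep(key, lengths(vals)),
  value = as.integer(unlist(vals))
)
result
```

For a real file, replacing the textConnection call with file("test.txt") should leave the rest of the sketch unchanged.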