List of data frames: how to edit a specific column in a specific data frame?

Date: 2019-07-07 11:28:25

Tags: r dplyr purrr

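I've been trying to find a sensible way to edit a data frame when it is part of a list of data frames.

The tricky part is that I don't want to hard-code positions within list -> dataframe -> column. Instead, I want to dig my way in programmatically, edit a specific column, and make the change permanent (i.e., update/reassign).

An example of my real problem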


## I have data collected from ~100 participants in an experiment.
## Each participant's data is organized within a single .txt file.
## All files have the same structure, i.e., same variables.
## I want to:
##  1. Load all .txt files as elements in one R list object.
##  2. Check for (and consequently correct) duplicate participant IDs and
##     missing IDs (figure out why some are missing), plus other housekeeping 
##     tasks that are closely related to the .txt files 
##     (not editing the raw files in the directory, only their representation as list elements).
##  3. Convert the list into a dataframe.
##  4. Some more housekeeping on a single-dataframe level (e.g., aggregation,
##     computing new variables, etc.)
##  5. Analyze data.


## data as list of dataframes (only 6 participants to keep it minimal)

data_list <- 
list(`Task_II_Final_Emmanuel_5may2019-94-1.txt` = structure(list(
    Eprime.Level = c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
    2, 2, 2, 2, 2, 2, 2), Eprime.LevelName = c("Header_", "trainingList_13", 
    "trainingList_12", "trainingList_17", "trainingList_15", 
    "trainingList_14", "trainingList_6", "trainingList_1", "trainingList_11", 
    "trainingList_19", "trainingList_20", "trainingList_8", "trainingList_7", 
    "trainingList_16", "trainingList_4", "trainingList_5", "trainingList_2", 
    "trainingList_18", "trainingList_9", "trainingList_3", "trainingList_10"
    ), Eprime.Basename = c("Task_II_Final_Emmanuel_5may2019-94-1", 
    "Task_II_Final_Emmanuel_5may2019-94-1", "Task_II_Final_Emmanuel_5may2019-94-1", 
    "Task_II_Final_Emmanuel_5may2019-94-1", "Task_II_Final_Emmanuel_5may2019-94-1", 
    "Task_II_Final_Emmanuel_5may2019-94-1", "Task_II_Final_Emmanuel_5may2019-94-1", 
    "Task_II_Final_Emmanuel_5may2019-94-1", "Task_II_Final_Emmanuel_5may2019-94-1", 
    "Task_II_Final_Emmanuel_5may2019-94-1", "Task_II_Final_Emmanuel_5may2019-94-1", 
    "Task_II_Final_Emmanuel_5may2019-94-1", "Task_II_Final_Emmanuel_5may2019-94-1", 
    "Task_II_Final_Emmanuel_5may2019-94-1", "Task_II_Final_Emmanuel_5may2019-94-1", 
    "Task_II_Final_Emmanuel_5may2019-94-1", "Task_II_Final_Emmanuel_5may2019-94-1", 
    "Task_II_Final_Emmanuel_5may2019-94-1", "Task_II_Final_Emmanuel_5may2019-94-1", 
    "Task_II_Final_Emmanuel_5may2019-94-1", "Task_II_Final_Emmanuel_5may2019-94-1"
    ), Eprime.FrameNumber = c("1", "2", "3", "4", "5", "6", "7", 
    "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", 
    "18", "19", "20", "21"), Procedure = c("Header", "trainProc", 
    "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
    "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
    "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
    "trainProc", "trainProc", "trainProc", "trainProc"), Running = c("Header", 
    "trainingList", "trainingList", "trainingList", "trainingList", 
    "trainingList", "trainingList", "trainingList", "trainingList", 
    "trainingList", "trainingList", "trainingList", "trainingList", 
    "trainingList", "trainingList", "trainingList", "trainingList", 
    "trainingList", "trainingList", "trainingList", "trainingList"
    ), VersionPersist = c("1", NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), LevelName = c("LogLevel10", 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA), Experiment = c("Task_II_Final_Emmanuel_5may2019", 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA), SessionDate = c("06-16-2019", NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA), SessionTime = c("16:50:30", NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA), SessionStartDateTimeUtc = c("6/16/2019 1:50:30 PM", 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA), Subject = c("94", NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
    ), Session = c("1", NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Sex = c("female", 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA), DataFile.Basename = c("Task_II_Final_Emmanuel_5may2019-94-1", 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA), RandomSeed = c("708214787", NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA), Group = c("1", NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Display.RefreshRate = c("60.000", 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA), ID = c(NA, "13", "12", "17", "15", "14", 
    "6", "1", "11", "19", "20", "8", "7", "16", "4", "5", "2", 
    "18", "9", "3", "10")), row.names = c(NA, 21L), class = "data.frame"), 
    `Task_II_Final_Emmanuel_5may2019-95-1.txt` = structure(list(
        Eprime.Level = c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
        2, 2, 2, 2, 2, 2, 2, 2, 2), Eprime.LevelName = c("Header_", 
        "trainingList_13", "trainingList_11", "trainingList_4", 
        "trainingList_10", "trainingList_8", "trainingList_15", 
        "trainingList_16", "trainingList_6", "trainingList_9", 
        "trainingList_12", "trainingList_14", "trainingList_3", 
        "trainingList_19", "trainingList_7", "trainingList_1", 
        "trainingList_2", "trainingList_20", "trainingList_18", 
        "trainingList_17", "trainingList_5"), Eprime.Basename = c("Task_II_Final_Emmanuel_5may2019-95-1", 
        "Task_II_Final_Emmanuel_5may2019-95-1", "Task_II_Final_Emmanuel_5may2019-95-1", 
        "Task_II_Final_Emmanuel_5may2019-95-1", "Task_II_Final_Emmanuel_5may2019-95-1", 
        "Task_II_Final_Emmanuel_5may2019-95-1", "Task_II_Final_Emmanuel_5may2019-95-1", 
        "Task_II_Final_Emmanuel_5may2019-95-1", "Task_II_Final_Emmanuel_5may2019-95-1", 
        "Task_II_Final_Emmanuel_5may2019-95-1", "Task_II_Final_Emmanuel_5may2019-95-1", 
        "Task_II_Final_Emmanuel_5may2019-95-1", "Task_II_Final_Emmanuel_5may2019-95-1", 
        "Task_II_Final_Emmanuel_5may2019-95-1", "Task_II_Final_Emmanuel_5may2019-95-1", 
        "Task_II_Final_Emmanuel_5may2019-95-1", "Task_II_Final_Emmanuel_5may2019-95-1", 
        "Task_II_Final_Emmanuel_5may2019-95-1", "Task_II_Final_Emmanuel_5may2019-95-1", 
        "Task_II_Final_Emmanuel_5may2019-95-1", "Task_II_Final_Emmanuel_5may2019-95-1"
        ), Eprime.FrameNumber = c("1", "2", "3", "4", "5", "6", 
        "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", 
        "17", "18", "19", "20", "21"), Procedure = c("Header", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc"
        ), Running = c("Header", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList"), VersionPersist = c("1", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), LevelName = c("LogLevel10", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Experiment = c("Task_II_Final_Emmanuel_5may2019", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionDate = c("06-18-2019", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionTime = c("15:19:50", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionStartDateTimeUtc = c("18/06/2019 12:19:50", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Subject = c("95", NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA), Session = c("1", NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
        ), Sex = c("female", NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), DataFile.Basename = c("Task_II_Final_Emmanuel_5may2019-95-1", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), RandomSeed = c("-2031275760", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Group = c("1", NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA), Display.RefreshRate = c("60.000", NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA), ID = c(NA, "13", "11", "4", "10", "8", "15", 
        "16", "6", "9", "12", "14", "3", "19", "7", "1", "2", 
        "20", "18", "17", "5")), row.names = c(NA, 21L), class = "data.frame"), 
    `Task_II_Final_Emmanuel_5may2019-96-1.txt` = structure(list(
        Eprime.Level = c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
        2, 2, 2, 2, 2, 2, 2, 2, 2), Eprime.LevelName = c("Header_", 
        "trainingList_20", "trainingList_11", "trainingList_7", 
        "trainingList_2", "trainingList_13", "trainingList_5", 
        "trainingList_19", "trainingList_1", "trainingList_8", 
        "trainingList_16", "trainingList_18", "trainingList_12", 
        "trainingList_3", "trainingList_15", "trainingList_6", 
        "trainingList_17", "trainingList_4", "trainingList_14", 
        "trainingList_9", "trainingList_10"), Eprime.Basename = c("Task_II_Final_Emmanuel_5may2019-96-1", 
        "Task_II_Final_Emmanuel_5may2019-96-1", "Task_II_Final_Emmanuel_5may2019-96-1", 
        "Task_II_Final_Emmanuel_5may2019-96-1", "Task_II_Final_Emmanuel_5may2019-96-1", 
        "Task_II_Final_Emmanuel_5may2019-96-1", "Task_II_Final_Emmanuel_5may2019-96-1", 
        "Task_II_Final_Emmanuel_5may2019-96-1", "Task_II_Final_Emmanuel_5may2019-96-1", 
        "Task_II_Final_Emmanuel_5may2019-96-1", "Task_II_Final_Emmanuel_5may2019-96-1", 
        "Task_II_Final_Emmanuel_5may2019-96-1", "Task_II_Final_Emmanuel_5may2019-96-1", 
        "Task_II_Final_Emmanuel_5may2019-96-1", "Task_II_Final_Emmanuel_5may2019-96-1", 
        "Task_II_Final_Emmanuel_5may2019-96-1", "Task_II_Final_Emmanuel_5may2019-96-1", 
        "Task_II_Final_Emmanuel_5may2019-96-1", "Task_II_Final_Emmanuel_5may2019-96-1", 
        "Task_II_Final_Emmanuel_5may2019-96-1", "Task_II_Final_Emmanuel_5may2019-96-1"
        ), Eprime.FrameNumber = c("1", "2", "3", "4", "5", "6", 
        "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", 
        "17", "18", "19", "20", "21"), Procedure = c("Header", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc"
        ), Running = c("Header", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList"), VersionPersist = c("1", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), LevelName = c("LogLevel10", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Experiment = c("Task_II_Final_Emmanuel_5may2019", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionDate = c("06-18-2019", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionTime = c("16:11:39", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionStartDateTimeUtc = c("6/18/2019 1:11:39 PM", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Subject = c("96", NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA), Session = c("1", NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
        ), Sex = c("female", NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), DataFile.Basename = c("Task_II_Final_Emmanuel_5may2019-96-1", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), RandomSeed = c("426136076", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Group = c("1", NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA), Display.RefreshRate = c("60.001", NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA), ID = c(NA, "20", "11", "7", "2", "13", "5", 
        "19", "1", "8", "16", "18", "12", "3", "15", "6", "17", 
        "4", "14", "9", "10")), row.names = c(NA, 21L), class = "data.frame"), 
    `Task_II_Final_Emmanuel_5may2019-98-1.txt` = structure(list(
        Eprime.Level = c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
        2, 2, 2, 2, 2, 2, 2, 2, 2), Eprime.LevelName = c("Header_", 
        "trainingList_9", "trainingList_3", "trainingList_5", 
        "trainingList_8", "trainingList_14", "trainingList_7", 
        "trainingList_20", "trainingList_16", "trainingList_6", 
        "trainingList_10", "trainingList_13", "trainingList_11", 
        "trainingList_12", "trainingList_18", "trainingList_1", 
        "trainingList_17", "trainingList_2", "trainingList_4", 
        "trainingList_15", "trainingList_19"), Eprime.Basename = c("Task_II_Final_Emmanuel_5may2019-98-1", 
        "Task_II_Final_Emmanuel_5may2019-98-1", "Task_II_Final_Emmanuel_5may2019-98-1", 
        "Task_II_Final_Emmanuel_5may2019-98-1", "Task_II_Final_Emmanuel_5may2019-98-1", 
        "Task_II_Final_Emmanuel_5may2019-98-1", "Task_II_Final_Emmanuel_5may2019-98-1", 
        "Task_II_Final_Emmanuel_5may2019-98-1", "Task_II_Final_Emmanuel_5may2019-98-1", 
        "Task_II_Final_Emmanuel_5may2019-98-1", "Task_II_Final_Emmanuel_5may2019-98-1", 
        "Task_II_Final_Emmanuel_5may2019-98-1", "Task_II_Final_Emmanuel_5may2019-98-1", 
        "Task_II_Final_Emmanuel_5may2019-98-1", "Task_II_Final_Emmanuel_5may2019-98-1", 
        "Task_II_Final_Emmanuel_5may2019-98-1", "Task_II_Final_Emmanuel_5may2019-98-1", 
        "Task_II_Final_Emmanuel_5may2019-98-1", "Task_II_Final_Emmanuel_5may2019-98-1", 
        "Task_II_Final_Emmanuel_5may2019-98-1", "Task_II_Final_Emmanuel_5may2019-98-1"
        ), Eprime.FrameNumber = c("1", "2", "3", "4", "5", "6", 
        "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", 
        "17", "18", "19", "20", "21"), Procedure = c("Header", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc"
        ), Running = c("Header", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList"), VersionPersist = c("1", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), LevelName = c("LogLevel10", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Experiment = c("Task_II_Final_Emmanuel_5may2019", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionDate = c("06-19-2019", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionTime = c("10:12:12", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionStartDateTimeUtc = c("19/06/2019 7:12:12", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Subject = c("98", NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA), Session = c("1", NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
        ), Sex = c("female", NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), DataFile.Basename = c("Task_II_Final_Emmanuel_5may2019-98-1", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), RandomSeed = c("-213300967", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Group = c("1", NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA), Display.RefreshRate = c("60.000", NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA), ID = c(NA, "9", "3", "5", "8", "14", "7", 
        "20", "16", "6", "10", "13", "11", "12", "18", "1", "17", 
        "2", "4", "15", "19")), row.names = c(NA, 21L), class = "data.frame"), 
    `Task_II_Final_Emmanuel_5may2019-98-1 (2).txt` = structure(list(
        Eprime.Level = c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
        2, 2, 2, 2, 2, 2, 2, 2, 2), Eprime.LevelName = c("Header_", 
        "trainingList_19", "trainingList_5", "trainingList_8", 
        "trainingList_3", "trainingList_4", "trainingList_14", 
        "trainingList_1", "trainingList_2", "trainingList_9", 
        "trainingList_6", "trainingList_13", "trainingList_20", 
        "trainingList_11", "trainingList_10", "trainingList_7", 
        "trainingList_12", "trainingList_15", "trainingList_17", 
        "trainingList_16", "trainingList_18"), Eprime.Basename = c("Task_II_Final_Emmanuel_5may2019-98-1 (2)", 
        "Task_II_Final_Emmanuel_5may2019-98-1 (2)", "Task_II_Final_Emmanuel_5may2019-98-1 (2)", 
        "Task_II_Final_Emmanuel_5may2019-98-1 (2)", "Task_II_Final_Emmanuel_5may2019-98-1 (2)", 
        "Task_II_Final_Emmanuel_5may2019-98-1 (2)", "Task_II_Final_Emmanuel_5may2019-98-1 (2)", 
        "Task_II_Final_Emmanuel_5may2019-98-1 (2)", "Task_II_Final_Emmanuel_5may2019-98-1 (2)", 
        "Task_II_Final_Emmanuel_5may2019-98-1 (2)", "Task_II_Final_Emmanuel_5may2019-98-1 (2)", 
        "Task_II_Final_Emmanuel_5may2019-98-1 (2)", "Task_II_Final_Emmanuel_5may2019-98-1 (2)", 
        "Task_II_Final_Emmanuel_5may2019-98-1 (2)", "Task_II_Final_Emmanuel_5may2019-98-1 (2)", 
        "Task_II_Final_Emmanuel_5may2019-98-1 (2)", "Task_II_Final_Emmanuel_5may2019-98-1 (2)", 
        "Task_II_Final_Emmanuel_5may2019-98-1 (2)", "Task_II_Final_Emmanuel_5may2019-98-1 (2)", 
        "Task_II_Final_Emmanuel_5may2019-98-1 (2)", "Task_II_Final_Emmanuel_5may2019-98-1 (2)"
        ), Eprime.FrameNumber = c("1", "2", "3", "4", "5", "6", 
        "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", 
        "17", "18", "19", "20", "21"), Procedure = c("Header", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc"
        ), Running = c("Header", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList"), VersionPersist = c("1", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), LevelName = c("LogLevel10", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Experiment = c("Task_II_Final_Emmanuel_5may2019", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionDate = c("06-20-2019", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionTime = c("12:24:02", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionStartDateTimeUtc = c("20/06/2019 9:24:02", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Subject = c("98", NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA), Session = c("1", NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
        ), Sex = c("female", NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), DataFile.Basename = c("Task_II_Final_Emmanuel_5may2019-98-1", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), RandomSeed = c("1709662965", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Group = c("1", NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA), Display.RefreshRate = c("60.000", NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA), ID = c(NA, "19", "5", "8", "3", "4", "14", 
        "1", "2", "9", "6", "13", "20", "11", "10", "7", "12", 
        "15", "17", "16", "18")), row.names = c(NA, 21L), class = "data.frame"), 
    `Task_II_Final_Emmanuel_5may2019-99-1.txt` = structure(list(
        Eprime.Level = c(1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 
        2, 2, 2, 2, 2, 2, 2, 2, 2), Eprime.LevelName = c("Header_", 
        "trainingList_16", "trainingList_3", "trainingList_20", 
        "trainingList_17", "trainingList_12", "trainingList_8", 
        "trainingList_1", "trainingList_15", "trainingList_4", 
        "trainingList_11", "trainingList_13", "trainingList_14", 
        "trainingList_2", "trainingList_5", "trainingList_10", 
        "trainingList_7", "trainingList_19", "trainingList_9", 
        "trainingList_18", "trainingList_6"), Eprime.Basename = c("Task_II_Final_Emmanuel_5may2019-99-1", 
        "Task_II_Final_Emmanuel_5may2019-99-1", "Task_II_Final_Emmanuel_5may2019-99-1", 
        "Task_II_Final_Emmanuel_5may2019-99-1", "Task_II_Final_Emmanuel_5may2019-99-1", 
        "Task_II_Final_Emmanuel_5may2019-99-1", "Task_II_Final_Emmanuel_5may2019-99-1", 
        "Task_II_Final_Emmanuel_5may2019-99-1", "Task_II_Final_Emmanuel_5may2019-99-1", 
        "Task_II_Final_Emmanuel_5may2019-99-1", "Task_II_Final_Emmanuel_5may2019-99-1", 
        "Task_II_Final_Emmanuel_5may2019-99-1", "Task_II_Final_Emmanuel_5may2019-99-1", 
        "Task_II_Final_Emmanuel_5may2019-99-1", "Task_II_Final_Emmanuel_5may2019-99-1", 
        "Task_II_Final_Emmanuel_5may2019-99-1", "Task_II_Final_Emmanuel_5may2019-99-1", 
        "Task_II_Final_Emmanuel_5may2019-99-1", "Task_II_Final_Emmanuel_5may2019-99-1", 
        "Task_II_Final_Emmanuel_5may2019-99-1", "Task_II_Final_Emmanuel_5may2019-99-1"
        ), Eprime.FrameNumber = c("1", "2", "3", "4", "5", "6", 
        "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", 
        "17", "18", "19", "20", "21"), Procedure = c("Header", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc", 
        "trainProc", "trainProc", "trainProc", "trainProc", "trainProc"
        ), Running = c("Header", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList", "trainingList", "trainingList", 
        "trainingList", "trainingList"), VersionPersist = c("1", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), LevelName = c("LogLevel10", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Experiment = c("Task_II_Final_Emmanuel_5may2019", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionDate = c("06-20-2019", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionTime = c("13:01:11", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), SessionStartDateTimeUtc = c("20/06/2019 10:01:11", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Subject = c("99", NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA), Session = c("1", NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
        ), Sex = c("male", NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), DataFile.Basename = c("Task_II_Final_Emmanuel_5may2019-99-1", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), RandomSeed = c("1953053728", 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA), Group = c("1", NA, NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA), Display.RefreshRate = c("60.000", NA, NA, NA, 
        NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
        NA, NA, NA), ID = c(NA, "16", "3", "20", "17", "12", 
        "8", "1", "15", "4", "11", "13", "14", "2", "5", "10", 
        "7", "19", "9", "18", "6")), row.names = c(NA, 21L), class = "data.frame"))

My attempt at solving the problem

## I'm concerned about duplicates, so I want to extract all list items
## with their name and a timestamp so I could double check duplicates
## against my notes

library("dplyr")
library("purrr")

w_timestamps <-
  data_list %>%
    sapply(function(x) x[11][[1]]) %>% ## column 11 in each dataframe 
                                       ## has the session start time
    .[1,]

> w_timestamps
    Task_II_Final_Emmanuel_5may2019-94-1.txt     Task_II_Final_Emmanuel_5may2019-95-1.txt     Task_II_Final_Emmanuel_5may2019-96-1.txt 
                                  "16:50:30"                                   "15:19:50"                                   "16:11:39" 
    Task_II_Final_Emmanuel_5may2019-98-1.txt Task_II_Final_Emmanuel_5may2019-98-1 (2).txt     Task_II_Final_Emmanuel_5may2019-99-1.txt 
                                  "10:12:12"                                   "12:24:02"                                   "13:01:11" 
> 
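
Since the goal is to avoid hard-coded positions, the same lookup can be written against column names instead of column 11. The following is only a minimal sketch on a toy stand-in for data_list: the helper make_df(), the file names a.txt/b.txt/c.txt and all values are made up for illustration. It pulls SessionTime and Subject from the header row of each data frame and flags Subject values that occur more than once.

```r
# Toy stand-in for data_list (hypothetical names and values):
# the header row comes first, followed by trial rows.
make_df <- function(subject, time) {
  data.frame(
    SessionTime = c(time, NA, NA),
    Subject     = c(subject, NA, NA),
    ID          = c(NA, "1", "2"),
    stringsAsFactors = FALSE
  )
}
data_list <- list(
  "a.txt" = make_df("98", "10:12:12"),
  "b.txt" = make_df("98", "12:24:02"),
  "c.txt" = make_df("99", "13:01:11")
)

# Select columns by name; vapply() keeps the list names and checks the type.
w_timestamps <- vapply(data_list, function(df) df[["SessionTime"]][1], character(1))
subjects     <- vapply(data_list, function(df) df[["Subject"]][1], character(1))

# Names of the list elements whose Subject value appears more than once.
is_dup    <- duplicated(subjects) | duplicated(subjects, fromLast = TRUE)
dup_files <- names(subjects)[is_dup]

# Timestamps of the suspicious files, to compare against the notes.
print(w_timestamps[dup_files])
```

Because both vectors are named by list element, a suspicious entry can be carried forward as a name rather than as a position.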

## My notes are telling me that the 4th participant (w_timestamps[4]) was 
## in reality subject ID 97 and not 98 as shown here. 
## So I want to access its respective dataframe
## within the list (data_list), go to column "Subject", and replace the value
## from 98 to 97, and UPDATE that column in that specific dataframe.


## Starting a new pipe
data_list %>%
   {names(w_timestamps[4])}

[1] "Task_II_Final_Emmanuel_5may2019-98-1.txt" ## this isn't what I want.
                                               ## I want to use the variable
                                               ## w_timestamps[4], which holds the 
                                               ## name of the list item I'm after, 
                                               ## to dig into data_list, in the relevant 
                                               ## dataframe, then go to the specific "Subject" column 
                                               ## (which is in position 13 in the dataframe)
                                               ## and replace 98 with 97 wherever 98 appears.


## using purrr::list_modify()
str(list_modify(data_list, names(w_timestamps[4]) = data.frame(Subject = 97)))

Error: unexpected '=' in "str(list_modify(data_list, names(w_timestamps[4]) ="
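
The parse error arises because, in a function call, the left-hand side of = must be a literal argument name and cannot be an expression such as names(w_timestamps[4]). One way around it is to keep the element name in a variable and index with [[ ]]. The following is a minimal base R sketch under assumptions: the toy data_list, the file names and the variable target are hypothetical stand-ins for the real objects.

```r
# Toy stand-in for data_list (hypothetical names and values).
data_list <- list(
  "p98.txt"     = data.frame(Subject = c("98", NA, NA), ID = c(NA, "1", "2"),
                             stringsAsFactors = FALSE),
  "p98 (2).txt" = data.frame(Subject = c("98", NA, NA), ID = c(NA, "2", "1"),
                             stringsAsFactors = FALSE)
)

# The element name lives in a variable, e.g. names(w_timestamps[4]) in the question.
target <- "p98.txt"

# Pull the column out by name, recode it, and write it back.
subj <- data_list[[target]][["Subject"]]
subj[which(subj == "98")] <- "97"   # which() skips the NA rows
data_list[[target]][["Subject"]] <- subj

print(data_list[[target]][["Subject"]])
```

The assignment through data_list[[target]] is what makes the change persist in the list itself; only the element named by target is touched.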

P.S. If there is anything wrong with this post, please let me know what to correct.

3 answers:

Answer 0 (score: 1)

We can use

i1 <- data_list[[names(w_timestamps[4])]][13] == 98
data_list[[names(w_timestamps[4])]][13][i1] <- 97

If this needs to be done in a pipe, use map_at and mutate_at to select the list element and the data.frame column within that selected element

library(tidyverse)
out <- map_at(data_list, names(w_timestamps[4]), ~
           .x %>% 
              mutate_at(13, ~ replace(., .== 98, 97))) 
map(out, pluck, 13)
#$`Task_II_Final_Emmanuel_5may2019-94-1.txt`
# [1] "94" NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA  

#$`Task_II_Final_Emmanuel_5may2019-95-1.txt`
# [1] "95" NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA  

#$`Task_II_Final_Emmanuel_5may2019-96-1.txt`
# [1] "96" NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA  

#$`Task_II_Final_Emmanuel_5may2019-98-1.txt`
# [1] "97" NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA  
      # ^^^^
      #change
#$`Task_II_Final_Emmanuel_5may2019-98-1 (2).txt`
# [1] "98" NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA  

#$`Task_II_Final_Emmanuel_5may2019-99-1.txt`
# [1] "99" NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA   NA 

Answer 1 (score: 0)

Maybe something like this?

data_list[[names(w_timestamps[4])]][13] <- replace(data_list[[names(w_timestamps[4])]][13], 
                              data_list[[names(w_timestamps[4])]][13] == 98, 97)

If you want to do this in a pipe, you can use

library(dplyr)
data_list[[names(w_timestamps[4])]] <- data_list[[names(w_timestamps[4])]] %>%
                         mutate(Subject =  replace(Subject, Subject == 98, 97))

P.S. Your "Subject" column is column 13.

Answer 2 (score: 0)

You can use modify_at

library(tidyverse)
data_list %>% 
  modify_at(names(w_timestamps[4]), ~ mutate_at(., "Subject", ~replace(., . == 98, 97)))
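
Note that a pipe like this returns a modified copy, so the result has to be assigned back to data_list for the change to persist. The following is a small sketch, assuming purrr is installed; the helper recode_in_element() and the toy data are hypothetical and only illustrate the pattern.

```r
library(purrr)

# Hypothetical helper: recode `from` to `to` in column `col` of the list
# element named `element`, leaving every other element untouched.
recode_in_element <- function(lst, element, col, from, to) {
  stopifnot(element %in% names(lst))
  modify_at(lst, element, function(df) {
    hit <- which(df[[col]] == from)  # which() drops NA comparisons
    df[[col]][hit] <- to
    df
  })
}

# Toy stand-in for data_list (hypothetical names and values).
data_list <- list(
  "a.txt" = data.frame(Subject = c("98", NA), stringsAsFactors = FALSE),
  "b.txt" = data.frame(Subject = c("98", NA), stringsAsFactors = FALSE)
)

# Assign the result back so the change persists.
data_list <- recode_in_element(data_list, "a.txt", "Subject", "98", "97")
print(lapply(data_list, function(df) df[["Subject"]][1]))
```

Passing the element name and column name as arguments keeps positions such as 4 and 13 out of the code entirely.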