Remove a column from the data frames in a large list if the data frame has that column

Time: 2020-06-10 02:41:27

Tags: r list

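I used extract_tables from the tabulizer package to extract a table that spans 165 pages. Each page is stored as its own data frame in one large list. The tables in the PDF have 5 columns, but some pages were parsed incorrectly and have only 4.

I want to combine all of the data frames into a single data frame, but I can't because the number of columns differs.

The fifth column is unnecessary, so I have been tinkering with the map_if function: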

map_if(df, ~.[,5], ~ select(-c(,5)))

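But that doesn't work.

Edit: To simplify the question, I am pasting a shortened version of the output data.

According to typeof(), my data is a list, and length() of the shortened dataset is 7. str() returns the following: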

List of 7

 $ : chr [1:34, 1:4] "Species" "Abelmoschus\t\r  esculentus(\t\r  L.)\t\r  Moench" "Abelmoschus\t\r  esculentus(\t\r  L.)\t\r  Moench" "Abelmoschus\t\r  ficulneus(\t\r  \t\r  L.)\t\r  Wight\t\r  &\t\r  Arn." ...

$ : chr [1:34, 1:4] "Species" "Abrus\t\r  precatorius\t\r  L." "Abrus\t\r  precatorius\t\r  L." "Abrus\t\r  precatorius\t\r  L." ...

$ : chr [1:34, 1:4] "Species" "Acanthocalyx\t\r  alba(\t\r  Hand.-­â\200\220Mazz.)\t\r  M.J.Cannon" "Acanthus\t\r  ilicifolius\t\r  L." "Achillea\t\r  millefolium\t\r  L." ...

$ : chr [1:34, 1:4] "Species" "Achyranthes\t\r  bidentata\t\r  Blume" "Achyranthes\t\r  bidentata\t\r  Blume" "Achyranthes\t\r  bidentata\t\r  Blume" ...

$ : chr [1:34, 1:4] "Species" "Adhatoda\t\r  vasica\t\r  Nees" "Adhatoda\t\r  vasica\t\r  Nees" "Adhatoda\t\r  vasica\t\r  Nees" ...

$ : chr [1:34, 1:4] "Species" "Aganosma\t\r  marginata(\t\r  Roxb.)\t\r  G.Don" "Aganosma\t\r  marginata(\t\r  Roxb.)\t\r  G.Don" "Aganosma\t\r  sp." ...

$ : chr [1:34, 1:5] "Species" "Ailanthus\t\r  triphysa(\t\r  Dennst.)\t\r  Alston" "Ainsliaea\t\r  \t\r  spicata\t\r  Vaniot" "Akebia\t\r  quinata(\t\r  Houtt.)\t\r  Decne." ...

Output of dput(pdf.dat[1:2]):

list(structure(c("Species", "Abelmoschus\t\r  esculentus(\t\r  L.)\t\r  Moench", 
"Abelmoschus\t\r  esculentus(\t\r  L.)\t\r  Moench", "Abelmoschus\t\r  ficulneus(\t\r  \t\r  L.)\t\r  Wight\t\r  &\t\r  Arn.", 
"Abelmoschus\t\r  manihot(\t\r  L.)\t\r  Medik.", "Abelmoschus\t\r  manihot(\t\r  L.)\t\r  Medik.", 
"Abelmoschus\t\r  manihot(\t\r  L.)\t\r  Medik.", "Abelmoschus\t\r  manihot(\t\r  L.)\t\r  Medik.", 
"Abelmoschus\t\r  manihot(\t\r  L.)\t\r  Medik.", "Abelmoschus\t\r  manihot(\t\r  L.)\t\r  Medik.", 
"Abelmoschus\t\r  manihot(\t\r  L.)\t\r  Medik.", "Abelmoschus\t\r  manihot(\t\r  L.)\t\r  Medik.", 
"Abelmoschus\t\r  moschatus\t\r  Medik.", "Abelmoschus\t\r  moschatus\t\r  Medik.", 
"Abelmoschus\t\r  sagittifolius(\t\r  Kurz)\t\r  Merr.", "Abelmoschus\t\r  sagittifolius(\t\r  Kurz)\t\r  Merr.", 
"Abroma\t\r  augusta(\t\r  L.)\t\r  L.\t\r  f.", "Abroma\t\r  augusta(\t\r  L.)\t\r  L.\t\r  f.", 
"Abroma\t\r  augusta(\t\r  L.)\t\r  L.\t\r  f.", "Abroma\t\r  augusta(\t\r  L.)\t\r  L.\t\r  f.", 
"Abroma\t\r  augusta(\t\r  L.)\t\r  L.\t\r  f.", "Abroma\t\r  augusta(\t\r  L.)\t\r  L.\t\r  f.", 
"Abroma\t\r  augusta(\t\r  L.)\t\r  L.\t\r  f.", "Abroma\t\r  augusta(\t\r  L.)\t\r  L.\t\r  f.", 
"Abrus\t\r  precatorius\t\r  L.", "Abrus\t\r  precatorius\t\r  L.", 
"Abrus\t\r  precatorius\t\r  L.", "Abrus\t\r  precatorius\t\r  L.", 
"Abrus\t\r  precatorius\t\r  L.", "Abrus\t\r  precatorius\t\r  L.", 
"Abrus\t\r  precatorius\t\r  L.", "Abrus\t\r  precatorius\t\r  L.", 
"Abrus\t\r  precatorius\t\r  L.", "Abrus\t\r  precatorius\t\r  L.", 
"Family", "Malvaceae", "Malvaceae", "Malvaceae", "Malvaceae", 
"Malvaceae", "Malvaceae", "Malvaceae", "Malvaceae", "Malvaceae", 
"Malvaceae", "Malvaceae", "Malvaceae", "Malvaceae", "Malvaceae", 
"Malvaceae", "Malvaceae", "Malvaceae", "Malvaceae", "Malvaceae", 
"Malvaceae", "Malvaceae", "Malvaceae", "Malvaceae", "Fabaceae", 
"Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", 
"Fabaceae", "Fabaceae", "Fabaceae", "Use", "Hysteritis", "Blenorrhagia", 
"Contraceptive", "Parturition", "Menorrhagia", "Parturition(\t\r  difficult)", 
"Female\t\r  fertility", "Parturition(\t\r  induces\t\r  labour)", 
"Lactagogue", "Blenorrhagia", "Postpartum\t\r  recovery", "Gynaecological\t\r  diseases", 
"Lactagogue", "Blenorrhagia", "Leucorrhea", "Dysmenorrhea", "uterine\t\r  diseases", 
"Leucorrhea", "Menstrual\t\r  disorders", "Amenorrhea", "Dysmenorrhea", 
"Emmenagogue", "Dysmenorrhea", "Antifertility/prevent\t\r  conception", 
"Abortifacient", "Contraception", "Amenorrhegia", "Neonatal\t\r  bath", 
"Contraceptive", "Abortifacient", "Abortifacient", "Abortifacient", 
"Abortifacient", "Use(\t\r  standardized)\t\r   Study", "Inflammation Kishore\t\r  et\t\r  al.(\t\r  1989)", 
"Leucorrhea Pételot(\t\r  1952)", "Contraceptive Bhogaonkar\t\r  and\t\r  Kadam(\t\r  2011)", 
"Other/NOS Bourdy\t\r  and\t\r  Walter(\t\r  1992)", "Uterine\t\r  hemorrhage Bourdy\t\r  and\t\r  Walter(\t\r  1992)", 
"Parturition\t\r   Girard\t\r  and\t\r  Barrau(\t\r  1957)", 
"Fertility Holdsworth(\t\r  1975)", "Uterine\t\r  contractions(\t\r  induce) Holdsworth(\t\r  1980)", 
"Lactation(\t\r  stimulate) Ishidoya(\t\r  1933-­â\200\2201937)", 
"Leucorrhea Roi(\t\r  1955)", "Postpartum\t\r  recovery Roosita\t\r  et\t\r  al.(\t\r  2008)", 
"Gynecological\t\r  disorders\t\r  NOS Van\t\r  Duong(\t\r  1993)", 
"Lactation(\t\r  stimulate) Zhang\t\r  et\t\r  al.(\t\r  2009)", 
"Leucorrhea Pételot(\t\r  1952)", "Leucorrhea Pételot(\t\r  1952)", 
"Menstrual\t\r  pain Guerrero(\t\r  1922)", "Gynecological\t\r  disorders\t\r  NOS Hossan\t\r  et\t\r  al.(\t\r  2010)", 
"Leucorrhea Hossan\t\r  et\t\r  al.(\t\r  2010)", "Menstrual\t\r  disorders\t\r  NOS Hossan\t\r  et\t\r  al.(\t\r  2010)", 
"Menstrual\t\r  flow(\t\r  absent) Pardo\t\r  de\t\r  Tavera\t\r  and\t\r  Thomas(\t\r  1901)", 
"Menstrual\t\r  pain Pardo\t\r  de\t\r  Tavera\t\r  and\t\r  Thomas(\t\r  1901)", 
"Menstrual\t\r  flow(\t\r  stimulate) Pételot(\t\r  1952)", 
"Menstrual\t\r  pain Quisumbing(\t\r  1951)", "Contraceptive Behera(\t\r  2006)", 
"Abortion(\t\r  induce) Bhattarai(\t\r  1994)", "Contraceptive Bhattarai(\t\r  1994)", 
"Menstrual\t\r  flow(\t\r  absent) Bhogaonkar\t\r  and\t\r  Kadam(\t\r  2011)", 
"Other/NOS Fox(\t\r  1953)", "Contraceptive Goswami\t\r  et\t\r  al.(\t\r  2011)", 
"Abortion(\t\r  induce) Guha\t\r  et\t\r  al.(\t\r  2003)", "Abortion(\t\r  induce) Jain\t\r  et\t\r  al.(\t\r  2004)", 
"Abortion(\t\r  induce) Kalita\t\r  et\t\r  al.(\t\r  2011)", 
"Abortion(\t\r  induce) Kishore\t\r  et\t\r  al.(\t\r  1989)"
), .Dim = c(34L, 4L)), structure(c("Species", "Abrus\t\r  precatorius\t\r  L.", 
"Abrus\t\r  precatorius\t\r  L.", "Abrus\t\r  precatorius\t\r  L.", 
"Abrus\t\r  precatorius\t\r  L.", "Abrus\t\r  precatorius\t\r  L.", 
"Abrus\t\r  precatorius\t\r  L.", "Abrus\t\r  precatorius\t\r  L.", 
"Abrus\t\r  precatorius\t\r  L.", "Abrus\t\r  precatorius\t\r  L.", 
"Abrus\t\r  precatorius\t\r  L.", "Abutilon\t\r  indicum(\t\r  \t\r  L.)\t\r  Sweet", 
"Abutilon\t\r  indicum(\t\r  \t\r  L.)\t\r  Sweet", "Abutilon\t\r  indicum(\t\r  L.)\t\r  Sweet", 
"Abutilon\t\r  indicum(\t\r  L.)\t\r  Sweet", "Acacia\t\r  catechu(\t\r  L.\t\r  f.)\t\r  Willd.", 
"Acacia\t\r  catechu(\t\r  L.f.)\t\r  Willd.", "Acacia\t\r  concinna(\t\r  Willd.)\t\r  DC.", 
"Acacia\t\r  concinna(\t\r  Willd.)\t\r  DC.", "Acacia\t\r  farnesiana(\t\r  \t\r  L.)\t\r  Willd.", 
"Acacia\t\r  farnesiana(\t\r  \t\r  L.)\t\r  Willd.", "Acacia\t\r  farnesiana(\t\r  \t\r  L.)\t\r  Willd.", 
"Acacia\t\r  farnesiana(\t\r  L.)\t\r  Willd.", "Acacia\t\r  farnesiana(\t\r  L.)\t\r  Willd.", 
"Acacia\t\r  farnesiana(\t\r  L.)\t\r  Willd.", "Acacia\t\r  leucophloeia(\t\r  Roxb.)\t\r  Willd.", 
"Acacia\t\r  leucophloeia(\t\r  Roxb.)\t\r  Willd.", "Acacia\t\r  nilotica(\t\r  L.)\t\r  Delile", 
"Acacia\t\r  nilotica(\t\r  L.)\t\r  Delile", "Acacia\t\r  nilotica(\t\r  L.)\t\r  Delile", 
"Acalypha\t\r  grandis\t\r  Benth.", "Acalypha\t\r  spiciflora\t\r  Burm.f.", 
"Acalypha\t\r  spiciflora\t\r  Burm.f.", "Acanthocalyx\t\r  alba(\t\r  Hand.-­â\200\220Mazz.)\t\r  M.J.Cannon", 
"Family", "Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", 
"Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", "Malvaceae", 
"Malvaceae", "Malvaceae", "Malvaceae", "Fabaceae", "Fabaceae", 
"Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", 
"Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", "Fabaceae", 
"Fabaceae", "Euphorbiaceae", "Euphorbiaceae", "Euphorbiaceae", 
"Caprifoliaceae", "Use", "Contraceptive", "Female\t\r  fertility", 
"Leucorrhea", "Abortifacient", "Contraceptive", "Antifertility", 
"Postpartum\t\r  recovery", "Contraceptive", "Abortifacient", 
"menstrual\t\r  disorders", "menstrual\t\r  disorders", "Leucorrhea", 
"Urinary\t\r  tract\t\r  infections", "Uterus\t\r  displacement", 
"Abortifacient", "Abortifacient", "Postpartum", "Postpartum", 
"Leucorrhea", "Leucorrhea", "Menorrhagia", "Postpartum\t\r  protective", 
"Leucorrhea", "Gynaecological\t\r  diseases", "Contraceptive", 
"Amenorrhea", "Contraction\t\r  of\t\r  uterus\t\r  in\t\r  post-­â\200\220natal\t\r  days", 
"Menstrual\t\r  pain\t\r  relief", "Leucorrhea", "Contraceptive", 
"postpartum\t\r  anemia", "expel\t\r  lochia", "Gynaecological\t\r  diseases", 
"Use(\t\r  standardized)\t\r   Study", "Contraceptive Pal\t\r  and\t\r  Jain(\t\r  1998),\t\r  Lodha", 
"Fertility Pal\t\r  and\t\r  Jain(\t\r  1998),\t\r  Lodha", "Leucorrhea Pal\t\r  and\t\r  Jain(\t\r  1998),\t\r  Lodha", 
"Abortion(\t\r  induce) Panduranga\t\r  et\t\r  al.(\t\r  2011)", 
"Contraceptive Panduranga\t\r  et\t\r  al.(\t\r  2011)", "Contraceptive Priya\t\r  et\t\r  al.(\t\r  2002)", 
"Postpartum\t\r  recovery Roosita\t\r  et\t\r  al.(\t\r  2008)", 
"Contraceptive Tripathi\t\r  et\t\r  al.(\t\r  2010)", "Abortion(\t\r  induce) Van\t\r  Duong(\t\r  1993)", 
"Menstrual\t\r  disorders\t\r  NOS Vidyasagar\t\r  and\t\r  Prashantkumar(\t\r  2007)", 
"Menstrual\t\r  disorders\t\r  NOS Panduranga\t\r  et\t\r  al.(\t\r  2011)", 
"Leucorrhea Yadav\t\r  et\t\r  al.(\t\r  2006)", "Urinary\t\r  tract\t\r  infections Lecomte\t\r  et\t\r  al.(\t\r  1907)", 
"Uterine\t\r  prolapse Mohapatra\t\r  and\t\r  Sahoo(\t\r  2008)", 
"Abortion(\t\r  induce) Jain\t\r  et\t\r  al.(\t\r  2004)", "Abortion(\t\r  induce) Bhattarai(\t\r  1994)", 
"Other/NOS Anderson(\t\r  1993),\t\r  Hmong", "Other/NOS Anderson(\t\r  1993),\t\r  Karen", 
"Leucorrhea Pételot(\t\r  1952)", "Leucorrhea Tripathi\t\r  et\t\r  al.(\t\r  2010)", 
"Uterine\t\r  hemorrhage Tripathi\t\r  et\t\r  al.(\t\r  2010)", 
"Other/NOS Gimlette(\t\r  1930)", "Leucorrhea Pardo\t\r  de\t\r  Tavera\t\r  and\t\r  Thomas(\t\r  1901)", 
"Gynecological\t\r  disorders\t\r  NOS Van\t\r  Duong(\t\r  1993)", 
"Contraceptive Jain\t\r  et\t\r  al.(\t\r  2004)", "Menstrual\t\r  flow(\t\r  absent) Jain\t\r  et\t\r  al.(\t\r  2004)", 
"Postpartum\t\r  uterus\t\r  reduction Bhattarai(\t\r  1994)", 
"Menstrual\t\r  pain Pal\t\r  and\t\r  Jain(\t\r  1998),\t\r  Lodha", 
"Leucorrhea Yadav\t\r  et\t\r  al.(\t\r  2006)", "Contraceptive Bourdy\t\r  and\t\r  Walter(\t\r  1992)", 
"Anemia Panyaphu\t\r  et\t\r  al.(\t\r  2011)", "Uterine\t\r  contractions(\t\r  induce) Panyaphu\t\r  et\t\r  al.(\t\r  2011)", 
"Gynecological\t\r  disorders\t\r  NOS Liu\t\r  et\t\r  al.(\t\r  2009)"
), .Dim = c(34L, 4L)))

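1 answer:

Answer 0 (score: 1)

With your list named pdf.dat (as in the dput output), you can convert each element to a data frame and select its first 4 columns: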

library(dplyr)
# Convert each matrix to a data frame, keep columns 1-4, then row-bind the results
all_data <- purrr::map_df(pdf.dat, ~ as.data.frame(.x) %>% select(1:4))

Or in base R:

all_data <- do.call(rbind, lapply(pdf.dat, function(x) data.frame(x)[1:4]))
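
The map_if attempt from the question can probably be made to work as well. Two things stand in its way as written: the predicate `~.[,5]` returns a whole column (and fails on elements that have only 4 columns) rather than a single TRUE or FALSE, and `select(-c(,5))` is not valid syntax. The sketch below is a minimal illustration on toy data; `m4`, `m5` and `toy` are hypothetical stand-ins for pdf.dat, and it assumes the elements are character matrices, as the str() output suggests.

```r
library(purrr)

# Hypothetical stand-ins for pdf.dat: character matrices with 4 and 5 columns
m4 <- matrix(as.character(1:8), nrow = 2, ncol = 4)
m5 <- matrix(as.character(1:10), nrow = 2, ncol = 5)
toy <- list(m4, m5)

# Trim only the elements that have more than 4 columns;
# drop = FALSE keeps the result a matrix even when it has a single row
trimmed <- map_if(toy, ~ ncol(.x) > 4, ~ .x[, 1:4, drop = FALSE])

# Every element now has 4 columns, so the matrices can be stacked
all_data <- as.data.frame(do.call(rbind, trimmed))
dim(all_data)
```

Because every element is trimmed to the same width, the predicate is not strictly needed here; it mainly mirrors the original map_if idea and leaves the 4-column pages untouched.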