Below is an example of a subset of a dataframe I have in R. It holds information about companies in rows, labelled by category (company_name, product, no_workers, address and contact_person):
comp_df <- structure(list(desc = c("AAA", "Company", "Ltd", "fish", "344",
"12", "West", "Road", "Bob C", "BBB", "Enteprises", "vegetables",
"12", "North", "Perak", "Simon T", "EF", "Industries", "cement",
"8800", "Green", "Lane", "Singapore", "Sylvia P"), category = c("company_name",
"company_name", "company_name", "product", "no_workers", "address",
"address", "address", "contact_person", "company_name", "company_name",
"product", "no_workers", "address", "address", "contact_person",
"company_name", "company_name", "product", "no_workers", "address",
"address", "address", "contact_person")), row.names = c(NA, -24L
), class = c("tbl_df", "tbl", "data.frame"))
Is there a simple way to add a function to my pipeline that reshapes comp_df above so that each company becomes a single row, with one column per category?
Answer 0 (score: 2)
Assuming that, in the category column of the original data frame, the first company_name value in each set marks the start of a new group, you can do:
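library(dplyr)
library(tidyr)

comp_df %>%
  # grp increments at the first company_name row of each run, giving one id per company
  group_by(category, grp = cumsum(category == "company_name" & lag(category, default = "") != "company_name")) %>%
  # join the multi-row values of each category into a single string
  summarise(desc = paste(desc, collapse = " ")) %>%
  # spread the categories into columns, one row per grp
  pivot_wider(id_cols = grp, names_from = category, values_from = desc)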
# A tibble: 3 x 6
    grp address              company_name    contact_person no_workers product
  <int> <chr>                <chr>           <chr>          <chr>      <chr>
1     1 12 West Road         AAA Company Ltd Bob C          344        fish
2     2 North Perak          BBB Enteprises  Simon T        12         vegetables
3     3 Green Lane Singapore EF Industries   Sylvia P       8800       cement
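
To see why the grouping index works, it may help to compute it on its own. The sketch below is only an illustration: it uses a shortened category vector (the first twelve labels from comp_df), and is_start is a name introduced here, not part of the answer.

```r
library(dplyr)

# Category labels for the first company and the start of the second one
category <- c("company_name", "company_name", "company_name", "product",
              "no_workers", "address", "address", "address", "contact_person",
              "company_name", "company_name", "product")

# TRUE only on the first row of each run of "company_name" rows:
# the row is a company_name and the row before it is not
is_start <- category == "company_name" &
  dplyr::lag(category, default = "") != "company_name"

# The running count of run starts gives one id per company
grp <- cumsum(is_start)
grp
# 1 1 1 1 1 1 1 1 1 2 2 2
```

The default = "" argument matters for the very first row: without it, lag() returns NA there, the comparison becomes NA, and cumsum() would propagate NA through the whole index.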