How do I convert a long-format data frame into a wide-format data frame with multiple values in a cell in R?

Time: 2016-03-28 22:12:31

Tags: r dataframe reshape

This is the original df:

    area sector      item
1   East      A      <NA>
2  South      A     Baidu
3  South      A   Tencent
4   West      A      <NA>
5  North      A      <NA>
6   East      B Microsoft
7   East      B    Google
8   East      B  Facebook
9  South      B      <NA>
10  West      B      <NA>
11 North      B      <NA>
12  East      C      <NA>
13 South      C      <NA>
14  West      C      <NA>
15 North      C   Alibaba
16  East      D      <NA>
17 South      D      <NA>
18  West      D    Amazon
19 North      D      <NA>
20  East      E      <NA>
21 South      E      <NA>
22  West      E      <NA>
23 North      E      <NA>

How can I convert the df above into the df below? Some cells in the converted df contain multiple items from the original df.

  Sector                     East            South     West     North
1 A                          <NA> "Baidu, Tencent"     <NA>      <NA>
2 B "Microsoft, Google, Facebook"             <NA>     <NA>      <NA>
3 C                          <NA>             <NA>     <NA> "Alibaba"
4 D                          <NA>             <NA> "Amazon"      <NA>
5 E                          <NA>             <NA>     <NA>      <NA>

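3 answers:

Answer 0 (score: 2)

A quick solution is to use the toString function as the aggregation function while converting from long to wide with the reshape2 package: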

reshape2::dcast(df, sector ~ area, toString)
#Using item as value column: use value.var to override.
#   sector                        East   North          South   West
# 1      A                        <NA>    <NA> Baidu, Tencent   <NA>
# 2      B Microsoft, Google, Facebook    <NA>           <NA>   <NA>
# 3      C                        <NA> Alibaba           <NA>   <NA>
# 4      D                        <NA>    <NA>           <NA> Amazon
# 5      E                        <NA>    <NA>           <NA>   <NA>
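
To make the role of toString concrete, here is a minimal base R sketch of what it does to the values that land in one cell. The variable names are illustrative only, and the second call shows a caveat that applies if the item column holds real missing values (an assumption about the data): toString turns NA into the text "NA".

```r
# Values that end up in a single (sector, area) cell
items <- c("Microsoft", "Google", "Facebook")

# toString() joins the elements of a vector with ", "
joined <- toString(items)
print(joined)

# Caveat: a real missing value becomes the string "NA", not NA
missing_only <- toString(NA)
print(is.na(missing_only))
```

If that caveat matters, one option is to pass a small wrapper that drops missing values before joining instead of toString itself.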

This is almost a duplicate of this question, but most of the solutions there don't fit this case; it may still give you some ideas.

Answer 1 (score: 2)

Just for fun, here is a base R solution:

(The code for this answer was not preserved in the source.)
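
A minimal sketch of one possible base R approach, which is not the answerer's original code. It rebuilds the question's data (treating the `<NA>` entries as real missing values is an assumption), uses tapply to collapse the items of each sector/area pair, and then restores the column order requested in the question. The helper collapse_items is a hypothetical name introduced for illustration.

```r
# Rebuild the question's data; real NA values are an assumption.
df <- data.frame(
  area = c("East", "South", "South", "West", "North",
           "East", "East", "East", "South", "West", "North",
           rep(c("East", "South", "West", "North"), times = 3)),
  sector = rep(c("A", "B", "C", "D", "E"), times = c(5, 6, 4, 4, 4)),
  item = c(NA, "Baidu", "Tencent", NA, NA,
           "Microsoft", "Google", "Facebook", NA, NA, NA,
           NA, NA, NA, "Alibaba",
           NA, NA, "Amazon", NA,
           NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop missing items, then join the rest with ", ".
collapse_items <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_character_ else paste(x, collapse = ", ")
}

# Rows are sectors, columns are areas (sorted alphabetically by tapply).
wide <- tapply(df$item, list(df$sector, df$area), collapse_items)

# Put the columns in the order shown in the question.
wide <- wide[, c("East", "South", "West", "North"), drop = FALSE]

# Turn the matrix into a data frame with sector as a regular column.
result <- data.frame(sector = rownames(wide), wide, stringsAsFactors = FALSE)
rownames(result) <- NULL
print(result)
```

Because collapse_items returns NA_character_ for cells without items, empty cells stay missing instead of becoming the string "NA".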

Answer 2 (score: 1)

Here is an option using dplyr/tidyr:

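library(dplyr)
library(tidyr)
# df is the long data frame from the question
df %>%
   group_by(area, sector) %>%              # one group per output cell
   summarise(item = toString(item)) %>%    # join each group's items with ", "
   spread(area, item)                      # one column per area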
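
As a variation on the same idea, here is a sketch that uses tidyr's pivot_wider (available from tidyr 1.0.0) in place of the group_by/summarise/spread chain. This is a swapped-in technique, not the answerer's code. It assumes the `<NA>` entries in the question are real missing values, and collapse_items is a hypothetical helper that drops them before joining.

```r
library(tidyr)

# Rebuild the question's data; real NA values are an assumption.
df <- data.frame(
  area = c("East", "South", "South", "West", "North",
           "East", "East", "East", "South", "West", "North",
           rep(c("East", "South", "West", "North"), times = 3)),
  sector = rep(c("A", "B", "C", "D", "E"), times = c(5, 6, 4, 4, 4)),
  item = c(NA, "Baidu", "Tencent", NA, NA,
           "Microsoft", "Google", "Facebook", NA, NA, NA,
           NA, NA, NA, "Alibaba",
           NA, NA, "Amazon", NA,
           NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: drop missing items, then join the rest with ", ".
collapse_items <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_character_ else toString(x)
}

# values_fn aggregates all items that fall into the same sector/area cell.
wide <- pivot_wider(
  df,
  names_from  = area,
  values_from = item,
  values_fn   = list(item = collapse_items)
)
print(wide)
```

pivot_wider creates the new columns in order of first appearance in the data, so the areas come out as East, South, West, North, which matches the layout requested in the question.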