How to use separate_rows correctly

Time: 2019-08-13 21:14:36

Tags: r delimiter tidyr

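My data is a list of the academic institutions that the authors of the articles I am working with are affiliated with. It looks like this: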

1   MIT
2   NBER; NBER
3   U MI; Cornell U; U VA
4   Harvard U; U Chicago
5   U OR; U CA, Davis; U British Columbia
6   World Bank; Dartmouth College; EDHEC Business School; Harvard U
7   Columbia U and IZA; Columbia U and IZA
8   World Bank; Yale U and Abdul Latif Jameel Poverty Action Lab; Dartmouth College
9   Carnegie Mellon U; Carnegie Mellon U; Carnegie Mellon U
10  Columbia U; U CA, San Diego
11  U CA, Berkeley; McMaster U; McMaster U
12  ETH Zurich and CESifo; U Copenhagen and CESifo

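I would like to split the rows at the semicolons (and preferably at "and" as well) so that I can find out which academic institutions are unique.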

I tried to do this with the separate_rows function from the tidyr package:

Affiliation<-separate_rows(Affiliation, sep=";")

Or:

Affiliation<-separate_rows(Affiliation, sep="; | and")

Neither approach works: my data looks exactly the same afterwards. What am I doing wrong?

The dput output is attached below:

structure(list(AF = c("MIT", "NBER; NBER", "U MI; Cornell U; U VA", 
"Harvard U; U Chicago", "U OR; U CA, Davis; U British Columbia", 
"World Bank; Dartmouth College; EDHEC Business School; Harvard U", 
"Columbia U and IZA; Columbia U and IZA", "World Bank; Yale U and Abdul Latif Jameel Poverty Action Lab; Dartmouth College", 
"Carnegie Mellon U; Carnegie Mellon U; Carnegie Mellon U", "Columbia U; U CA, San Diego", 
"U CA, Berkeley; McMaster U; McMaster U", "ETH Zurich and CESifo; U Copenhagen and CESifo", 
"U MN, St Paul; Compass Lexecon, Washington, DC; Harvard U", 
"U WI", "U Chicago and IZA; Harvard U; Harvard U")), row.names = c(NA, 
15L), class = "data.frame")
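
A possible explanation, offered as a sketch rather than a confirmed answer: both calls above pass only the data frame and sep, so separate_rows has no column selected to split and most likely returns the data unchanged. Naming the column (AF in the dput output) should fix that. The example below rebuilds a small subset of the data so it runs on its own. The separator pattern and the variable names long and unique_institutions are assumptions made for illustration, and splitting on "and" assumes no institution name legitimately contains that word.

```r
library(tidyr)

# A small subset of the data from the dput output, rebuilt so the example is self-contained.
Affiliation <- data.frame(
  AF = c("MIT",
         "NBER; NBER",
         "Columbia U and IZA; Columbia U and IZA",
         "U CA, Berkeley; McMaster U; McMaster U"),
  stringsAsFactors = FALSE
)

# Name the column to split (AF). sep is a regular expression: split on ";"
# or on the standalone word "and", absorbing the whitespace around each.
long <- separate_rows(Affiliation, AF, sep = "\\s*;\\s*|\\s+and\\s+")

# trimws() guards against stray spaces; unique() drops repeated institutions.
unique_institutions <- sort(unique(trimws(long$AF)))
print(unique_institutions)
```

The comma is deliberately not part of the separator, so entries such as "U CA, Berkeley" stay intact.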

0 answers:

No answers