Question

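I have a list of countries that I need to convert into a standard format (iso3c). Some entries are long names, some are 2- or 3-character codes, and others do not give the whole country name, such as "Africa" instead of "South Africa". I did some research and ended up using the countrycode package in R. However, when I try to use "regex" as the origin, R does not seem to recognize it. I get the following error: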

> countrycode(data,"regex","iso3c", warn = TRUE)
Error in countrycode(data, "regex", "iso3c",  : 
Origin code not supported

Is there another option I should be using?

Thanks!

Answer 1

You can read the README for the countrycode package at https://github.com/vincentarelbundock/countrycode, or pull up the help file in R by entering ?countrycode::countrycode in the R console.

"regex" is not a valid "origin" value (the second argument of the countrycode() function). You must use one of "cowc", "cown", "eurostat", "fao", "fips105", "imf", "ioc", "iso2c", "iso3c", "iso3n", "p4_ccode", "p4_scode", "un", "wb", "wb_api2c", "wb_api3c", "wvs", "country.name", "country.name.de" (as of the latest version, 0.19).

Regex matching is performed automatically if you use either of the following "origin" values: "country.name" or "country.name.de".

If you are using a custom dictionary via the custom_dict argument (new as of version 0.19), you must set the origin_regex argument to TRUE for regex matching to occur.
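As a rough sketch of the custom dictionary route, something like the following might work. The data frame, its column name "pattern", and the regexes themselves are made up for illustration, and mapping a bare "Africa" to South Africa is only an assumption about what your data means.

```r
library(countrycode)

# Hypothetical custom dictionary: one row per country, with a regex
# column as the origin and the iso3c code as the destination.
custom <- data.frame(
  pattern = c("south.?africa|^africa$", "united.?states|^usa$"),
  iso3c   = c("ZAF", "USA"),
  stringsAsFactors = FALSE
)

messy <- c("South Africa", "Africa", "USA")

# origin_regex = TRUE tells countrycode() to treat the origin column
# as regular expressions rather than literal codes.
codes <- countrycode(messy,
                     origin = "pattern",
                     destination = "iso3c",
                     custom_dict = custom,
                     origin_regex = TRUE)
print(codes)
```

Each input should match exactly one pattern; if an input matches several rows, countrycode() warns about the ambiguity, so keep the patterns mutually exclusive.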

In your example, this should do what you want: countrycode(data, origin = "country.name", destination = "iso3c", warn = TRUE)
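To make that concrete, here is a minimal sketch with a made-up input vector standing in for your data (the three names are only sample values, not taken from your list):

```r
library(countrycode)

# Sample input; replace with your own vector of country names.
data <- c("South Africa", "United States of America", "Germany")

# "country.name" triggers the built-in regex matching on English names.
iso3 <- countrycode(data,
                    origin = "country.name",
                    destination = "iso3c",
                    warn = TRUE)
print(iso3)
```

With warn = TRUE, any value that cannot be matched comes back as NA and is listed in a warning, which is a convenient way to find the entries that need manual cleanup. Entries that are already 2- or 3-character codes will not be matched by "country.name"; those would need a separate call with the appropriate origin, such as "iso2c".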
