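I ran into a problem using ifelse() inside a function; the issue was originally solved in this stackoverflow thread. After implementing the suggestions from that thread, the code did exactly what I wanted. The code is below:
country_panel <- function(x, y) {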
ifelse(cnames$time < y,
cnames[match(x, cnames$country),]$panel,
cnames[match(x, cnames$country),]$standardize
)
}
The fake data is generated with:
countryname <- c("Viet Nam", "Viet Nam", "Viet Nam", "Viet Nam", "Viet Nam")
year <- c(1974, 1975, 1976, 1977,1978)
df <- data.frame(countryname, year, stringsAsFactors=FALSE)
country <- c("Vietnam, North", "Vietnam, N.", "Vietnam North", "Viet Nam", "Democratic Republic Of Vietnam")
standardize <- c("Vietnam, Democratic Republic of", "Vietnam, Democratic Republic of", "Vietnam, Democratic Republic of", "Vietnam, Democratic Republic of", "Vietnam, Democratic Republic of")
panel <- c("Vietnam", "Vietnam","Vietnam","Vietnam","Vietnam")
time <- c(1976,1976,1976,1976,1976)
cnames <- data.frame(country, standardize, panel, time, stringsAsFactors = FALSE)
The function is evaluated with:
library(dplyr)  # provides %>% and mutate()
d1 <- df %>%
  mutate(new_name = country_panel(countryname, year))
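However, when I apply the same suggestions to my actual data, the problem returns: the function does not evaluate the condition in the ifelse statement and only ever returns the $panel value.
Because the fake data was built with data.frame and stringsAsFactors = FALSE, I thought that reading the real data with read.csv(PATH, stringsAsFactors = FALSE) instead of read_csv would help, but both behave the same.
I should also note that I checked the attributes of each vector in the data frame with str() and coerced them to match what I found in the fake data.
The real data and a script to replicate everything can be found on GitHub here. Below is dput(head(cnames)):
structure(list(country = c("AFGHANISTAN", "Afghanistan", "albania",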
"ALBANIA", "Albania", "ALGERIA"), standardize = c("Afghanistan",
"Afghanistan", "Albania", "Albania", "Albania", "Algeria"), time = c(2015L,
2015L, 2015L, 2015L, 2015L, 2015L), panel = c("Afghanistan",
"Afghanistan", "Albania", "Albania", "Albania", "Algeria")), .Names = c("country",
"standardize", "time", "panel"), class = c("tbl_df", "data.frame"
), row.names = c(NA, -6L))
and dput(head(d1)):
structure(list(countryname = c("Afghanistan", "Afghanistan",
"Afghanistan", "Afghanistan", "Afghanistan", "Afghanistan"),
year = 1970:1975), .Names = c("countryname", "year"), class = c("tbl_df",
"data.frame"), row.names = c(NA, -6L))
Answer 0 (score: 0)
d1 <- df %>%
mutate(new_name = country_panel(countryname, year))
df2 <- structure(list(country = c("AFGHANISTAN", "Afghanistan", "albania",
"ALBANIA", "Albania", "ALGERIA"), standardize = c("Afghanistan",
"Afghanistan", "Albania", "Albania", "Albania", "Algeria"), time = c(2015L,
2015L, 2015L, 2015L, 2015L, 2015L), panel = c("Afghanistan",
"Afghanistan", "Albania", "Albania", "Albania", "Algeria")), .Names = c("country",
"standardize", "time", "panel"), class = c("tbl_df", "data.frame"
), row.names = c(NA, -6L))
d2 <- df2 %>%
mutate(new_name = country_panel(countryname, year))
This gives:
Error: wrong result size (5), expected 6 or 1
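The immediate problem is that mutate expects country_panel to return 6 values, because df2 has 6 rows (see dim(df2)), or else a single value, which it will recycle as needed. In fact, the first example with the supplied data only works because the row counts happen to match.
Try running the example again after first running: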
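To see where the size mismatch comes from, it may help to recall that ifelse() returns a vector as long as its test argument, whatever the lengths of the yes and no arguments. In country_panel the test is cnames$time < y, so the result length is driven by cnames and y rather than by the rows matched through x. A minimal sketch with made-up vectors (test_short, yes_val and no_val are illustrative names, not from the question):

```r
# ifelse() takes its output length from `test`, not from `yes` or `no`.
test_short <- c(TRUE, FALSE, TRUE)            # length 3
yes_val    <- c("a", "b", "c", "d", "e", "f") # length 6, only the first 3 are used
no_val     <- "z"                             # length 1, recycled

res <- ifelse(test_short, yes_val, no_val)
length(res)  # same length as test_short
```

If that length differs from the number of rows in the data frame passed to mutate, and is not 1, mutate refuses the result.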
debug(country_panel)
...
# after done:
undebug(country_panel)
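This gives you a line-by-line view of the function call and lets you inspect every object that exists or is created inside the function as it runs (type Q at any time to quit the debugger).
Rather than a single ifelse over the whole lookup table, it is probably better to match sequentially: first on country, then on time. Alternatively, you could build a data frame from the x and y vectors passed to the function, merge it with cnames, and then pick the desired name based on a condition on the columns of the merged data frame.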
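The sequential-matching idea could look roughly like the sketch below. It is only an illustration under assumptions: country_panel2 and its lookup argument are hypothetical names, the lookup table is a cut-down copy of the fake cnames from the question, and the direction of the comparison is kept exactly as in the original function.

```r
# Cut-down copy of the fake lookup table from the question.
cnames <- data.frame(
  country     = c("Vietnam, North", "Viet Nam"),
  standardize = c("Vietnam, Democratic Republic of",
                  "Vietnam, Democratic Republic of"),
  panel       = c("Vietnam", "Vietnam"),
  time        = c(1976, 1976),
  stringsAsFactors = FALSE
)

# Hypothetical rewrite: match on country first, then compare time per row.
country_panel2 <- function(x, y, lookup = cnames) {
  idx <- match(x, lookup$country)  # one lookup row per element of x (NA if absent)
  ifelse(lookup$time[idx] < y,     # test now has the same length as x
         lookup$panel[idx],
         lookup$standardize[idx])
}

out <- country_panel2(rep("Viet Nam", 5), c(1974, 1975, 1976, 1977, 1978))
```

Because every piece is indexed by idx, the result always has one element per input row, so it should fit inside mutate regardless of how many rows the lookup table has. Countries missing from the lookup come back as NA.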