Question

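My election results data is in wide format. I need to record, in a new column, how many votes a particular party received. Because of the way the votes are recorded, I have to go through many columns to do this. I can see how to do it with a for loop, but I would like to make it work with purrr.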

Here is an example of the data:

df <- data.frame(district = c("A", "B"),
                 party1 = c("Lab", "Con"), 
                 votes1 = c(188, 200),
                 party2 = c("LD", "Lab"),
                 votes2 = c(140, 164),
                 party3 = c("Con", "LD"),
                 votes3 = c(23, 99))

I want to create a new column that records the number of votes the "LD" party received. So, in this example:

df$LD_votes <- c(140,99)

I tried this, but it did not work:

df <- df %>% map(1:34, function(x) mutate(LD_votes = ifelse(paste0(party, x)=="LD", paste0(votes, x), NA)))

How can I make this code work?

Answer 1

Here is a data.table solution. First, we convert the data.frame to a data.table:

library(data.table)
df <- data.frame(district = c("A", "B"),
                 party1 = c("Lab", "Con"),
                 votes1 = c(188, 200),
                 party2 = c("LD", "Lab"),
                 votes2 = c(140, 164),
                 party3 = c("Con", "LD"),
                 votes3 = c(23, 99))
setDT(df)  # converting to data.table

Next, I convert df from wide to long format, so that we can sum "votes" by "district" and "party":

x <- melt(df, id.vars = "district",  # Melting data to long
     measure.vars = patterns("^party", "^votes"),
     value.name = c("party", "votes"))

which returns

# Displaying x
x
   district variable party votes
1:        A        1   Lab   188
2:        B        1   Con   200
3:        A        2    LD   140
4:        B        2   Lab   164
5:        A        3   Con    23
6:        B        3    LD    99

Now I calculate y, the sum of votes by district and party, keeping only the required "LD" party:

y <- x[party=="LD", .(SumV=sum(votes)), .(district, party)]

Finally, I append the SumV column of y to df. I order y by district to avoid assigning the LD totals to the wrong districts.

The same works for the Con and Lab parties.
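The final step of Answer 1 (appending SumV from y to df) has no code in the text. A minimal sketch of one way to do it follows. It assumes the df, x and y objects described in the answer, and it swaps in a data.table update join on district in place of the ordering the answer describes, so the result does not depend on row order. The column name LD_votes is taken from the question.

```r
library(data.table)

df <- data.table(district = c("A", "B"),
                 party1 = c("Lab", "Con"), votes1 = c(188, 200),
                 party2 = c("LD", "Lab"),  votes2 = c(140, 164),
                 party3 = c("Con", "LD"),  votes3 = c(23, 99))

# Wide to long: one row per district and party/votes pair
x <- melt(df, id.vars = "district",
          measure.vars = patterns("^party", "^votes"),
          value.name = c("party", "votes"))

# LD total per district
y <- x[party == "LD", .(SumV = sum(votes)), .(district, party)]

# Update join (an assumption, not the answer's own code):
# rows are matched on district, so the order of y does not matter
df[y, on = "district", LD_votes := i.SumV]
```

Districts where LD did not stand would get NA in LD_votes with this join.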

Answer 2

Use this:

df <- df %>% 
  mutate("LDVotes" = (ifelse(party1 == "LD", votes1, 0) + ifelse(party2 == "LD", votes2, 0) + ifelse(party3 == "LD", votes3, 0)),
         "LabVotes" = (ifelse(party1 == "Lab", votes1, 0) + ifelse(party2 == "Lab", votes2, 0) + ifelse(party3 == "Lab", votes3, 0)),
         "ConVotes" = (ifelse(party1 == "Con", votes1, 0) + ifelse(party2 == "Con", votes2, 0) + ifelse(party3 == "Con", votes3, 0)))

Answer 3

Here is a tidyverse approach that works for any number of column pairs.

library(tidyverse)
df1 <- df %>%
  rowid_to_column(var = "orig_row") %>%
  gather(col, val, -c(orig_row, district)) %>%
  arrange(orig_row) %>%
  group_by(orig_row) %>%
  mutate(grp_num = (1 + row_number()) %/% 2,
         col = str_remove(col, "[0-9]+")) %>%
  ungroup() %>%
  spread(col, val) %>%
  mutate(votes = parse_number(votes))

df1 %>% count(party, district, wt = votes)
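gather() and spread() are superseded in current tidyr, so the same reshaping can also be sketched with pivot_longer(). This assumes tidyr 1.0 or later and column names of the form party<i> / votes<i>. The names long, ld and df_out are made up for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(district = c("A", "B"),
                 party1 = c("Lab", "Con"), votes1 = c(188, 200),
                 party2 = c("LD", "Lab"),  votes2 = c(140, 164),
                 party3 = c("Con", "LD"),  votes3 = c(23, 99))

# ".value" sends the letters of each column name to the output column
# name (party / votes) and the digits to a "pair" column
long <- df %>%
  pivot_longer(-district,
               names_to = c(".value", "pair"),
               names_pattern = "([a-z]+)([0-9]+)")

# LD votes per district, joined back onto the wide data
ld <- long %>%
  filter(party == "LD") %>%
  group_by(district) %>%
  summarise(LD_votes = sum(votes))

df_out <- left_join(df, ld, by = "district")
```

Because votes stays numeric throughout, no parse_number() step is needed in this variant.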

Answer 4

This could probably be done more neatly inline, but it works.

library(tidyverse)

df <- data.frame(district = c("A", "B"),
                 party1 = c("Lab", "Con"),
                 votes1 = c(188, 200),
                 party2 = c("LD", "Lab"),
                 votes2 = c(140, 164),
                 party3 = c("Con", "LD"),
                 votes3 = c(23, 99))

party <- df %>%
    select(district, starts_with("party")) %>%
    gather(key="col", value="party", starts_with("party"))
votes <- df %>%
    select(district, starts_with("votes")) %>%
    gather(key="col", value="votes", starts_with("votes"))
result <- party %>%
    select(-col) %>% 
    mutate(votes=votes$votes) %>% 
    group_by(party, district) %>% 
    summarise(total=sum(votes))

> result
  party district total
1 Con   A           23
2 Con   B          200
3 Lab   A          188
4 Lab   B          164
5 LD    A          140
6 LD    B           99
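For the purrr version the question asks about, the ifelse sum from Answer 2 can be generated instead of typed out. The attempt in the question fails partly because paste0(party, x) looks for an object called party rather than building the string "party1". A minimal sketch, assuming columns named party<i> / votes<i>; n_pairs and ld_terms are illustrative names:

```r
library(purrr)

df <- data.frame(district = c("A", "B"),
                 party1 = c("Lab", "Con"), votes1 = c(188, 200),
                 party2 = c("LD", "Lab"),  votes2 = c(140, 164),
                 party3 = c("Con", "LD"),  votes3 = c(23, 99))

n_pairs <- 3  # would be 34 for the data described in the question

# One numeric vector per pair: the votes where the party is LD, else 0
ld_terms <- map(seq_len(n_pairs), function(i) {
  ifelse(df[[paste0("party", i)]] == "LD", df[[paste0("votes", i)]], 0)
})

# Add the per-pair vectors element-wise to get one total per row
df$LD_votes <- reduce(ld_terms, `+`)
```

Like Answer 2, this gives 0 rather than NA for a district where LD does not appear.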

Create a new column equal to one of many existing columns when a condition is met
