Let's say you have a data set that looks like this:
          Vietnam  Gulf War  Iraq War
veteran1        1         0         0
veteran2        0         1         0
veteran3        0         0         1
veteran4        0         1         1   # <---- Note this row
You want to consolidate these columns without affecting other columns in the dataframe like so:
          Service
veteran1        1
veteran2        2
veteran3        3
veteran4        2   # <---- Note this row
Where 1 = Vietnam, 2 = Gulf War, 3 = Iraq War (note veteran4, where it picked their left-most column).
Questions:
How would you do this in R?
(Note: if it's easier to do in some other free open source program, please feel free to share which program and how you would do it. This is a massive dataset: 3 million rows, the American Community Survey.)
Answer 0 (score: 3)
Looking at your data, this seems like a simple problem:
If Vietnam > 0, use 1; otherwise, if Gulf War > 0, use 2; otherwise, if Iraq > 0, use 3; otherwise use 0.
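vietnam <- c(1, 0, 0, 0)
gulfwar <- c(0, 1, 0, 1)
iraq <- c(0, 0, 1, 1)
df <- data.frame(vietnam, gulfwar, iraq)
# Nested ifelse: the first (left-most) column with a positive value wins; 0 if none.
df$service <- ifelse(df$vietnam > 0, 1, ifelse(df$gulfwar > 0, 2, ifelse(df$iraq > 0, 3, 0)))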
df
Result:
  vietnam gulfwar iraq service
1       1       0    0       1
2       0       1    0       2
3       0       0    1       3
4       0       1    1       2
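If there are more than three period columns, the same "left-most positive column wins" rule can be written without nesting. The following is only a minimal sketch, assuming the indicator columns are named as in the toy data above and contain no NA values; service_cols, flags and first_hit are names introduced here for illustration. It uses base R's max.col with ties.method = "first", which works on the whole matrix at once and so may suit a 3-million-row data set.

```r
# Same toy data as above; the column order defines the codes 1, 2, 3.
df <- data.frame(
  vietnam = c(1, 0, 0, 0),
  gulfwar = c(0, 1, 0, 1),
  iraq    = c(0, 0, 1, 1)
)

# Assumed names of the indicator columns; other columns are left untouched.
service_cols <- c("vietnam", "gulfwar", "iraq")

# 0/1 numeric matrix: 1 where the veteran served in that period.
flags <- (as.matrix(df[service_cols]) > 0) * 1

# Index of the row maximum; ties.method = "first" picks the left-most column.
first_hit <- max.col(flags, ties.method = "first")

# A row with no service at all is an all-zero tie and would be coded 1,
# so map those rows to 0 explicitly.
df$service <- ifelse(rowSums(flags) > 0, first_hit, 0)
df
```

Adding another period only requires appending its column name to service_cols, in the order that should define the codes.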
Answer 1 (score: 1)
This may be a bit more complicated than the other solutions, but here is one way using apply:
df$service <- apply(df, 1, function(x) which(x == 1)[1])
df
  vietnam gulfwar iraq service
1       1       0    0       1
2       0       1    0       2
3       0       0    1       3
4       0       1    1       2
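Two details of the apply approach are worth checking on real data: it scans every column of the data frame, and which(x == 1)[1] yields NA for a row with no 1 at all. A hedged variant, assuming the indicator columns are named as in the toy data (service_cols and the age column are hypothetical, added only for illustration), restricts the scan to those columns and recodes NA as 0:

```r
df <- data.frame(
  vietnam = c(1, 0, 0, 0, 0),
  gulfwar = c(0, 1, 0, 1, 0),
  iraq    = c(0, 0, 1, 1, 0),
  age     = c(70, 50, 35, 45, 30)  # hypothetical unrelated column
)

# Only these columns are scanned; everything else is left alone.
service_cols <- c("vietnam", "gulfwar", "iraq")

# Position of the first column equal to 1 in each row (NA if there is none).
df$service <- apply(df[service_cols], 1, function(x) which(x == 1)[1])

# Rows with no service in any listed period get 0 instead of NA.
df$service[is.na(df$service)] <- 0
df
```

Because apply calls the function once per row, this is likely to be noticeably slower than a vectorised approach on millions of rows.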