My sample data frame looks like this:
data<-structure(list(rno = 1:8, channel = structure(c(1L, 2L, 1L, 1L,
2L, 2L, 2L, 1L), .Label = c("AAAA", "BBBB"), class = "factor")), .Names = c("rno",
"channel"), class = "data.frame", row.names = c(NA, -8L))
I wrote a simple piece of code to create dummy variables from it in R:
chan_names<-sort(as.character(unique(data[,"channel"])))
data10<-list()
for(i in 1:length(chan_names))
{
data10[[i]]<-data[,"channel"]==chan_names[i]
}
data10<-do.call("cbind",data10)
colnames(data10)<-chan_names
Now I want to do the same thing in Python. I am a Python beginner, so any help is appreciated. Thanks in advance.
Answer 0 (score: 1)
Using pandas:
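import numpy as np
import pandas as pd
# Integer codes 1/2 mirror the levels of the R factor; Python 3 has no 1L literals
cats = pd.Categorical([1, 2, 1, 1, 2, 2, 2, 1], categories=[1, 2])
# rename_categories returns a new Categorical with the channel labels
cats = cats.rename_categories(['AAAA', 'BBBB'])
data = pd.DataFrame({'channel': cats, 'rno': np.arange(1, 9)})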
#   channel  rno
# 0    AAAA    1
# 1    BBBB    2
# 2    AAAA    3
# 3    AAAA    4
# 4    BBBB    5
# 5    BBBB    6
# 6    BBBB    7
# 7    AAAA    8
data10 = pd.get_dummies(data['channel']).astype(bool)
which yields
    AAAA   BBBB
0   True  False
1  False   True
2   True  False
3   True  False
4  False   True
5  False   True
6  False   True
7   True  False
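
If it helps to see how this maps onto the R loop from the question, here is a minimal sketch that builds the same boolean columns by hand, one comparison per channel name. The name data10_manual is only illustrative, and the channel column is stored as plain strings here rather than as a categorical; the last line is a cross-check against pd.get_dummies.

```python
import pandas as pd

# Same sample data as in the question, with channel as plain strings (an assumption).
data = pd.DataFrame({
    "rno": range(1, 9),
    "channel": ["AAAA", "BBBB", "AAAA", "AAAA", "BBBB", "BBBB", "BBBB", "AAAA"],
})

# Sorted unique channel names, like sort(as.character(unique(...))) in R.
chan_names = sorted(data["channel"].unique())

# One boolean column per channel name, like the R for-loop followed by cbind.
data10_manual = pd.DataFrame(
    {name: data["channel"] == name for name in chan_names}
)

# Cross-check: the one-liner should produce the same columns and values.
data10 = pd.get_dummies(data["channel"]).astype(bool)
same = data10_manual.equals(data10)
```

In practice pd.get_dummies is the shorter route; the manual version is mainly useful for seeing what it does.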