Create a list from a data frame, with the list's length equal to nrow(df)

Time: 2017-09-13 08:44:10

Tags: r

Here is a small example data frame:

> dput(head(cluster_socrata_csv))
structure(list(Cluster = structure(c(1L, 13L, 24L, 35L, 46L, 
57L), .Label = c("cluster1", "cluster10", "cluster100", "cluster11", 
"cluster12", "cluster13", "cluster14", "cluster15", "cluster16", 
"cluster17", "cluster18", "cluster19", "cluster2", "cluster20", 
"cluster21", "cluster22", "cluster23", "cluster24", "cluster25", 
"cluster26", "cluster27", "cluster28", "cluster29", "cluster3", 
"cluster30", "cluster31", "cluster32", "cluster33", "cluster34", 
"cluster35", "cluster36", "cluster37", "cluster38", "cluster39", 
"cluster4", "cluster40", "cluster41", "cluster42", "cluster43", 
"cluster44", "cluster45", "cluster46", "cluster47", "cluster48", 
"cluster49", "cluster5", "cluster50", "cluster51", "cluster52", 
"cluster53", "cluster54", "cluster55", "cluster56", "cluster57", 
"cluster58", "cluster59", "cluster6", "cluster60", "cluster61", 
"cluster62", "cluster63", "cluster64", "cluster65", "cluster66", 
"cluster67", "cluster68", "cluster69", "cluster7", "cluster70", 
"cluster71", "cluster72", "cluster73", "cluster74", "cluster75", 
"cluster76", "cluster77", "cluster78", "cluster79", "cluster8", 
"cluster80", "cluster81", "cluster82", "cluster83", "cluster84", 
"cluster85", "cluster86", "cluster87", "cluster88", "cluster89", 
"cluster9", "cluster90", "cluster91", "cluster92", "cluster93", 
"cluster94", "cluster95", "cluster96", "cluster97", "cluster98", 
"cluster99"), class = "factor"), Socrata = structure(c(17L, 17L, 
1L, 13L, 14L, 16L), .Label = c("Assault", "Assault with Deadly Weapon", 
"Breaking and Entering", "Community Policing", "Death", "Disorder", 
"Drugs ", "Missing Person", "Other", "Other Sexual Offense", 
"Property Crime", "Property Crime Residental", "Robbery", "Theft", 
"Theft from Vehicle", "Theft of Vehicle", "Traffic", "Unknown", 
"Vehicle Recovery", "Weapons Offense"), class = "factor")), .Names = c("Cluster", 
"Socrata"), row.names = c(NA, 6L), class = "data.frame")

It looks like this:

> head(cluster_socrata_csv)
   Cluster          Socrata
1 cluster1          Traffic
2 cluster2          Traffic
3 cluster3          Assault
4 cluster4          Robbery
5 cluster5            Theft
6 cluster6 Theft of Vehicle

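I want to create a list where Cluster is the key and Socrata is the value.

I tried simply wrapping the data frame in as.list(), but that returned a list with 2 elements, one for the Cluster column and one for the Socrata column.

In this example I want a list of 6 items, where the first item's key is cluster1 and its value is Traffic. For the 6th item, the key is cluster6 and the value is "Theft of Vehicle".

2 Answers:

Answer 0: (score: 1)

I would think of something as simple as this...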

B = as.list(A$Socrata)
names(B) = A$Cluster

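where A is your data frame.

Because A$Socrata is a factor, each list element carries the full set of levels. If you only want to keep the levels that are in use, you can try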

B = as.list(droplevels(A$Socrata))

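This keeps only the levels that actually occur in the data. If you don't want any levels at all, then we have to drop the factor class from A$Socrata like this: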

B = as.list(as.character(A$Socrata))
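
To make the difference between the variants concrete, here is a minimal sketch. The data frame A below is a hypothetical stand-in rebuilt from the six rows shown in the question (the original has many more factor levels), and the names B_factor and B_char are only illustrative.

```r
# Hypothetical stand-in for the question's data frame (called A in this answer).
A <- data.frame(
  Cluster = paste0("cluster", 1:6),
  Socrata = c("Traffic", "Traffic", "Assault", "Robbery", "Theft", "Theft of Vehicle"),
  stringsAsFactors = TRUE
)

# as.list() on a factor yields one single-value factor per row.
B_factor <- as.list(droplevels(A$Socrata))
names(B_factor) <- A$Cluster

# Converting to character first yields plain strings with no levels attached.
B_char <- as.list(as.character(A$Socrata))
names(B_char) <- A$Cluster

length(B_char)        # one element per row of A
class(B_factor[[1]])  # "factor"
class(B_char[[1]])    # "character"
```

Both lists have one named element per row; which one to use depends on whether the factor levels are needed downstream.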

Answer 1: (score: 1)

You can do it like this:

setNames(as.list(df$Socrata), df$Cluster)

setNames(as.list(as.character(df$Socrata)), df$Cluster) # convert to character so the elements do not carry factor levels
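
As a usage sketch, the resulting list can then be queried by cluster name. The three-row df and the name res below are assumptions made for illustration, not the asker's actual data.

```r
# Hypothetical three-row version of the question's data frame.
df <- data.frame(
  Cluster = c("cluster1", "cluster2", "cluster3"),
  Socrata = c("Traffic", "Traffic", "Assault"),
  stringsAsFactors = TRUE
)

# Build the key/value list in one expression.
res <- setNames(as.list(as.character(df$Socrata)), df$Cluster)

# Look up a value by its key, with [[ ]] or with $.
res[["cluster1"]]  # "Traffic"
res$cluster3       # "Assault"
names(res)         # the keys, taken from df$Cluster
```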