Suppose I have a data frame that looks like this.
df1:
ID Skill Community
1 IT X
1 Analytics X
1 ERP X
2 Analytics X
2 ERP X
2 CRM X
2 Finance X
And another data frame that looks like this:
df2:
ID Skill
3 Public Speaking
3 IT
3 Management
3 ERP
4 HR
4 Finance
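My goal is essentially this: if a given person (identified by ID) shares at least 2 skills with someone in df1, that person should also be assigned to community X.
In the example above, ID 3 should also be assigned to community X (because he has the IT and ERP skills, just like ID 1), but ID 4 should not, because he has only one skill (Finance) in common with ID 2.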
For df2, my expected output should look like this:
ID Skill Community
3 Public Speaking X
3 IT X
3 Management X
3 ERP X
4 HR NA
4 Finance NA
...
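Currently I am only using %in%, as in df2[df2$Skill %in% df1$Skill, ], but this only checks one skill at a time and does not work per ID.
Do you have any idea how I should approach this?
Any help would be greatly appreciated.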
Answer 0 (score: 1)
Please test this on your real dataset to check whether the following works.
library(dplyr)
library(tidyr)
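df3 <- df2 %>%
  left_join(df1, by = "Skill") %>%   # match each df2 skill to the df1 people who have it
  drop_na(ID.y) %>%                  # drop skills with no match in df1
  count(ID.x, ID.y) %>%              # shared skills per (df2 ID, df1 ID) pair
  filter(n > 1) %>%                  # keep pairs sharing at least 2 skills
  distinct(ID.x) %>%
  mutate(Community = "X") %>%
  select(ID = ID.x, Community) %>%
  left_join(df2, ., by = "ID")       # attach Community back onto df2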
df3
# ID Skill Community
# 1 3 Public Speaking X
# 2 3 IT X
# 3 3 Management X
# 4 3 ERP X
# 5 4 HR <NA>
# 6 4 Finance <NA>
Data
df1 <- read.table(text = "ID Skill Community
1 IT X
1 Analytics X
1 ERP X
2 Analytics X
2 ERP X
2 CRM X
2 Finance X",
header = TRUE, stringsAsFactors = FALSE)
df2 <- read.table(text = "ID Skill
3 'Public Speaking'
3 IT
3 Management
3 ERP
4 HR
4 Finance",
header = TRUE, stringsAsFactors = FALSE)
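The counting step can also be checked without extra packages. The following is only a rough base R sketch of the same idea, not part of the original answer: it rebuilds the sample data so it runs on its own, and the names pairs, shared, n_shared, qualifying, ID.new and ID.known are illustrative assumptions.

```r
# Sample data, rebuilt here so the sketch is self-contained.
df1 <- data.frame(
  ID = c(1, 1, 1, 2, 2, 2, 2),
  Skill = c("IT", "Analytics", "ERP", "Analytics", "ERP", "CRM", "Finance"),
  Community = "X",
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  ID = c(3, 3, 3, 3, 4, 4),
  Skill = c("Public Speaking", "IT", "Management", "ERP", "HR", "Finance"),
  stringsAsFactors = FALSE
)

# Inner join on Skill: one row per (df2 person, df1 person, shared skill).
pairs <- merge(df2, df1, by = "Skill", suffixes = c(".new", ".known"))

# Number of shared skills for every (df2 ID, df1 ID) pair.
shared <- aggregate(
  list(n_shared = pairs$Skill),
  by = list(ID.new = pairs$ID.new, ID.known = pairs$ID.known),
  FUN = length
)

# df2 IDs that share at least 2 skills with a single df1 person.
qualifying <- unique(shared$ID.new[shared$n_shared >= 2])

# Assign the community; everyone else stays NA.
df2$Community <- ifelse(df2$ID %in% qualifying, "X", NA_character_)
df2
```

Inspecting shared shows which pairs pass the threshold, which can help when choosing a different cutoff than 2. This sketch hard-codes the single community "X" from the example; with several communities in df1, the community would need to be carried through the count as well.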
Answer 1 (score: 1)
Another option
import java.util.HashMap;
import java.util.Map;

public final class CollisionMatrix {
    // TODO: Longs have at most 64 bits, so the current implementation fails
    // when there are more than 64 tags.
    // Tag is assumed to expose an int id field in the range 0..63.
    private final Map<Integer, Long> matrix = new HashMap<>();

    public CollisionMatrix add(Tag tag1, Tag tag2) {
        int id1 = tag1.id;
        int id2 = tag2.id;
        // Use 1L so the shift happens in 64 bits; an int shift wraps at 32.
        matrix.put(id1, matrix.getOrDefault(id1, 0L) | (1L << id2));
        matrix.put(id2, matrix.getOrDefault(id2, 0L) | (1L << id1));
        return this;
    }

    public CollisionMatrix remove(Tag tag1, Tag tag2) {
        int id1 = tag1.id;
        int id2 = tag2.id;
        matrix.put(id1, matrix.getOrDefault(id1, 0L) & ~(1L << id2));
        matrix.put(id2, matrix.getOrDefault(id2, 0L) & ~(1L << id1));
        return this;
    }

    public boolean collidesWith(Tag tag1, Tag tag2) {
        return 0 != (matrix.getOrDefault(tag1.id, 0L) & (1L << tag2.id));
    }
}