Question

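I am working with the employment histories of people in the financial industry and want to build an edge list so that I can visualize them as a Sankey flow diagram. So far, my data consists of comma-separated strings of entities, like this: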

A, B, D
C, A, E, B
F, B

etc.

One company is of particular interest (call it Company B). I need to turn the data above into something like this:

A, B
B, D
C, B
A, B
E, B
F, B

etc.

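Again, the focus is on Company B, so I need a way to identify it explicitly and to handle strings of different lengths. In the end, I need an edge list, built from the comma-separated strings, in which every row contains Company B paired with one of the companies around it.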

Answer 1

There are several ways to do this in R. Here is one way using base R:

Getty <- "Four score and seven years ago our fathers brought forth on this continent a new nation, conceived in liberty, and dedicated to the proposition that all men are created equal."

# Take the first 10 characters of the string
first.10 <- substr(Getty, start = 1, stop = 10)
first.10
# [1] "Four score"

# Split into single characters; strsplit() returns a list
chars <- strsplit(first.10, split = "")
chars
# [[1]]
#  [1] "F" "o" "u" "r" " " "s" "c" "o" "r" "e"
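
The same strsplit() idea can be carried over to the data in the question. The following is only a sketch, not a definitive solution. It assumes that each history should be split on commas and that every other company in a row is paired with the focal company, with companies before B becoming "X, B" and companies after B becoming "B, X", which is how the desired output in the question reads. The helper name edges_around and its focal argument are made up for illustration.

```r
# Hypothetical helper: pair every company in one history with the focal
# company, preserving the left-to-right order of the original string.
edges_around <- function(history, focal = "B") {
  # Split on commas and strip the surrounding whitespace
  firms <- trimws(strsplit(history, ",", fixed = TRUE)[[1]])
  pos <- match(focal, firms)        # position of the first occurrence of focal
  if (is.na(pos)) return(NULL)      # focal company absent: contributes no edges
  before <- firms[seq_len(pos - 1)] # companies listed before the focal one
  after  <- firms[-seq_len(pos)]    # companies listed after the focal one
  data.frame(
    from = c(before, rep(focal, length(after))),
    to   = c(rep(focal, length(before)), after),
    stringsAsFactors = FALSE
  )
}

histories <- c("A, B, D", "C, A, E, B", "F, B")

# Build one data frame of edges across all histories
edge_list <- do.call(rbind, lapply(histories, edges_around, focal = "B"))
print(edge_list)

# Back to comma-separated strings, if that is the format needed
edge_strings <- paste(edge_list$from, edge_list$to, sep = ", ")
```

For the three sample histories this produces the six edges listed in the question. Because it works on whatever strsplit() returns, rows of different lengths need no special handling, and a row that does not mention the focal company is simply skipped.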
