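I have a string variable in which some responses begin with an extra character. The character in question is the same constant in every case. The variable holds ICD codes. For example, I have DG23 instead of G23.
Is there a way in Stata to remove the extra D character?
My data look like this: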
ID | diag |
---|---|
1 | DZ456 |
2 | DG32 |
3 | DY258 |
4 | DD35 |
5 | DS321 |
6 | DD21 |
7 | DA123 |
Answer 0 (score: 1)
For the basics in this area, see help string functions.
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte d str5 diag
1 "DZ456"
2 "DG32"
3 "DY258"
4 "DD35"
5 "DS321"
6 "DD21"
7 "DA123"
end
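* Drop the first character only where it is "D"; substr(diag, 2, .) keeps everything from position 2 onward
replace diag = substr(diag, 2, .) if substr(diag, 1, 1) == "D"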
list
+----------+
| d diag |
|----------|
1. | 1 Z456 |
2. | 2 G32 |
3. | 3 Y258 |
4. | 4 D35 |
5. | 5 S321 |
|----------|
6. | 6 D21 |
7. | 7 A123 |
+----------+
Answer 1 (score: 1)
An alternative to the string functions is to use regular expressions; see help regex.
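As a rough sketch of the regular-expression route, the snippet below uses `regexr()` with the pattern `^D`, which matches a D only at the very start of the string. It assumes the variable is named `diag` as in the question; the three example rows are copied from the question's data, and the variable name `id` is only illustrative.

```stata
* Sketch only: strip a single leading "D" with a regular expression.
* Assumes the variable is called diag, as in the question.
clear
input byte id str5 diag
1 "DZ456"
2 "DG32"
4 "DD35"
end

* regexr() replaces the first match of the pattern. The ^ anchors the match
* to the start of the string, so "DD35" becomes "D35" rather than "35".
replace diag = regexr(diag, "^D", "")

* In Stata 14 or later, the Unicode-aware equivalent should also work:
* replace diag = ustrregexrf(diag, "^D", "")
list
```

Because the pattern is anchored, codes that do not start with D are left unchanged, so no `if` qualifier is needed here.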