I have the following strings:
KZ1,345,769.1
PKS948,123.9
XG829,823.5
324JKL,282.7
456MJB87,006.01
How can I separate the letters from the numbers?
This is the result I expect:
KZ 1345769.1
PKS 948123.9
XG 829823.5
JKL 324282.7
MJB 45687006
I tried using the split command for this, but without success.
Answer 0: (score: 2)
What you want can be done with a simple regular expression:
clear
input str15 foo
"KZ1,345,769.1"
"PKS948,123.9"
"XG829,823.5"
"324JKL,282.7"
"456MJB87,006.01"
end
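* strip digits and periods, then remove the commas that are left
generate foo1 = subinstr(ustrregexra(foo, "[\d\.]", ""), ",", "", .)
* strip everything except digits and periods, then convert to a number
generate double foo2 = real(ustrregexra(foo, "[^\d\.]", ""))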
list
     +------------------------------------+
     |             foo   foo1        foo2 |
     |------------------------------------|
  1. |   KZ1,345,769.1     KZ   1345769.1 |
  2. |    PKS948,123.9    PKS    948123.9 |
  3. |     XG829,823.5     XG    829823.5 |
  4. |    324JKL,282.7    JKL    324282.7 |
  5. | 456MJB87,006.01    MJB    45687006 |
     +------------------------------------+
Typing help subinstr(), help ustrregexra() and help real() at Stata's command prompt will give you more details on the usage and syntax of these functions.
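To see what each piece of the expression contributes, the same calls can be tried on a single literal with display. This is only an illustrative sketch: the literal is the fifth value from the question, and the %12.2f format is my own choice for showing the decimals.

```stata
* Remove digits and periods: the letters and the comma remain
display ustrregexra("456MJB87,006.01", "[\d\.]", "")

* Remove the leftover comma as well; the final "." asks subinstr() to replace all occurrences
display subinstr(ustrregexra("456MJB87,006.01", "[\d\.]", ""), ",", "", .)

* Remove everything that is not a digit or a period, then convert the result to a number
display %12.2f real(ustrregexra("456MJB87,006.01", "[^\d\.]", ""))
```

The explicit format matters for the last line. The listing above shows 45687006 for the fifth observation, which appears to be an effect of the default display format rather than a loss of the decimal part, since foo2 is stored as a double.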
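Answer 1: (score: 1)
@Pearly Spencer's answer is surely preferable, but the following kind of naive loop should occur to any programmer. Look at each character in turn and decide whether it is a letter (here, uppercase A to Z); or a digit or a decimal point; or something else (implicitly), and build up the answer that way. Note that although we loop explicitly over the length of the string, the loop over observations is implicit.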
clear
input str42 whatever
"KZ1,345,769.1"
"PKS948,123.9"
"XG829,823.5"
"324JKL,282.7"
"456MJB87,006.01"
end
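* compress shrinks whatever to the shortest str type that fits (here str15)
compress
* take the digits after "str" in the storage type as the maximum string length
local length = substr("`: type whatever'", 4, .)
gen letters = ""
gen numbers = ""
* loop over character positions; each replace acts on all observations at once
quietly forval j = 1/`length' {
    local arg substr(whatever,`j', 1)
    replace letters = letters + `arg' if inrange(`arg', "A", "Z")
    replace numbers = numbers + `arg' if `arg' == "." | inrange(`arg', "0", "9")
}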
list
     +-----------------------------------------+
     |        whatever   letters       numbers |
     |-----------------------------------------|
  1. |   KZ1,345,769.1        KZ     1345769.1 |
  2. |    PKS948,123.9       PKS      948123.9 |
  3. |     XG829,823.5        XG      829823.5 |
  4. |    324JKL,282.7       JKL      324282.7 |
  5. | 456MJB87,006.01       MJB   45687006.01 |
     +-----------------------------------------+
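The loop leaves numbers as a string variable, which is why the last value is listed with its decimal part in full. If a numeric variable is wanted, one possible follow-up is to convert it with real(). The sketch below is self-contained and uses two of the values from the listing; the variable name numvalue and the %12.2f format are illustrative choices, not part of the original answer.

```stata
clear
input str11 numbers
"1345769.1"
"45687006.01"
end

* Convert the digit strings to numbers; double keeps enough precision for these values
generate double numvalue = real(numbers)

* Use a wider fixed format so that the decimals of large values are visible
format numvalue %12.2f

list
```

Storing the result as a double rather than the default float is a deliberate choice here, because a float cannot hold a value such as 45687006.01 exactly enough to keep its decimals.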