One of the variables in my file has the following format:
Bachelor of Commerce - AD - Accounting-Maj
Bachelor of Commerce - Finance-Maj
Bachelor of Commerce - Finance-Maj/Accounting-Min
BSc with Specialization - Math & Finance-Maj
BSc in Agric/Food Bus Mngmnt - Agric Business Management-Maj
Bachelor of Commerce - Management Info Systems-Maj
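What I want to do is take the first part of each string, before the - symbol. For example, from the first three lines I need to get Bachelor of Commerce.
I would appreciate it if someone could show me the simplest way to do this.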
Answer 0 (score: 3)
Try this, assuming your variable is named string_var:
split string_var, parse(" -") limit(1) gen(substring_before_first_hyphen)
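Answer 1 (score: 2)
For future questions, please post the code you attempted and explain why it did not work for you. Some users consider questions that only ask for code to be off-topic.
Here is one way: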
clear all
set more off
*----- example data -----
set obs 2
gen degree = "Bachelor of Commerce - AD - Accounting-Maj"
replace degree = "Bachelor of Something" in 2
list
*----- what you want -----
gen degree2 = trim(substr(degree, 1, strpos(degree, "-") - 1))
replace degree2 = degree if missing(degree2)
list
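This takes the substring of the variable degree that starts at position 1 and has length strpos(degree, "-") - 1, so it ends one character before the first -. trim() removes any leading or trailing blanks. If the original variable contains no -, strpos() returns 0 and the result is an empty (missing) string, which is why the replace is there.
For the range of functions available for manipulating strings, see help string functions.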
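Answer 2 (score: 2)
The previous answers using substr() and split are probably the better choice in Stata. I am posting a regular-expression solution just for completeness: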
clear
input strL degree
"Bachelor of Commerce - AD - Accounting-Maj"
"Bachelor of Commerce - Finance-Maj"
"Bachelor of Commerce - Finance-Maj/Accounting-Min"
"BSc with Specialization - Math & Finance-Maj"
"BSc in Agric/Food Bus Mngmnt - Agric Business Management-Maj"
"Bachelor of Commerce - Management Info Systems-Maj"
end
gen str=regexs(0) if regexm(degree,"^[^\-]*")==1
list str
Answer 3 (score: 1)
You can also use the egen command with its ends() function and the associated punct() option:
clear
input strL string
"Bachelor of Commerce - AD - Accounting-Maj"
"Bachelor of Commerce - Finance-Maj"
"Bachelor of Commerce - Finance-Maj/Accounting-Min"
"BSc with Specialization - Math & Finance-Maj"
"BSc in Agric/Food Bus Mngmnt - Agric Business Management-Maj"
"Bachelor of Commerce - Management Info Systems-Maj"
end
egen new_string = ends(string), punct(-)
list new_string
     +-------------------------------+
     | new_string                    |
     |-------------------------------|
  1. | Bachelor of Commerce          |
  2. | Bachelor of Commerce          |
  3. | Bachelor of Commerce          |
  4. | BSc with Specialization       |
  5. | BSc in Agric/Food Bus Mngmnt  |
     |-------------------------------|
  6. | Bachelor of Commerce          |
     +-------------------------------+
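Answer 4 (score: 0)
String course = "Bachelor of Commerce - AD - Accounting-Maj";
If you want the string before the first "-" (note that this is Java, not Stata), use
String requiredSubString = course.split("-")[0].trim();
In the code above, the split method returns an array of the strings separated by "-", and you can then pick the substring you need by index. Here we take the string at index 0, i.e. Bachelor of Commerce; trim() removes the space that would otherwise be left in front of the hyphen.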
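For anyone who wants to run the Java idea end to end, here is a minimal sketch. The class name FirstPart and the helper beforeFirstHyphen are made up for illustration and do not come from the answer. It uses indexOf instead of split, so a value with no hyphen is returned whole, which mirrors the replace step in the Stata answer above.

```java
class FirstPart {
    // Hypothetical helper: returns the text before the first '-',
    // with leading and trailing blanks removed.
    static String beforeFirstHyphen(String s) {
        int idx = s.indexOf('-');
        // No hyphen found: keep the whole string.
        String head = (idx < 0) ? s : s.substring(0, idx);
        return head.trim();
    }

    public static void main(String[] args) {
        String[] degrees = {
            "Bachelor of Commerce - AD - Accounting-Maj",
            "BSc in Agric/Food Bus Mngmnt - Agric Business Management-Maj",
            "Bachelor of Something"
        };
        for (String d : degrees) {
            System.out.println(beforeFirstHyphen(d));
        }
    }
}
```

Running main should print Bachelor of Commerce, BSc in Agric/Food Bus Mngmnt, and Bachelor of Something, one per line.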