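I was given a dataset with hundreds of variables whose labels are all messed up.
After inspecting several variables, it looks like digits have been inserted into the labels at random.
Here is a similar example using Stata's auto toy dataset: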
Ma6k0e a6nd Mo0d5e3l
Pri1ce3
Mi1le3age3 (mpg)
Re6pa7ir R3ec8or9d 1978
He4ad3ro5om (i7n.)
Tr2un8k s9pa5ce (c4u.333ft.)
We0ig7ht (lbs.)
Len2gt4h (in.)
Tu1rn Cir9c0le (ft.)
Di7spl1ac3e7ment (cu.333in.)
Ge3a6r Ra6ti1o
Ca5r ty4pe2
How can I clean these up quickly?
Answer 0 (score: 2)
Here is a quick way, using ustrregexra() to strip every digit from each variable label:
* Load the auto dataset and recreate the garbled labels from the question
sysuse auto, clear
label variable make "Ma6k0e a6nd Mo0d5e3l"
label variable price "Pri1ce3"
label variable mpg "Mi1le3age3 (mpg)"
label variable rep78 "Re6pa7ir R3ec8or9d 1978"
label variable headroom "He4ad3ro5om (i7n.)"
label variable trunk "Tr2un8k s9pa5ce (c4u.333ft.)"
label variable weight "We0ig7ht (lbs.)"
label variable length "Len2gt4h (in.)"
label variable turn "Tu1rn Cir9c0le (ft.) "
label variable displacement "Di7spl1ac3e7ment (cu.333in.)"
label variable gear_ratio "Ge3a6r Ra6ti1o"
label variable foreign "Ca5r ty4pe2"
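* For every variable: print the old label, strip all digits, print the new label
foreach var of varlist * {
    display ""
    display "`: variable label `var''"
    * ustrregexra() replaces every match of [0-9] with an empty string
    label variable `var' `"`= ustrregexra("`: variable label `var''", "[0-9]", "")'"'
    display "`: variable label `var''"
}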
Results:
Ma6k0e a6nd Mo0d5e3l
Make and Model
Pri1ce3
Price
Mi1le3age3 (mpg)
Mileage (mpg)
Re6pa7ir R3ec8or9d 1978
Repair Record
He4ad3ro5om (i7n.)
Headroom (in.)
Tr2un8k s9pa5ce (c4u.333ft.)
Trunk space (cu.ft.)
We0ig7ht (lbs.)
Weight (lbs.)
Len2gt4h (in.)
Length (in.)
Tu1rn Cir9c0le (ft.)
Turn Circle (ft.)
Di7spl1ac3e7ment (cu.333in.)
Displacement (cu.in.)
Ge3a6r Ra6ti1o
Gear Ratio
Ca5r ty4pe2
Car type
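Note that the results above lose the legitimate "1978" in the rep78 label, because [0-9] matches every digit. If some labels contain numbers that should survive, one possible refinement is to remove only digit runs that touch a non-digit, non-space character, so that a number standing alone as its own word is left in place. The following is a minimal sketch, not a tested drop-in: the pattern and the local names old and new are my own assumptions, and it assumes the ICU engine behind ustrregexra() accepts lookbehind and lookahead assertions.

```stata
* Sketch: strip digits glued to other characters, keep standalone numbers.
* The pattern and the locals old and new are illustrative assumptions.
sysuse auto, clear
label variable rep78 "Re6pa7ir R3ec8or9d 1978"
label variable trunk "Tr2un8k s9pa5ce (c4u.333ft.)"

foreach var of varlist rep78 trunk {
    local old : variable label `var'
    * First alternative: digits preceded by a non-digit, non-space character.
    * Second alternative: digits followed by such a character.
    local new = ustrregexra(`"`old'"', "(?<=[^\s0-9])[0-9]+|[0-9]+(?=[^\s0-9])", "")
    label variable `var' `"`new'"'
    display `"`old'"' " -> " `"`new'"'
}
```

The intent is for "1978" to stay because it is surrounded only by a space and the end of the string, while runs such as the "333" in "cu.333ft." are removed because they touch a period and a letter. The heuristic is not complete: a stray digit that happens to stand alone between spaces would be kept, and a genuine number attached to text or punctuation, such as "(1978)", would still be removed, so it is worth reviewing the displayed before-and-after pairs before saving the dataset.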