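What does this mean when importing a csv into R (and how do I get rid of it)?

Time: 2018-10-03 19:56:34

Tags: r csv import

I am importing a csv into R, and a-hats (an "a" with a circumflex/caret on top) show up all over the place, even though they are not present in the original data.

Does anyone know what they are and how to get rid of them?

Here is the result of dput(head(df)), as suggested by @foc: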

 structure(list(V1 = c("", "Race3 and Hispanic Origin", "Whiteâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦", 
"   White, not Hispanicâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦", 
"Blackâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦", 
"Asianâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦"
), V2 = c("", "", "245,985", "195,221", "41,962", "18,879"), 
    V3 = c("", "", "27,113", "17,263", "9,234", "1,908"), V4 = c("", 
    "", "547", "493", "388", "175"), V5 = c("", "", "11.0", "8.8", 
    "22.0", "10.1"), V6 = c("", "", "0.2", "0.3", "0.9", "0.9"
    ), V7 = c("", "", "247,272", "195,256", "42,474", "19,475"
    ), V8 = c("", "", "26,436", "16,993", "8,993", "1,953"), 
    V9 = c("", "", "714", "571", "373", "190"), V10 = c("", "", 
    "10.7", "8.7", "21.2", "10.0"), V11 = c("", "", "0.3", "0.3", 
    "0.9", "1.0"), V12 = c("", "", "-677", "-270", "-241", "45"
    ), V13 = c("", "", "*-0.3", "-0.1", "-0.8", "-0.1")), row.names = c(NA, 
6L), class = "data.frame")

1 Answer:

Answer 0: (score: 1)

Not sure if this is what you want:

Data example:

df <- structure(list(V1 = c("", "Race3 and Hispanic Origin", "Whiteâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦", 
    "   White, not Hispanicâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦", 
    "Blackâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦", 
    "Asianâ\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦â\200¦"
), V2 = c("", "", "245,985", "195,221", "41,962", "18,879"), 
    V3 = c("", "", "27,113", "17,263", "9,234", "1,908"), V4 = c("", 
    "", "547", "493", "388", "175"), V5 = c("", "", "11.0", "8.8", 
    "22.0", "10.1"), V6 = c("", "", "0.2", "0.3", "0.9", "0.9"
    ), V7 = c("", "", "247,272", "195,256", "42,474", "19,475"
    ), V8 = c("", "", "26,436", "16,993", "8,993", "1,953"), 
    V9 = c("", "", "714", "571", "373", "190"), V10 = c("", "", 
    "10.7", "8.7", "21.2", "10.0"), V11 = c("", "", "0.3", "0.3", 
    "0.9", "1.0"), V12 = c("", "", "-677", "-270", "-241", "45"
    ), V13 = c("", "", "*-0.3", "-0.1", "-0.8", "-0.1")), row.names = c(NA, 
6L), class = "data.frame")

Remove the characters:

df[] <- lapply(df, gsub, pattern='â€¦', replacement='')
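
As for what the characters are, here is a possible explanation, offered as an assumption because the original file is not shown. The sequence `â\200¦` looks like the three UTF-8 bytes of a horizontal ellipsis (U+2026, used in the table as leader dots), `e2 80 a6`, read one byte at a time in a single-byte encoding such as Windows-1252, where they display as `â`, `€` and `¦`. If so, declaring the encoding at import time may avoid the stray characters in the first place. The sketch below uses a hypothetical temporary file (`tmp`) with made-up rows in place of the real csv:

```r
# The UTF-8 encoding of U+2026 (horizontal ellipsis) is three bytes: e2 80 a6.
# Octal \200 in the dput() output is the middle byte, 0x80.
ellipsis_bytes <- charToRaw("\u2026")
print(ellipsis_bytes)

# Hypothetical stand-in for the real file: a small csv written as raw UTF-8 bytes.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeBin(
  charToRaw("White\u2026\u2026\u2026,\"245,985\"\nBlack\u2026\u2026,\"41,962\"\n"),
  con
)
close(con)

# Mark the strings as UTF-8 on import so each e2 80 a6 is treated as one character
# instead of three separate ones.
df <- read.csv(tmp, header = FALSE, stringsAsFactors = FALSE, encoding = "UTF-8")

# Strip the leader dots as a literal (non-regex) match.
df$V1 <- gsub("\u2026", "", df$V1, fixed = TRUE)
print(df)
```

Whether `encoding = "UTF-8"` or `fileEncoding = "UTF-8"` is the better fit depends on the platform and locale, so it is worth running dput(head(df)) again after the import to confirm the leader dots are gone.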