I am doing some clustering with k-means. Once the code is in place, I want to import the data from an Excel file. The basic script runs fine:
library(factoextra)  # provides get_dist() and fviz_dist()

df <- USArrests
df <- na.omit(df)
df <- scale(df)
head(df, n = 10)
distance <- get_dist(df)
fviz_dist(distance, gradient = list(low = "#33E3FF", mid = "white", high = "#80FF33"))
But if I export the training data from RStudio to Excel and then import it back into RStudio, I end up with two errors:
1)
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric
2)
Warning message:
In stats::dist(x, method = method, ...) : NAs introduced by coercion
So this is my script that produces the errors:
df <- USArrests
write.xlsx(df, "c:/my_path/USArrests.xlsx")
df <- read.xlsx(file = "c:/my_path/USArrests.xlsx", sheetIndex = 1)
df <- na.omit(df)
df <- scale(df)
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric
head(df, top = 10)
         NA. Murder Assault UrbanPop Rape
1    Alabama   13.2     236       58 21.2
2     Alaska   10.0     263       48 44.5
3    Arizona    8.1     294       80 31.0
4   Arkansas    8.8     190       50 19.5
5 California    9.0     276       91 40.6
6   Colorado    7.9     204       78 38.7
distance <- get_dist(df)
Warning message:
In stats::dist(x, method = method, ...) : NAs introduced by coercion
fviz_dist(distance, gradient = list(low = "#33E3FF", mid = "white", high = "#80FF33"))
How can I fix this? Or, how should I import Excel data for use with fviz_dist?
Edit: to highlight how I export the data to Excel and import it back:
write.xlsx(df, "c:/my_path/USArrests.xlsx")
df <- read.xlsx(file = "c:/my_path/USArrests.xlsx", sheetIndex = 1)
Answer 0 (score: 0)
This is a bit late, and hopefully you have solved it by now, but I want to share my experience here in case it helps someone.
1. Check the Excel file to confirm that the data was exported properly.
2. I prefer saving the data as a CSV (comma-separated values) file.
3. I assume you are on Windows, since you are saving in the xlsx file format. Either way, make sure to specify the delimiter as '/' or ',' when writing the file.
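To make steps 1 and 2 concrete, here is a minimal sketch of a CSV round trip using only base R. The `NA.` header in the question's `head()` output suggests the state names came back as an ordinary column instead of row names, which would explain why `scale()` complains that `x` must be numeric. The temporary file path and the names `csv_path`, `raw`, and `fixed` are my own choices for illustration, not anything from the question.

```r
# Sketch only: round-trip USArrests through a CSV file in a temporary
# location (hypothetical path), then check the types before scaling.
csv_path <- file.path(tempdir(), "USArrests.csv")

# write.csv() stores the row names (state names) as the first column.
write.csv(USArrests, csv_path)

# Naive re-import: the state names come back as a non-numeric column,
# so the data frame is no longer all-numeric and scale() would fail.
raw <- read.csv(csv_path)
str(raw)

# Telling read.csv() that column 1 holds the row names restores the
# original layout, with only numeric columns left.
df <- read.csv(csv_path, row.names = 1)
stopifnot(all(sapply(df, is.numeric)))

df <- scale(na.omit(df))
head(df, n = 10)

# For a data frame that was already imported with the names in the
# first column, the same repair can be done by hand:
fixed <- raw
rownames(fixed) <- fixed[[1]]
fixed <- fixed[, -1]
```

After this, `df` is a numeric matrix with the state names as row names, so it should be usable with `get_dist()` and `fviz_dist()` in the same way as the original script. The manual repair at the end is the part that would carry over to a data frame read from an xlsx file.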
Hope this solves the problem.
Good luck.