RStudio: clustering error in colMeans(x, na.rm = TRUE): 'x' must be numeric

Asked: 2017-08-09 19:18:14

Tags: r

I am doing some clustering with k-means. Once the code is working, I want to import the data from an Excel file. The basic script runs fine:

library(factoextra)  # provides get_dist() and fviz_dist()  
df <- USArrests  
df <- na.omit(df)  
df <- scale(df)  
head(df, n = 10)  
distance <- get_dist(df)  
fviz_dist(distance, gradient = list(low = "#33E3FF", mid = "white", high = "#80FF33"))  

But if I export the training data from RStudio to Excel and import it back into RStudio, I end up with one error and one warning:
1)

Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric  

2)

Warning message:  
In stats::dist(x, method = method, ...) : NAs introduced by coercion  

Here is my script that produces the error:

library(xlsx)        # provides write.xlsx() and read.xlsx()  
library(factoextra)  # provides get_dist() and fviz_dist()  
df <- USArrests  
write.xlsx(df, "c:/my_path/USArrests.xlsx")  
df <- read.xlsx(file = "c:/my_path/USArrests.xlsx", sheetIndex = 1)  
df <- na.omit(df)  
df <- scale(df)  
Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric  
head(df)  
NA. Murder Assault UrbanPop Rape  
1    Alabama   13.2     236       58 21.2  
2     Alaska   10.0     263       48 44.5  
3    Arizona    8.1     294       80 31.0  
4   Arkansas    8.8     190       50 19.5  
5 California    9.0     276       91 40.6  
6   Colorado    7.9     204       78 38.7  
distance <- get_dist(df)  
Warning message:  
In stats::dist(x, method = method, ...) : NAs introduced by coercion  
fviz_dist(distance, gradient = list(low = "#33E3FF", mid = "white", high = "#80FF33"))  

How can I solve this? Or, how should I import Excel data for fviz_dist?

Edit: to highlight how I export the data to Excel and import it back:

write.xlsx(df, "c:/my_path/USArrests.xlsx")  
df <- read.xlsx(file = "c:/my_path/USArrests.xlsx", sheetIndex = 1)
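
The head() output shown earlier hints at the likely cause: the state names came back from Excel as an ordinary column (named NA.) instead of as row names, so the data frame is no longer all numeric. A minimal sketch for checking this, using base R only. The `reimported` object is built by hand to mimic that output and is an assumption, not the actual file contents:

```r
# Mimic the re-imported data: the row names have become a regular text column.
reimported <- data.frame(state = rownames(USArrests), USArrests,
                         row.names = NULL, stringsAsFactors = FALSE)
names(reimported)[1] <- "NA."  # same column name as in the head() output

# Which columns are numeric? scale() needs all of them to be.
is_num <- sapply(reimported, is.numeric)
print(is_num)

# Keep the numeric columns and move the text column back into the row names.
fixed <- reimported[, is_num]
rownames(fixed) <- reimported[["NA."]]

scaled <- scale(fixed)  # every column is numeric now
head(scaled, n = 3)
```

If the first entry printed by `is_num` is FALSE, that column is what scale() and dist() are rejecting.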

1 answer:

Answer 0: (score: 0)

This is a bit late, and hopefully you have solved the problem by now. I just want to share my experience here in case it helps someone.

1. Check your Excel file to see whether the data was exported properly.
2. I prefer saving it as a CSV (comma-separated values) file.
3. I assume that you are using Windows, as you are trying to save in the xlsx file format. But do make sure to set the delimiter explicitly, e.g. '/' or ',', when writing the file.
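
The CSV route from point 2 can be sketched as follows. This is a minimal example using only base R, and it assumes the problem is that the row names (the state names) are written out as a column and read back as text. The temporary file path is a placeholder for your own path:

```r
csv_path <- tempfile(fileext = ".csv")  # placeholder for your own path

# write.csv() stores the row names (state names) as the first column.
write.csv(USArrests, csv_path)

# row.names = 1 tells read.csv() to use that first column as row names
# instead of keeping it as a text column.
df <- read.csv(csv_path, row.names = 1)

df <- na.omit(df)
df <- scale(df)   # all remaining columns are numeric
head(df, n = 10)
```

The resulting matrix can then be passed to get_dist() and fviz_dist() as in the original script. If you stay with the xlsx package, the same idea should apply: move the first column into rownames() after reading and drop it before calling scale().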

Hope this solves the problem.

Good luck.