Question

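I am trying to convert a data.frame into a dissimilarity matrix with daisy from the CRAN cluster package in R. I have a dataset of 13109 observations with 13 categorical variables.

I am getting two kinds of warning: one about NAs being introduced by coercion, and one about no non-missing arguments to min/max. Why am I getting them?

I do not have any NA values in my data.frame. Here is the information on my dataset: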

> str(df4)
'data.frame':   13109 obs. of  9 variables:
 $ Age               : chr  "55-64" "55-64" "55-64" "55-64" ...
 $ Gender            : chr  "Female" "Female" "Male" "Male" ...
 $ HouseholdIncome   : chr  "50k-75k" "150k-175k" "150k-175k" "150k-175k" ...
 $ MaritalStatus     : chr  "Single" "Married" "Married" "Married" ...
 $ PresenceofChildren: chr  "No" "Yes" "Yes" "Yes" ...
 $ HomeOwnerStatus   : chr  "Own" "Rent" "Rent" "Rent" ...
 $ HomeMarketValue   : chr  "350k-500k" "500k-1mm" "500k-1mm" "500k-1mm" ...
 $ Occupation        : chr  "White Collar Worker" "Professional" "Professional" "Professional" ...
 $ Education         : chr  "Completed High School" "Completed College" "Completed College" "Completed College" ...
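
The claim that there are no NA values, and the `chr` column type shown by `str`, can both be checked directly. The following is a minimal sketch on a hypothetical three-row stand-in for `df4` (the name `df` and its values are made up for illustration):

```r
# Hypothetical stand-in for df4 with two of its columns.
df <- data.frame(
  Age = c("55-64", "55-64", "35-44"),
  Gender = c("Female", "Female", "Male"),
  stringsAsFactors = FALSE
)

# Number of missing values in each column.
na_counts <- colSums(is.na(df))

# Storage class of each column: "character" here, not "factor".
col_classes <- vapply(df, function(col) class(col)[1], character(1))

print(na_counts)
print(col_classes)
```

If every count is zero but every class is `character`, the NAs in the warnings are not coming from the raw data.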

Here is the evidence of NA values being introduced by coercion:

> library(cluster)
> #Create dissimilarity matrix
> #Gower coefficient for finding distance between mixed variable
> daisy4 <- daisy(df4, metric = "gower", type = list(ordratio = c(1:9)))
> warnings()
Warning messages:
1: In data.matrix(x) : NAs introduced by coercion
2: In data.matrix(x) : NAs introduced by coercion
3: In data.matrix(x) : NAs introduced by coercion
4: In data.matrix(x) : NAs introduced by coercion
5: In data.matrix(x) : NAs introduced by coercion
6: In data.matrix(x) : NAs introduced by coercion
7: In data.matrix(x) : NAs introduced by coercion
8: In data.matrix(x) : NAs introduced by coercion
9: In data.matrix(x) : NAs introduced by coercion
10: In min(x) : no non-missing arguments to min; returning Inf
11: In max(x) : no non-missing arguments to max; returning -Inf
12: In min(x) : no non-missing arguments to min; returning Inf
13: In max(x) : no non-missing arguments to max; returning -Inf
14: In min(x) : no non-missing arguments to min; returning Inf
15: In max(x) : no non-missing arguments to max; returning -Inf
16: In min(x) : no non-missing arguments to min; returning Inf
17: In max(x) : no non-missing arguments to max; returning -Inf
18: In min(x) : no non-missing arguments to min; returning Inf
19: In max(x) : no non-missing arguments to max; returning -Inf
20: In min(x) : no non-missing arguments to min; returning Inf
21: In max(x) : no non-missing arguments to max; returning -Inf
22: In min(x) : no non-missing arguments to min; returning Inf
23: In max(x) : no non-missing arguments to max; returning -Inf
24: In min(x) : no non-missing arguments to min; returning Inf
25: In max(x) : no non-missing arguments to max; returning -Inf
26: In min(x) : no non-missing arguments to min; returning Inf
27: In max(x) : no non-missing arguments to max; returning -Inf
28: In min(x) : no non-missing arguments to min; returning Inf
29: In max(x) : no non-missing arguments to max; returning -Inf

I then tried to run the PAM clustering function, but got an error saying NA values are not allowed:

> k4answers <- pam(daisy4, 3, diss = TRUE)
Error in pam(daisy4, 3, diss = TRUE) : 
  NA values in the dissimilarity matrix not allowed.
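
Both kinds of warning can be reproduced in base R without the cluster package. This is only a sketch of the likely mechanism, not daisy's actual code: it assumes the character columns end up being coerced to numbers, and the vector `age` is a made-up sample of the values in `df4`. The message texts in the comments are those of an English locale.

```r
# Values like those in df4: none of them parse as numbers.
age <- c("55-64", "55-64", "35-44")

# Coercing character to numeric gives NA for every element.
age_num <- suppressWarnings(as.numeric(age))
print(age_num)        # NA NA NA

# Capture the warning text instead of letting it print.
coercion_msg <- tryCatch(as.numeric(age),
                         warning = function(w) conditionMessage(w))
print(coercion_msg)   # "NAs introduced by coercion"

# After dropping NAs from an all-NA column, min() has nothing left to work on.
min_msg <- tryCatch(min(age_num, na.rm = TRUE),
                    warning = function(w) conditionMessage(w))
print(min_msg)        # "no non-missing arguments to min; returning Inf"
```

Nine coerced columns followed by a min/max pair for each of them would account for a list of warnings like the one above.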


如果我能提供更多信息，请告诉我。

Edit: I solved my error. I was reading the .csv file in as character, which is why the same code worked with my other datasets. This is where I went wrong:

#Load Data
Store4 <- read.csv("/Users/scdavis/Documents/Work/Data/Client4.csv", 
                   na.strings = "", stringsAsFactors=FALSE, head = TRUE)

Solution:

#Load Data
Store4 <- read.csv("/Users/scdavis/Documents/Work/Data/Client4.csv", 
                   na.strings = "", head = TRUE)

Answer 1

Read the data in as factor variables rather than as character.

#Load Data
    Store4 <- read.csv("/Users/scdavis/Documents/Work/Data/Client4.csv", 
                       na.strings = "", head = TRUE)

Previously I had this version, which caused the error:

#Load Data
Store4 <- read.csv("/Users/scdavis/Documents/Work/Data/Client4.csv", 
                   na.strings = "", stringsAsFactors=FALSE, head = TRUE)
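
If the data frame has already been loaded as character, the columns can also be converted afterwards instead of re-reading the file. Since R 4.0.0, `read.csv` defaults to `stringsAsFactors = FALSE`, so on current versions the factor-producing call needs `stringsAsFactors = TRUE` passed explicitly. The sketch below uses a small hypothetical data frame in place of `df4`, and it leaves out the `type` argument so that daisy treats the factor columns as nominal. That is an assumption about what is wanted here, not the original poster's exact call.

```r
library(cluster)

# Hypothetical toy data standing in for df4 (values are made up).
df <- data.frame(
  Age = c("55-64", "55-64", "35-44", "25-34", "35-44", "25-34"),
  Gender = c("Female", "Male", "Male", "Female", "Female", "Male"),
  HomeOwnerStatus = c("Own", "Rent", "Rent", "Own", "Own", "Rent"),
  stringsAsFactors = FALSE
)

# Convert every character column to a factor, keeping the data.frame shape.
df[] <- lapply(df, function(col) if (is.character(col)) factor(col) else col)

# Gower dissimilarities; factor columns are handled as nominal variables.
d <- daisy(df, metric = "gower")

# PAM now receives a dissimilarity object without NA values.
fit <- pam(d, k = 2, diss = TRUE)
print(fit$clustering)
```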

Daisy function warning message: NAs introduced by coercion
