Titanic dataset in R

Time: 2017-12-07 17:07:41

Tags: r

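I am working with the "Titanic" dataset in R.

In this data, the last column (the 'Freq' column) gives the frequency of each observation. For example, the third row has Freq = 35, which means that this particular row should be repeated 35 times.

So I am building a new data frame in which every row with a frequency > 0 appears that many times (row 3 is appended to the new data frame 35 times).

The total number of rows in the new data frame is 2201, which is the sum of all values in the Freq column.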
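As a quick sanity check of those two numbers, here is a minimal sketch. It assumes only the built-in `Titanic` table from the datasets package; `titanic_df` is just an illustrative name.

```r
data(Titanic)

# Flatten the 4-way contingency table into one row per Class/Sex/Age/Survived combination
titanic_df <- as.data.frame(Titanic, stringsAsFactors = FALSE)

titanic_df[3, ]         # the third row, whose Freq is 35
sum(titanic_df$Freq)    # total number of rows after expansion
```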

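I used a character vector of length 2201 to store all the values of the first column, "Class", and I append the values to it.

I wrote the following code -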

data(Titanic)
# View(Titanic)

# create a data frame out of 'Titanic' data frame-
T <- as.data.frame(Titanic, stringsAsFactors = FALSE)

# cat("Total # of observations - ", sum(T$Freq))    # O/P = 2201
n <- sum(T$Freq)


# full_titanic <- data.frame(Class = character(n), Sex = character(n), Age = character(n), Survived = character(n), stringsAsFactors=FALSE)

full_Class <- character(n)  # create an array of 2201 character objects

for(i in 1:nrow(T))
{
    if(T$Freq[i] > 0)
    {
        cnt = T$Freq[i]
        # repeating_val <- T$Class[i]
        j <- 0

        while(j < cnt)
        {
            # full_Class[i] <- repeating_val
            full_Class[i + j] <- T$Class[i]
            # cat("T$Class[", i, "] = ", T$Class[i], "\n")
            # cat("Repeating for i = ", i, "\n")
            j <- j + 1
        }
    }
    else
    {
        full_Class[i] <- T$Class[i]
    }

    # cat("i = ", i, "\n")
}

However, this code leaves a lot of blanks in the character vector 'full_Class' when I print it.

I can see the discrepancy as follows -

table(full_Class) # shows the sum of all classes = 1520

sum(T$Freq[T$Class == "1st"]) # equals 325

sum(T$Freq[T$Class == "2nd"]) # equals 285

sum(T$Freq[T$Class == "3rd"]) # equals 706

sum(T$Freq[T$Class == "Crew"]) # equals 885

(325 - 67) + (285 - 11) + (706 - 63) + (885 - 540) # equals 1520

What is going wrong?

Thanks!

1 answer:

Answer 0: (score: 1)

Does this help?

T1<-T[T$Freq==0,] # data with zero frequency
T2<-T[rep(row.names(T),T$Freq),] #data with nonzero frequency
T3<-rbind(T1,T2) #full data 
rownames(T3) <- 1:nrow(T3) #reset row index of full data

Output

> head(T3,20)
   Class    Sex   Age Survived Freq
1    1st   Male Child       No    0
2    2nd   Male Child       No    0
3   Crew   Male Child       No    0
4    1st Female Child       No    0
5    2nd Female Child       No    0
6   Crew Female Child       No    0
7   Crew   Male Child      Yes    0
8   Crew Female Child      Yes    0
9    3rd   Male Child       No   35
10   3rd   Male Child       No   35
11   3rd   Male Child       No   35
12   3rd   Male Child       No   35
13   3rd   Male Child       No   35
14   3rd   Male Child       No   35
15   3rd   Male Child       No   35
16   3rd   Male Child       No   35
17   3rd   Male Child       No   35
18   3rd   Male Child       No   35
19   3rd   Male Child       No   35
20   3rd   Male Child       No   35
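
As for what goes wrong in the loop from the question: it writes to `full_Class[i + j]`, where `i` is the index of the source row and `j` counts the repeats. Each source row therefore starts writing at position `i` and overwrites what earlier rows stored there, and most of the vector is never filled. The sketch below is one possible fix, not the only one. It keeps a running write position (`pos`, a name introduced here for illustration) and also shows a vectorised equivalent. Unlike the answer above, it drops the zero-frequency rows, on the assumption that the goal is the 2201 rows mentioned in the question.

```r
data(Titanic)
titanic_df <- as.data.frame(Titanic, stringsAsFactors = FALSE)
n <- sum(titanic_df$Freq)

# Loop version: track how many slots are already filled
# instead of using the source row index as the write position.
full_class <- character(n)
pos <- 0
for (i in seq_len(nrow(titanic_df))) {
  cnt <- titanic_df$Freq[i]
  if (cnt > 0) {
    full_class[(pos + 1):(pos + cnt)] <- titanic_df$Class[i]
    pos <- pos + cnt
  }
}

# Vectorised version: repeat each row index Freq times;
# rows with Freq == 0 are repeated zero times and so disappear.
expanded <- titanic_df[rep(seq_len(nrow(titanic_df)), titanic_df$Freq), ]
rownames(expanded) <- NULL

table(full_class)   # per-class counts, to compare with the sums in the question
```

With either version, the per-class counts can be compared against `sum(T$Freq[T$Class == "1st"])` and the other sums from the question.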