Question

I have data, a portion of which looks like this. (Because of how the overall data frame is set up, there can't be multiple rows per vowel.)

info.df <- data.frame(
    vowelFormantF2_90 = c(1117, 1433, 2392), 
    vowelFormantF3_90 = c(2820, 3062, 2670), 
    vowelFormantF2_50 = c(1016, 1313, 2241),
    vowelFormantF3_50 = c(2842, 3150, 3038),
    previousVowel = c("U", "U", "ae"))

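50 and 90 correspond to time: the 50% point of the vowel's duration comes before the 90% point.

I want to plot time on the x-axis and the formant values (the four-digit numbers) on the y-axis, with color grouped by the F2 or F3 in the column names. The previousVowel column also matters, because eventually I want to subset the data by vowel. I plan to use ggplot2, but I'm open to other plotting methods.

I thought about doing something like this: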

time <- c(50,50,50,50,50,50)
formant <- c("F2","F2","F2","F3","F3","F3")
hz <- c(info.df$vowelFormantF2_50, info.df$vowelFormantF3_50)
newdataframe.df <- data.frame(time, formant, hz)

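But this seems cumbersome as the data set grows, and it doesn't account for the vowels themselves.

Is there a way to reshape this data into the format I want?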

Answer 1

I would use tidyr:

library(tidyr)
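# Gather every formant column into key/value pairs, keeping previousVowel as-is,
# then split names such as "vowelFormantF2_90" at the underscore.
df <- info.df %>% gather(var, val, -previousVowel) %>%
            separate(var, into = c("formant", "time"))

This gives (first six of twelve rows shown):

  previousVowel        formant time  val
1             U vowelFormantF2   90 1117
2             U vowelFormantF2   90 1433
3            ae vowelFormantF2   90 2392
4             U vowelFormantF3   90 2820
5             U vowelFormantF3   90 3062
6            ae vowelFormantF3   90 2670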
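
In recent tidyr releases gather() is superseded by pivot_longer(), so the same reshape can also be sketched as below. This is a minimal sketch, not the answerer's code: it assumes tidyr 1.1.0 or later (for names_transform), and the names long.df and hz are chosen here for illustration.

```r
library(tidyr)

# Sample data copied from the question.
info.df <- data.frame(
    vowelFormantF2_90 = c(1117, 1433, 2392),
    vowelFormantF3_90 = c(2820, 3062, 2670),
    vowelFormantF2_50 = c(1016, 1313, 2241),
    vowelFormantF3_50 = c(2842, 3150, 3038),
    previousVowel = c("U", "U", "ae"))

# Reshape to one row per (vowel token, formant, time) measurement.
long.df <- pivot_longer(
    info.df,
    cols = -previousVowel,                     # reshape every column except the vowel label
    names_prefix = "vowelFormant",             # drop the shared prefix from the column names
    names_to = c("formant", "time"),           # "F2_90" becomes formant = "F2", time = "90"
    names_sep = "_",
    names_transform = list(time = as.integer), # store time as a number for plotting
    values_to = "hz")
```

Note that pivot_longer() orders the result row by row of the input, so the rows come out in a different order from the gather() output, although the contents are the same.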

You can then add:

library(dplyr)
df %>% mutate(formant = sub("vowelFormant", "", formant))

to strip the vowelFormant prefix, leaving only F2, F3, and so on.
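
From the long format, the plot described in the question could be drawn roughly as follows. This is a sketch under assumptions rather than part of the original answer: the token column is a hypothetical row identifier added here so that the two "U" vowels get separate lines, and convert = TRUE is used so that time becomes numeric.

```r
library(tidyr)
library(dplyr)
library(ggplot2)

# Sample data copied from the question.
info.df <- data.frame(
    vowelFormantF2_90 = c(1117, 1433, 2392),
    vowelFormantF3_90 = c(2820, 3062, 2670),
    vowelFormantF2_50 = c(1016, 1313, 2241),
    vowelFormantF3_50 = c(2842, 3150, 3038),
    previousVowel = c("U", "U", "ae"))

long.df <- info.df %>%
    mutate(token = row_number()) %>%                     # hypothetical id, one per original row
    gather(var, val, -previousVowel, -token) %>%         # wide to long
    separate(var, into = c("formant", "time"),
             convert = TRUE) %>%                         # time becomes an integer
    mutate(formant = sub("vowelFormant", "", formant))   # keep only "F2" / "F3"

# One line per token and formant, colored by formant, one panel per vowel.
p <- ggplot(long.df,
            aes(x = time, y = val, colour = formant,
                group = interaction(token, formant))) +
    geom_line() +
    geom_point() +
    facet_wrap(~ previousVowel) +
    labs(x = "Time (% of vowel duration)", y = "Frequency (Hz)")
print(p)

# Subsetting by vowel is then an ordinary row filter.
u.df <- filter(long.df, previousVowel == "U")
```

Faceting by previousVowel is only one option; filtering first, as in the last line, and plotting the subset works just as well.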

Reformatting data in R for a line graph