I'm new to R. I need to reformat the following data frame:
`Sample Name` `Target Name` `CT values`
<chr> <chr> <dbl>
1 Sample 1 actin 19.69928
2 Sample 1 Ho-1 27.71864
3 Sample 1 Nrf-2 26.00012
9 Sample 9 Ho-1 25.31180
10 Sample 9 Nrf-2 26.41421
11 Sample 9 C3 26.16980
...
15 Sample 1 actin 19.49202
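What I'd actually like is to have the different 'Target Name' values as column names and the individual 'Sample Name' values as row names. The table should then show the corresponding CT values. Note, however, that there are duplicates: for example, Sample 1 appears twice for the same target name, e.g. "actin". What I want is for the resulting table to show such a duplicate only once, with its two different CT values.
I guess this is a very basic R data frame operation, but as I said, I'm very new to R and have been going through various tutorials.
Thank you very much!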
Answer 0 (score: 1)
One way to achieve this with the tidyverse package ecosystem:
library(tidyverse)
tab <- tribble(
~`Sample Name`, ~`Target Name`, ~ `CT values`,
"Sample 1", "actin", 19.69928,
"Sample 1", "Ho-1", 27.71864,
"Sample 1", "Nrf-2", 26.00012,
"Sample 9", "Ho-1", 25.31180,
"Sample 9", "Nrf-2", 26.41421,
"Sample 9", "C3", 26.16980,
"Sample 1", "actin", 19.49202
)
tab %>%
# calculate the mean of each set of duplicates
group_by(`Sample Name`, `Target Name`) %>%
summarise(`CT values` = mean(`CT values`)) %>%
# reshape the data
spread(`Target Name`, `CT values`)
#> # A tibble: 2 x 5
#> # Groups: Sample Name [2]
#> `Sample Name` actin C3 `Ho-1` `Nrf-2`
#> * <chr> <dbl> <dbl> <dbl> <dbl>
#> 1 Sample 1 19.6 NA 27.7 26.0
#> 2 Sample 9 NA 26.2 25.3 26.4
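In more recent tidyr releases, `spread()` is superseded by `pivot_wider()`, which can aggregate the duplicates itself through its `values_fn` argument. The following is only a minimal sketch of that variant, assuming tidyr 1.1.0 or later is installed; it rebuilds the same example data, and the name `wide` is purely illustrative.

```r
library(tibble)
library(tidyr)

# Same example data as in the answer above
tab <- tribble(
  ~`Sample Name`, ~`Target Name`, ~`CT values`,
  "Sample 1", "actin", 19.69928,
  "Sample 1", "Ho-1",  27.71864,
  "Sample 1", "Nrf-2", 26.00012,
  "Sample 9", "Ho-1",  25.31180,
  "Sample 9", "Nrf-2", 26.41421,
  "Sample 9", "C3",    26.16980,
  "Sample 1", "actin", 19.49202
)

# One row per sample, one column per target;
# duplicated sample/target pairs are averaged by values_fn
wide <- pivot_wider(
  tab,
  names_from  = `Target Name`,
  values_from = `CT values`,
  values_fn   = mean
)
wide
```

Sample/target combinations that do not occur in the data are filled with `NA`, as in the `spread()` result.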
You can also use data.table and its dcast reshaping function, which makes this even simpler:
library(data.table)
#>
#> Attaching package: 'data.table'
#> The following objects are masked from 'package:dplyr':
#>
#> between, first, last
#> The following object is masked from 'package:purrr':
#>
#> transpose
setDT(tab)
dcast(tab, `Sample Name` ~ `Target Name`, fun.aggregate = mean)
#> Using 'CT values' as value column. Use 'value.var' to override
#> Sample Name C3 Ho-1 Nrf-2 actin
#> 1: Sample 1 NaN 27.71864 26.00012 19.59565
#> 2: Sample 9 26.1698 25.31180 26.41421 NaN
Created on 2018-01-13 by the reprex package (v0.1.1.9000).
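Because the question asks for the sample names as row names, a base R variant may also be worth a look. This is a sketch rather than part of the original answer: `tapply()` averages the duplicates and returns a matrix whose row names are the samples and whose column names are the targets. The object name `ct_means` is illustrative, and the data frame is rebuilt here so the snippet runs on its own.

```r
# Same example data, built as a plain data.frame;
# check.names = FALSE keeps the column names with spaces intact
tab <- data.frame(
  `Sample Name` = c("Sample 1", "Sample 1", "Sample 1",
                    "Sample 9", "Sample 9", "Sample 9", "Sample 1"),
  `Target Name` = c("actin", "Ho-1", "Nrf-2",
                    "Ho-1", "Nrf-2", "C3", "actin"),
  `CT values`   = c(19.69928, 27.71864, 26.00012,
                    25.31180, 26.41421, 26.16980, 19.49202),
  check.names = FALSE
)

# Mean CT value for every sample/target combination;
# combinations that are absent from the data come out as NA
ct_means <- with(
  tab,
  tapply(`CT values`, list(`Sample Name`, `Target Name`), mean)
)
ct_means
```

Wrapping the result in `as.data.frame(ct_means)` should give a data frame that keeps those row names, if a matrix is not convenient.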