Receiving incorrect column names after adding a new column to a data frame

Time: 2019-11-25 15:19:06

Tags: r dataframe

I have a data frame called "femalet3" whose first column consists of string values. All other columns are numeric, which is why I used the following code:

femalet3$mean.f<-data.frame(mean.f=femalet3[,1], mean.f=rowMeans(femalet3[,-1]))

The idea came from: Calculate row means on subset of columns

The problem is that when I run this line, I get the following output:

Significance GSM1311846 GSM1311847 mean.f.mean.f mean.f.mean.f.1
Vsig         88.35497   83.16820          VSig        85.40076

The problem is that I get several "mean.f" columns, and the values of "Significance" are copied into one of them. I ran colnames(femalet3), and the output is:

"Significance" "GSM1311840" "GSM1311841" "GSM1311842" "GSM1311843"        "GSM1311844" "GSM1311845"  "GSM1311846"  "GSM1311847"  "mean.f"    

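Despite the earlier output, there is apparently only one "mean.f". I suspect I am not using the line of code from the other Q&A correctly, which may be causing the formatting problem. The desired output is: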

          Significance GSM1311846 GSM1311847 mean.f
          Vsig         88.35497   83.16820    85.40076

1 Answer:

Answer 0: (score: 1)

The problem is in this line of code:

femalet3$mean.f<-data.frame(mean.f=femalet3[,1], mean.f=rowMeans(femalet3[,-1]))

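femalet3 is a data.frame to begin with. If you assign another data frame to one of its columns, you get a strange nested structure: the new data frame is stored whole inside that single column.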

Below I simulate your dataset to show where the error occurs:

femalet3 <- data.frame(Significance = letters[1:10],matrix(rnorm(80),ncol=8))
colnames(femalet3)[-1] = c("GSM1311840","GSM1311841","GSM1311842",
"GSM1311843","GSM1311844","GSM1311845","GSM1311846","GSM1311847")
femalet3$mean.f<-data.frame(mean.f=femalet3[,1], mean.f=rowMeans(femalet3[,-1]))

head(femalet3)
  Significance  GSM1311840 GSM1311841  GSM1311842  GSM1311843 GSM1311844
1            a -0.09282641  0.0753268 -0.04400652  0.02442526  0.3065423
2            b  1.14718259  0.6062297 -0.08556210  0.15121682  1.6412273
3            c -1.45645947 -1.6808505 -1.93452662 -0.06121562  1.9080640
4            d  0.03955011  1.5496713 -0.27779819 -0.69083631  0.8331726
5            e -0.61881124  1.2798835 -0.55046474 -0.61394703  2.3530607
6            f  1.77918616  0.5156059  0.37311045  1.77081855 -0.8689152
  GSM1311845 GSM1311846 GSM1311847 mean.f.mean.f mean.f.mean.f.1
1  1.1210784  0.6891616  0.7314997             a       0.3514002
2  1.8341236  3.0722572  0.9026674             b       1.1586678
3 -0.5721591  2.8964295 -2.0082267             c      -0.3636181
4  1.1212192  0.2129126  0.9595494             d       0.4684301
5 -0.6253303  1.0512457 -1.2166623             e       0.1323718
6  0.4963209 -0.5864916  0.4429023             f       0.4903172

This embeds a data.frame inside the mean.f column of the data.frame, so the outer data frame still has only 10 columns:

ncol(femalet3)
10

head(femalet3$mean.f)
   mean.f   mean.f.1
1       a  0.3514002
2       b  1.1586678
3       c -0.3636181
4       d  0.4684301
5       e  0.1323718
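
One way to confirm the nesting is sketched below on a small made-up data frame. The names `df` and `m` are illustrative only and do not come from the question.

```r
# Small stand-in for femalet3: one text column and two numeric columns
df <- data.frame(Significance = c("a", "b", "c"),
                 x = c(1, 2, 3),
                 y = c(3, 4, 5))

# Reproduce the problematic assignment: a whole data.frame stored in one column
df$m <- data.frame(m = df[, 1], m = rowMeans(df[, -1]))

ncol(df)             # the nested data.frame still counts as a single column
is.data.frame(df$m)  # TRUE when the column is itself a data.frame
names(df$m)          # the duplicated name is made unique by data.frame()
sapply(df, class)    # per-column classes; the nested column reports "data.frame"
```

When printed, the outer data frame prefixes the inner names with the column name, which is where labels such as mean.f.mean.f and mean.f.mean.f.1 come from.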

First, we remove the strange column created above:

femalet3$mean.f <- NULL

To avoid this, all you need is:

femalet3$mean.f<-rowMeans(femalet3[,-1])
head(femalet3)

  Significance  GSM1311840 GSM1311841  GSM1311842  GSM1311843 GSM1311844
1            a -0.09282641  0.0753268 -0.04400652  0.02442526  0.3065423
2            b  1.14718259  0.6062297 -0.08556210  0.15121682  1.6412273
3            c -1.45645947 -1.6808505 -1.93452662 -0.06121562  1.9080640
4            d  0.03955011  1.5496713 -0.27779819 -0.69083631  0.8331726
5            e -0.61881124  1.2798835 -0.55046474 -0.61394703  2.3530607
6            f  1.77918616  0.5156059  0.37311045  1.77081855 -0.8689152
  GSM1311845 GSM1311846 GSM1311847     mean.f
1  1.1210784  0.6891616  0.7314997  0.3514002
2  1.8341236  3.0722572  0.9026674  1.1586678
3 -0.5721591  2.8964295 -2.0082267 -0.3636181
4  1.1212192  0.2129126  0.9595494  0.4684301
5 -0.6253303  1.0512457 -1.2166623  0.1323718
6  0.4963209 -0.5864916  0.4429023  0.4903172
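
If the text column is not guaranteed to be first, a possible variant is to select the numeric columns by type instead of by position. This is a sketch on made-up data and is not part of the original answer; `df` and `num_cols` are illustrative names.

```r
# Small stand-in for femalet3: one text column and two numeric columns
df <- data.frame(Significance = c("a", "b", "c"),
                 x = c(1, 2, 3),
                 y = c(3, 4, 5))

# Logical vector marking which columns are numeric (computed before adding mean.f)
num_cols <- vapply(df, is.numeric, logical(1))

# rowMeans() returns a plain numeric vector, so one ordinary column is added
df$mean.f <- rowMeans(df[, num_cols, drop = FALSE])

head(df)
```

Using `drop = FALSE` keeps the selection a data frame even if only one numeric column exists, so rowMeans() still receives a two-dimensional object.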