Making decisions based on data frame headers

Time: 2015-03-18 11:33:55

Tags: r dataframe conditional-statements columnname

I have a list of data frames:

df
[[1]]
    ID SignalIntensity       SNR
1  109        6.182309 0.8453577
2  110       10.172777 4.3837078
3  111        7.292275 1.0725751
4  112        8.898467 2.3192185
5  113        9.591034 3.7133402
7  116        7.789323 1.3636656
8  117        7.194835 1.1349738
9  118        6.572773 0.9041846
11 120        9.371126 2.9968457
12 121        6.154944 0.7777584

[[2]]
    ID SignalIntensity       SNR
1  118        6.572773 0.9041846
2  119        5.377519 0.7098581
3  120        9.371126 2.9968457
4  121        6.154944 0.7777584
5  123        5.797446 0.7235425
6  124        5.573614 0.7019574
7  125        7.014537 0.3433343
8  126        6.089159 0.7971650
9  127        6.314820 0.7845944
10 131        5.342544 1.2300000

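The headers are ID, SignalIntensity, and SNR. I check them with names(df[[1]]). After checking the headers, I need to make a decision: for example, if the headers of df[[1]] are ID, SignalIntensity, and SNR, then do something like

    if(names(df[[1]]=="ID"))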
    {
    print("This is data from Illumina platform")

    my code..........
    } 
    else if{my code...........}

As you can see, it has three headers.

I know my approach is wrong, as the following attempt shows:

    if(names(df[[1]]=="ID, SignalIntensity, SNR"))

It gives me

    Error in if (names(df[[1]] == "ID, SignalIntensity, SNR")) { : argument is of length zero

which is to be expected.

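How can I set up the if {} so that it matches all three headers (or any one of headers 1, 2, or 3 that we choose), runs one block of code when the match is true, and does something else otherwise? Thanks.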

3 answers:

Answer 0 (score: 1)

Try this code:

headers <- c("ID", "SNR") # can add more header names here
hasHeader <- is.element(headers, names(df[[1]])) # c(TRUE, TRUE)
sumHeader <- sum(hasHeader, na.rm = TRUE)        # 2
# compare against the number of wanted headers, not length(sumHeader), which is always 1
result <- sumHeader == length(headers)

# result is TRUE if "ID" and "SNR" are both names of df[[1]]
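For context on the error in the question, here is a minimal sketch of what seems to be going on. The data frame `d` is a made-up stand-in for df[[1]]. With the closing parenthesis in the wrong place, == is applied to the data frame itself, which produces a logical matrix; names() of a matrix is NULL, so if() receives a zero-length condition. Comparing the names directly and collapsing the result with all() gives if() the single logical value it needs.

```r
# Hypothetical stand-in for df[[1]]
d <- data.frame(ID = 109:111,
                SignalIntensity = c(6.18, 10.17, 7.29),
                SNR = c(0.85, 4.38, 1.07))

# Misplaced parenthesis: the comparison runs on the data frame, giving a
# logical matrix, and names() of a matrix is NULL
bad <- names(d == "ID, SignalIntensity, SNR")
is.null(bad)

# Comparing the names themselves gives one logical per column ...
per_column <- names(d) == c("ID", "SignalIntensity", "SNR")

# ... which all() collapses into the single TRUE/FALSE that if() needs
if (all(per_column)) {
  print("This is data from Illumina platform")
} else {
  print("Some other platform")
}
```

Note that the element-wise == comparison depends on column order and count, which is one reason the %in% and is.element() approaches in these answers are usually more forgiving.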

Answer 1 (score: 1)

If you want to work on the code directly:

wanted_colnames = c("ID","SignalIntensity","SNR")

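lapply(df, function(u){
    # any() is TRUE when at least one wanted name is present;
    # use all() instead to require every wanted name
    if(any(wanted_colnames %in% names(u)))
    {
        # do something
    } else {
        # do something else
    }
})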
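The question asks for both "all three headers" and "any one of them", so the choice of test matters. The sketch below contrasts the candidates on made-up column-name vectors (`full`, `partial`, and `extra` are illustrative only and stand in for names(u)).

```r
wanted <- c("ID", "SignalIntensity", "SNR")

# Hypothetical column-name vectors standing in for names(u)
full    <- c("ID", "SignalIntensity", "SNR")
partial <- c("ID", "x")
extra   <- c("ID", "SignalIntensity", "SNR", "Batch")

any_partial <- any(wanted %in% partial)  # at least one wanted name is present
all_partial <- all(wanted %in% partial)  # every wanted name is present
all_full    <- all(wanted %in% full)     # every wanted name is present
all_extra   <- all(wanted %in% extra)    # extra columns do not affect this check
same_extra  <- setequal(wanted, extra)   # exactly the wanted names, in any order
```

Here any() accepts `partial` while all() rejects it, and setequal() is the stricter option if extra columns should also count as a mismatch.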

Answer 2 (score: 1)

Expanding on my comment, try this:

#dummy data
df <-
  list(
    data.frame(ID=1:5,
               SignalIntensity=runif(5),
               SNR=runif(5)),
    data.frame(ID=1:3,
               x=runif(3)),
    data.frame(ID=1:5,
               SignalIntensity=runif(5),
               SNR=runif(5)))

#check 1st data frame
if(length(intersect(names(df[[1]]),c("ID","SignalIntensity","SNR")))==3){
  print("Illumina platform")} else {
    print("Non Illumina platform")}
# [1] "Illumina platform"

#check all dataframes
lapply(df,function(i)
  if(length(intersect(names(i),c("ID","SignalIntensity","SNR")))==3){
    "Illumina platform"} else {
      "Non Illumina platform"})
# [[1]]
# [1] "Illumina platform"
# 
# [[2]]
# [1] "Non Illumina platform"
# 
# [[3]]
# [1] "Illumina platform"
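If the goal is to run different code on each group afterwards, one possible variation is to compute a logical vector with vapply() and use it to split the list. This is only a sketch: `has_columns`, `dfs`, `illumina_dfs`, and `other_dfs` are hypothetical names, and the small list below stands in for the dummy data above.

```r
# Hypothetical helper: TRUE when every wanted column name is present
has_columns <- function(d, wanted) all(wanted %in% names(d))

wanted <- c("ID", "SignalIntensity", "SNR")

# Small stand-in list with the same shape as the dummy data above
dfs <- list(
  data.frame(ID = 1:2, SignalIntensity = c(0.1, 0.2), SNR = c(0.3, 0.4)),
  data.frame(ID = 1:2, x = c(0.5, 0.6)))

# vapply() returns a plain logical vector rather than a list
is_illumina <- vapply(dfs, has_columns, logical(1), wanted = wanted)

# Logical indexing then separates the two groups
illumina_dfs <- dfs[is_illumina]
other_dfs    <- dfs[!is_illumina]
```

Compared with lapply(), vapply() also checks that each result is a single logical value, so a helper that accidentally returns something else fails early.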