R: based on a condition

Date: 2018-04-11 16:23:36

Tags: r list

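This problem is difficult for me at my current level, and I would like to learn how to do this kind of thing on my own in the future. If I have not provided enough information, or if anything is unclear, please let me know.

I have a list of data frames: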

d1<-data.frame( Data0 = c("N,R,15,P,D", "_KEY_VALUE_1", -1,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25),
            Data1 = c("N,15,C,D", "Garden",0.9759,0.7121,0.7376,0.7647,0.7927,0.8209,0.8487,0.8759,0.9021,0.9274,0.9518,
                      1,1.0249,1.0514,1.0805,1.1132,1.1508,1.1946,1.2462,1.3071,1.3793,1.4649,1.5661,1.6854,1.8254,1.9887))

d2<-data.frame(
  Data0=c("N,R,2,I,D","no_flowers",-2 , 0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 ,10 ,11) ,
  Data1=c("N,15,C,D","Garden",0.8891 ,0.8891,0.9051,1,0.8891,0.8891,0.7907,0.8891,0.9929,0.8891,0.8891,0.8891,0.8891)
)

d3<-data.frame(Data0=c("A,X,15,P,D","_KEY_TEXT_1","Y","N","U"),
               Data1=c("N,15,C,D","Garden",1.0834,1,1))
d4<-data.frame(
  Data0=c("A,X,15,P,D","_KEY_TEXT_1","Y","Y","Y","Y","Y","Y","N","N","N","N","N","N"),
  Data1=c("N,R,3,I,D","house_age",16,18,19,20,21,50,16,18,19,20,21,50),
  Data2=c("N,15,C,D","Garden",2.2291,2.0743,1.9369,1.8148,1.7064,1.6102,2.2291,2.0743,1.9369,1.8148,1.7064,1.6102)
)

dfl<-list(d1,d2,d3,d4)
names(dfl)<-c("no_animals","no_flowers","radiation","summer_x_house_age")

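If you look at the first value of the first column in each data frame, the second letter (the one after the first comma) is either R or X: R stands for Ranged and X stands for not Ranged. If this letter is "R" (ranged), I want to turn that column into two columns, i.e. I would like the result for the d1 data frame to look like this: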

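The d4 data frame is an interaction between "summer" (yes/no) and "house age". Here only the second column (house age) is ranged, so I want the same treatment as for d1, but done separately for summer = Y and summer = N.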
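To make the header convention concrete, here is a minimal sketch of how the R/X flag could be read for every variable column. The helper name `column_flags` and the two cut-down data frames are hypothetical, and the sketch assumes the last column always holds the betas, so it is skipped.

```r
# Hypothetical cut-down data frames that use the same header layout as above.
ranged_df <- data.frame(
  Data0 = c("N,R,2,I,D", "no_flowers", "-2", "0", "1"),
  Data1 = c("N,15,C,D", "Garden", "0.8891", "0.8891", "0.9051"),
  stringsAsFactors = FALSE
)
interaction_df <- data.frame(
  Data0 = c("A,X,15,P,D", "_KEY_TEXT_1", "Y", "N"),
  Data1 = c("N,R,3,I,D", "house_age", "16", "16"),
  Data2 = c("N,15,C,D", "Garden", "2.2291", "2.2291"),
  stringsAsFactors = FALSE
)

# Return the second comma-separated field of the first row of every column
# except the last one (assumed to be the beta column).
column_flags <- function(df) {
  variable_columns <- df[-ncol(df)]
  vapply(variable_columns, function(column) {
    fields <- strsplit(as.character(column[1]), ",", fixed = TRUE)[[1]]
    fields[2]
  }, character(1))
}

flags_ranged <- column_flags(ranged_df)
flags_interaction <- column_flags(interaction_df)
```

Splitting on the comma rather than taking a fixed character position keeps the lookup working even if the first field were longer than one character.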

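Some background on the data frames, in case it makes things easier to understand:

These are the results of a glm model that I created outside of R and want to import into R. The last column of each data frame always holds the regression beta values, and the columns before it are the variables, which are sometimes categorical (X) and sometimes continuous (R). When they are continuous/ranged, I have to manipulate the column to get a "from" and a "to", because I want to use this list to calculate probabilities for data where I have values for the regressors used in the glm model. The highest number means "from & not including infinity, to & including the highest number", the second highest number means "from & not including the highest number, to & including the second highest number", and so on.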

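1 answer:

Answer 0: (score: 1)

I figured it out.

Define a new function that looks up the key letter (R or X) and returns a new data frame (if the letter is R) or the same data frame (if it is X).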

Rcheck <- function(df){

  # Isolate the letter being tested for R or X
  key_letter <- substr(as.character(df[1,1]),3,3)

  if( key_letter == "R"){ # Proceed if letter is R

    # Assign new dataframe
    df_new <- df

    # Add new column. 
    df_new[,'Data0_'] <- as.character(df_new[,'Data0'])

    # Shift down and add -9999 value
    rows <- nrow(df_new)
    df_new[,'Data0_'][4:rows] <- as.character(df_new[,'Data0'][3:(rows-1)])
    df_new[,'Data0_'][3] <- "-9999"

    # Take new column from the end and put it beside Data0
    column1_name <- colnames(df_new)[1]
    new_column_name <- colnames(df_new)[ncol(df_new)]
    other_column_names <- colnames(df_new)[2:(ncol(df_new)-1)]

    df_new <- df_new[,c(column1_name, new_column_name, other_column_names)]
    df_new

  } else{ # If letter is not R
    df
  }

}

Then use lapply to apply this function to the list of data frames.

new_list <- lapply(dfl, Rcheck)
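Note that Rcheck inspects only the first column, so a data frame such as d4, whose ranged variable sits in the second column, is returned unchanged. One possible generalization is sketched below. It is not part of the original answer: the function name `add_from_columns` and the cut-down `d4_small` are hypothetical, and it assumes that rows 1 and 2 are header rows, that the last column holds the betas, and that the data rows are sorted ascending within each combination of the non-ranged columns.

```r
# Sketch: put a shifted copy next to every ranged column, restarting the shift
# for each combination of the categorical columns. Assumptions: rows 1-2 are
# headers, the last column holds the betas, rows are sorted within each group.
add_from_columns <- function(df, lowest = "-9999") {
  df[] <- lapply(df, as.character)          # factors become plain strings
  var_cols <- names(df)[-ncol(df)]          # every column except the betas
  flags <- vapply(var_cols, function(nm) {
    strsplit(df[[nm]][1], ",", fixed = TRUE)[[1]][2]
  }, character(1))
  ranged <- var_cols[!is.na(flags) & flags == "R"]
  categorical <- setdiff(var_cols, ranged)
  if (length(ranged) == 0 || nrow(df) < 3) return(df)

  data_rows <- 3:nrow(df)
  # One group per combination of categorical values (a single group if none).
  group <- if (length(categorical) > 0) {
    cat_values <- unname(as.list(df[data_rows, categorical, drop = FALSE]))
    do.call(paste, c(cat_values, sep = "|"))
  } else {
    rep("all", length(data_rows))
  }

  out <- list()
  for (nm in names(df)) {
    out[[nm]] <- df[[nm]]
    if (nm %in% ranged) {
      shifted <- df[[nm]]                   # header rows are kept as they are
      for (idx in split(data_rows, group)) {
        # The first row of each group gets the sentinel, the rest shift down.
        shifted[idx] <- c(lowest, head(df[[nm]][idx], -1))
      }
      out[[paste0(nm, "_")]] <- shifted     # same "_" suffix as in Rcheck
    }
  }
  as.data.frame(out, stringsAsFactors = FALSE)
}

# Hypothetical cut-down version of d4: summer (Y/N) crossed with house_age.
d4_small <- data.frame(
  Data0 = c("A,X,15,P,D", "_KEY_TEXT_1", "Y", "Y", "Y", "N", "N", "N"),
  Data1 = c("N,R,3,I,D", "house_age", "16", "18", "19", "16", "18", "19"),
  Data2 = c("N,15,C,D", "Garden", "2.2291", "2.0743", "1.9369",
            "2.2291", "2.0743", "1.9369"),
  stringsAsFactors = FALSE
)
result <- add_from_columns(d4_small)
```

As in Rcheck, the new column is placed directly after the column it was derived from, and -9999 stands in for the open lower end. Under the same assumptions, it could be applied to the whole list with `lapply(dfl, add_from_columns)`.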