How can I find whether a specific name occurs between other names in a data frame?

Time: 2015-11-16 12:51:59

Tags: r string dataframe

I have a data frame; as an example, you can see part of it below:

name=c("ENS1","ENS1","ENS1", "ENS2", "ENS2", "ENS2","ENS2", "ENS2", "ENS2")
position=c("chr1", "chr1", "chr1", "chr2", "chr2", "chr2", "chr2", "chr2", "chr2")
dif=c("mir1","ENS1","ENS1", "ENS2", "mir4", "mir2","mir3", "ENS2", "mir1")
data=data.frame(name,position,dif)

Now I want to know whether there is a mir between two occurrences of the same ENS. So I need some code that gives a result like this:

ENS1 FALSE (No mir in between)
ENS2 TRUE (mir in between)

1 answer:

Answer 0: (score: 0)

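# Unique ENS names that appear in the dif column
indx <- unique(grep("ENS", dif, value = TRUE))

# For each ENS name, take the row positions where it occurs and check
# whether consecutive occurrences are more than one row apart
result <- sapply(indx, function(x) diff(which(x == data$dif)) > 1)

result_final <- vector(mode = "list", length = length(result))

for (i in seq_along(result_final)) {
  mydata <- result[[i]]
  # Sort so that TRUE, if present, comes first
  mydata <- sort(mydata, decreasing = TRUE)
  ens_name <- names(result[i])
  if (isTRUE(mydata[1])) {
    # At least one gap between two occurrences of this ENS
    result_final[[i]] <- c(ens_name, "both")
  } else {
    result_final[[i]] <- c(ens_name, "proximal")
  }
}

# Bind the per-name results into one data frame
result_final <- as.data.frame(do.call(rbind, result_final))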
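
The loop above only checks whether two occurrences of an ENS name are more than one row apart, so it treats anything in the gap as a mir. If you want to test for a mir explicitly and get the TRUE/FALSE output shown in the question, something like the following might work. This is only a sketch: `has_mir_between` is a hypothetical helper name, and it assumes that "between" means anywhere from the first to the last occurrence of the same ENS name in `dif`.

```r
# Example data from the question
dif <- c("mir1", "ENS1", "ENS1", "ENS2", "mir4", "mir2", "mir3", "ENS2", "mir1")

# Hypothetical helper (not part of the answer above): TRUE when at least one
# "mir" entry lies between the first and last occurrence of `ens` in `values`
has_mir_between <- function(ens, values) {
  pos <- which(values == ens)
  if (length(pos) < 2) return(FALSE)          # a single occurrence has no "between"
  between <- values[seq(min(pos), max(pos))]  # span from first to last occurrence
  any(grepl("^mir", between))
}

ens_names <- unique(grep("^ENS", dif, value = TRUE))

# Named logical vector, one element per ENS name
mir_between <- vapply(ens_names, has_mir_between, logical(1), values = dif)
print(mir_between)
```

With the example data, ENS1 occupies rows 2 and 3 with nothing in between, while ENS2 occupies rows 4 and 8 with mir4, mir2 and mir3 in between, so the vector should be FALSE for ENS1 and TRUE for ENS2.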