I have two data frames:
df1 = MEEPQSDPSVEPPLSQETFSDLWK
df1<- structure(list(V1 = structure(1L, .Label = "MEEPQSDPSVEPPLSQETFSDLWK", class = "factor")), .Names = "V1", class = "data.frame", row.names = c(NA,
-1L))
df2 = NKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIY
df2 <- structure(list(V1 = structure(1L, .Label = "NKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIY", class = "factor")), .Names = "V1", class = "data.frame", row.names = c(NA,
-1L))
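df1 has 24 letters and df2 has 33 letters.
I want to make a large plot with the letters of df1 along the x-axis and the letters of df2 along the y-axis, each axis as long as the number of letters in its sequence.
I want to check the two sequences against each other and plot points, like this: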
MEEPQSDPSVEPPLSQETFSDLWK
NKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIY
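Check each letter of df1 against the letter of df2 at the same position. If the two letters are the same, plot a point; if not, plot nothing. In this example only a P in df1 and a P in df2 line up, so I get a single point at position 11 on the x-axis and 11 on the y-axis.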
Answer 0 (score: 4)
## split each sequence into a vector of single characters
v1 <- strsplit(as.character(df1$V1),'')[[1L]];
v2 <- strsplit(as.character(df2$V1),'')[[1L]];
## axis ranges, with one tick per letter position
xlim <- c(0,length(v1));
ylim <- c(0,length(v2));
xticks <- seq(xlim[1L],xlim[2L],1);
yticks <- seq(ylim[1L],ylim[2L],1);
## empty plot with a light grey grid and custom axes
plot(NA,xlim=xlim,ylim=ylim,xlab='df1',ylab='df2',axes=FALSE,xaxs='i',yaxs='i');
abline(v=xticks,col='lightgrey');
abline(h=yticks,col='lightgrey');
axis(1L,xticks,cex.axis=0.7);
axis(2L,yticks,las=2L,cex.axis=0.7);
## compare position by position over the shared length only; a bare v1==v2
## would recycle the shorter vector and warn about mismatched lengths
n <- min(length(v1),length(v2));
i <- which(v1[seq_len(n)]==v2[seq_len(n)])-1;
points(i,i);
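The position-wise check can also be wrapped in a small function, which makes it easy to confirm which positions match before plotting. This is only a sketch: `match_positions` is a hypothetical helper name, not part of the answer, and it assumes the two sequences are passed as plain character strings and compared over the length of the shorter one. It returns 1-based positions, so subtract 1 to get the coordinates used in the plot.

```r
# Hypothetical helper (an assumption, not from the original answer):
# 1-based positions at which two strings carry the same character,
# compared only over the length of the shorter string.
match_positions <- function(a, b) {
  v1 <- strsplit(a, "")[[1L]]
  v2 <- strsplit(b, "")[[1L]]
  n <- min(length(v1), length(v2))
  which(v1[seq_len(n)] == v2[seq_len(n)])
}

s1 <- "MEEPQSDPSVEPPLSQETFSDLWK"
s2 <- "NKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIY"
pos <- match_positions(s1, s2)
print(pos)       # 1-based position of the matching letter
print(pos - 1L)  # 0-based coordinate, as used for points(i,i)
```

For the two sequences in the question this gives a single match, the 12th letter of each sequence, which is the point drawn at (11, 11).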
If you want to match on runs of several adjacent characters, you can replace the line that computes i with:
n <- min(length(v1),length(v2)); cmp <- v1[seq_len(n)]==v2[seq_len(n)];
len <- length(cmp)-2L;
i <- which(cmp[seq(1L,len=len)]&cmp[seq(2L,len=len)]&cmp[seq(3L,len=len)]);
Or, for any N:
N <- 3L; len <- length(cmp)-N+1L;
i <- which(rowSums(sapply(seq_len(N),function(k) cmp[seq(k,len=len)]))==N);
An alternative using Reduce():
N <- 3L; len <- length(cmp)-N+1L;
i <- which(Reduce(`&`,as.data.frame(sapply(seq_len(N),function(k) cmp[seq(k,len=len)]))));
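To see what the shifted-slice idea computes, it may help to run it on a small logical vector. The sketch below wraps the same technique in a hypothetical helper, `run_starts` (the name and the `toy` vector are assumptions made for illustration). It uses `lapply()` rather than `sapply()` so the result is always a list of slices, and it returns an empty result when the vector is shorter than the window.

```r
# Hypothetical helper (an assumption, not from the original answer):
# 1-based start positions of every window of N consecutive TRUE values.
run_starts <- function(cmp, N) {
  len <- length(cmp) - N + 1L
  if (len < 1L) return(integer(0))  # vector is shorter than the window
  # N shifted slices of cmp, each of length len
  slices <- lapply(seq_len(N), function(k) cmp[seq(k, length.out = len)])
  # a window matches only where all N slices are TRUE
  which(Reduce(`&`, slices))
}

toy <- c(TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE)
starts <- run_starts(toy, 3L)
print(starts)
```

With N = 3 the toy vector has windows starting at positions 1, 2 and 6: the first run of four TRUE values contains two overlapping windows, and the final run of three contains one.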