Question

I need to count how many A letters there are in the chromosome.txt file: http://users.utu.fi/jjahol/chromosome.txt

So far I have managed to write this code:

cromo <- read.table("http://users.utu.fi/jjahol/chromosome.txt", header=FALSE)
cromo2 <- as.character(unlist(cromo))

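This code creates a vector of 1000 elements, each of which is 60 characters long. How can I convert it into a vector in which each element is a single character?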

Answer 1

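This is a slightly unconventional approach (unlist(strsplit(...)) will be very fast anyway), but you could use one of the string-searching packages that offer vectorized search patterns, such as "stringi":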

## Read the data in. Since it's not a data.frame, just use readLines
X <- readLines("http://users.utu.fi/jjahol/chromosome.txt")

## Paste the lines together into a single block of text
Y <- paste(X, collapse = "")

library(stringi)
Strings <- c("A", "C", "G", "T")
stri_count_fixed(Y, Strings)
# [1] 15520 13843 14215 16422

## Named output....
setNames(stri_count_fixed(Y, Strings), Strings)
#     A     C     G     T 
# 15520 13843 14215 16422

Answer 2

This should give you the result you want:

cromo <- read.table("http://users.utu.fi/jjahol/chromosome.txt", header=FALSE)
cromo2 <- unlist(strsplit(as.character(cromo$V1),""))
table(cromo2)

This gives you:

    A     C     G     T 
15520 13843 14215 16422
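
To try the same idea without downloading the file, here is a minimal sketch on a small made-up vector. The values in `lines` are hypothetical stand-ins for the rows of chromosome.txt, not real data:

```r
# Hypothetical stand-in for the 60-character lines of chromosome.txt
lines <- c("ACGTAA", "TTGCA", "GGA")

# Split every string into single characters and flatten into one vector
chars <- unlist(strsplit(lines, ""))

# Tabulate how often each letter occurs
counts <- table(chars)
print(counts)
```

Because strsplit returns a list with one entry per input string, unlist is what turns the result into the single character-per-element vector the question asks for.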

Answer 3

strsplit does this:

> strsplit('text', '')
[[1]]
[1] "t" "e" "x" "t"
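
Applied to the original question, the split characters can be compared against "A" directly. A rough sketch, assuming a toy input in place of the real file (the names `seqs`, `pieces`, `a_per_line` and `a_total` are made up for illustration):

```r
# Toy input; the real data would come from reading the URL in the question
seqs <- c("AACG", "TAGA")

# strsplit returns a list with one character vector per input string
pieces <- strsplit(seqs, "")

# Count the A's in each string, then add the per-string counts together
a_per_line <- sapply(pieces, function(p) sum(p == "A"))
a_total <- sum(a_per_line)
print(a_total)
```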
