Question

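I'm trying to extract specific pieces of information from the result returned by a function in an R package. I only know Python, so learning R has been confusing.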

Suppose this is the (printed) output of appnn, a package that predicts protein aggregation:

Prediction: 
$sequence
[1] "IFYFYGTTY"

$overall
[1] 1.076839

$aminoacids
          [,1]
 [1,] 1.076839
 [2,] 1.076839
 [3,] 1.076839
 [4,] 1.076839
 [5,] 1.076839
 [6,] 1.076839
 [7,] 1.028888
 [8,] 1.011057
 [9,] 1.011057

$hotspots
$hotspots[[1]]
[1] 1 9

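All I need is to read the last two numbers (1 and 9). I then have to use them to extract the part of the given string that those numbers delimit (in this case it is the whole string, since the hotspot spans positions 1 through 9).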

This is simple in Python, but I don't know how to do it in R. Many thanks.

Answer 1

Here is a reproducible example. I'm guessing you are using the appnn package and creating an object called predictions, as below:

library(appnn)
sequences <- c('STVIIE','KKSSTT','KYSTVI')
predictions <- appnn(sequences)

Now, here is what the documentation says about the return value:

 Value

 A list containing the amyloidogenicity propensity predictions for the
 polypeptides queried.

   overall
     The overall amyloidogenicity propensity prediction value for the sequence

   aminoacids
     The amyloidogenicity propensity prediction value per amino acid 

   hotspots
     A list of the amyloidogenic hotspots predicted in the sequence, limited by the first and last amino acid

Here I queried three sequences, so I get back an R list with three elements. I can get each result by selecting an element of that list; here is the first:

> predictions[[1]]
$sequence
[1] "STVIIE"

$overall
[1] 0.9497568

$aminoacids
          [,1]
[1,] 0.9497568
[2,] 0.9497568
[3,] 0.9497568
[4,] 0.9497568
[5,] 0.9497568
[6,] 0.9497568

$hotspots
$hotspots[[1]]
[1] 1 6

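This is another list, this time with named components. The hotspots component is itself a list with only one component (probably because a single sequence can have several hotspots), so I can get at it like this: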

> predictions[[1]]$hotspots[[1]]
[1] 1 6

This returns an R vector of length 2 (vectors and lists are slightly different things in R), with the values 1 and 6.
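To finish the job from the question (cutting the sequence down to the hotspot), base R's substr() takes a string plus 1-based, inclusive start and stop positions, which matches how the hotspots are reported. Below is a minimal sketch. To keep it runnable without the package, predictions is rebuilt by hand from the two results printed in this thread, and hotspot_substrings is a hypothetical helper name of mine, not something appnn provides. With a real appnn() result, only the helper and the last few lines should be needed.

```r
# Hand-built stand-in for the list returned by appnn(), copied from the
# printed results in this thread (assumption: the real object has this shape).
predictions <- list(
  list(sequence   = "STVIIE",
       overall    = 0.9497568,
       aminoacids = matrix(rep(0.9497568, 6), ncol = 1),
       hotspots   = list(c(1, 6))),
  list(sequence   = "IFYFYGTTY",
       overall    = 1.076839,
       aminoacids = matrix(c(rep(1.076839, 6), 1.028888, 1.011057, 1.011057),
                           ncol = 1),
       hotspots   = list(c(1, 9)))
)

# Single case: take the first hotspot of the first prediction and slice.
first <- predictions[[1]]
hs <- first$hotspots[[1]]              # numeric vector: start, stop
substr(first$sequence, hs[1], hs[2])   # "STVIIE"

# Hypothetical helper (not part of appnn): one substring per hotspot,
# so it also copes with sequences that have several hotspots or none.
hotspot_substrings <- function(p) {
  vapply(p$hotspots,
         function(h) substr(p$sequence, h[1], h[2]),
         character(1))
}

# Apply it to every prediction; the result is a list of character vectors.
all_hotspots <- lapply(predictions, hotspot_substrings)

# Illustration with made-up positions: a hotspot covering residues 2 to 5.
substr("IFYFYGTTY", 2, 5)              # "FYFY"
```

Coming from Python, the main things to watch are that R indexes from 1 and that the stop position is included, so substr(s, 1, 9) is roughly s[0:9].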

Extracting specific lines of text from R output
