Assigning specific characters to rows in a data frame

Time: 2019-03-01 12:53:13

Tags: r

I have a huge data frame in R that looks like this

    scan_id       sample
1  s8w_00001.sed      1
2  s8w_00001.sed      1
3  s9w_00001.sed      1
4 s10w_00001.sed      1
5 s11d_00002.sed      1
6 s12w_00004.sed      1
7 s13w_00001.sed      1
8 s14w_00001.sed      1

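The column labeled sample should hold the value that corresponds to the column labeled scan_id. For example, for the observation with scan_id = s8w_00001.sed, sample should be 8, because that string contains the number 8. I should end up with something like this.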

 scan_id          sample
1  s8w_00001.sed      8
2  s8w_00002.sed      8
3  s9w_00001.sed      9
4 s10w_00001.sed     10
5 s11d_00002.sed     11
6 s12w_00004.sed     12
7 s13w_00001.sed     13
8 s14w_00001.sed     14

Can anyone help?

4 answers:

Answer 0: (score: 1)

You can use gsub for this.
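
As a minimal sketch of the gsub approach (the pattern and the toy data below are assumptions for illustration, not necessarily what this answer used), one could cut off everything from the first underscore onward and then delete the remaining non-digit characters:

```r
# Toy data mirroring the question's scan_id column (hypothetical subset)
df <- data.frame(
  scan_id = c("s8w_00001.sed", "s10w_00001.sed", "s11d_00002.sed"),
  stringsAsFactors = FALSE
)

# Drop everything from the first underscore onward, e.g. "s8w_00001.sed" -> "s8w"
prefix <- sub("_.*", "", df$scan_id)

# Remove every non-digit character and convert the result to an integer
df$sample <- as.integer(gsub("\\D", "", prefix))
df$sample
```

Wrapping the result in as.integer() is optional; without it the column holds character strings such as "8" rather than numbers.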

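Answer 1: (score: 1)

If you only want to extract the first run of digits from the scan_id column, you can use mutate(data, sample = str_extract(scan_id, "[:digit:]+")) from the tidyverse (mutate comes from dplyr, str_extract from stringr). In this case, the first group of digits is extracted.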

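If you want to require a pattern before the digits, use mutate(data, sample = str_extract(scan_id, "(?<=[:alpha:])[:digit:]+")). In this case, the first group of digits that directly follows a letter is extracted. The look-behind checks a single letter because the regex engine behind stringr requires look-behind patterns to have a bounded length.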
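
A self-contained sketch of the first variant, assuming the dplyr and stringr packages are installed; the toy tibble and the as.integer() conversion are additions for illustration and are not part of the answer:

```r
library(dplyr)
library(stringr)

# Toy data mirroring the question's scan_id column (hypothetical subset)
data <- tibble(scan_id = c("s8w_00001.sed", "s10w_00001.sed", "s11d_00002.sed"))

# str_extract() returns the first match as a character string;
# as.integer() turns it into a number
result <- mutate(data, sample = as.integer(str_extract(scan_id, "[:digit:]+")))
result$sample
```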

Answer 2: (score: 0)

One option is to use stri_extract_first_regex from the stringi package

library(stringi)
# Extract the first run of digits; [0-9]+ matches one or more consecutive digits
df$samples <- stri_extract_first_regex(df$scan_id, "[0-9]+")

which gives the output

> df
         scan_id sample samples
1  s8w_00001.sed      1       8
2  s8w_00001.sed      1       8
3  s9w_00001.sed      1       9
4 s10w_00001.sed      1      10
5 s11d_00002.sed      1      11
6 s12w_00004.sed      1      12
7 s13w_00001.sed      1      13
8 s14w_00001.sed      1      14

where df is:

df <- read.table(text = "scan_id       sample
  s8w_00001.sed      1
  s8w_00001.sed      1
  s9w_00001.sed      1
 s10w_00001.sed      1
 s11d_00002.sed      1
 s12w_00004.sed      1
 s13w_00001.sed      1
 s14w_00001.sed      1", header = TRUE)

Answer 3: (score: 0)

You can also do this:

df$sample <- gsub("\\D", "", sapply(strsplit(as.character(df$scan_id), "_"), function(x) x[1]))

         scan_id sample
1  s8w_00001.sed      8
2  s8w_00001.sed      8
3  s9w_00001.sed      9
4 s10w_00001.sed     10
5 s11d_00002.sed     11
6 s12w_00004.sed     12
7 s13w_00001.sed     13
8 s14w_00001.sed     14

Here, scan_id is split on _, and the digits are then extracted from the first element of each split.
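
To make the intermediate values visible, the same split-then-extract idea can be sketched step by step. The variable names and the two sample IDs are illustrative, and vapply stands in for sapply only to pin the return type:

```r
# Two IDs taken from the question's data
scan_id <- c("s8w_00001.sed", "s12w_00004.sed")

# Step 1: split each ID on "_"; the result is a list of character vectors
parts <- strsplit(scan_id, "_")

# Step 2: keep the first piece of each split, e.g. "s8w"
first_piece <- vapply(parts, function(x) x[1], character(1))

# Step 3: delete every non-digit character from that piece
digits <- gsub("\\D", "", first_piece)
digits
```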