
Question

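I have a data frame (df) with multiple rows and a single column. I need to split that one column into multiple columns and then strip the index prefixes 1: through N: from the values.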

head(df[1:3,])

[1] Q1        1     1: 0.009110   2:-0.002122   3:-0.005770   4:-0.016751   5: 0.003284   6:-0.082381              
[2] Q2        1     1: 0.018065   2:-0.033954   3:-0.033954   4: 0.005826   5:-0.033918   6:-0.034069   7:-0.030281   
[3] Q3        1     1: 0.058728   2: 0.003693   3:-0.008006   4: 0.035635   5: 0.039816   6: 0.040578              
20 Levels: Q1        1     1: 0.009110   2:-0.002122   3:-0.005770   4:-0.016751   5: 0.003284   6:-0.082381 ...

df<-read.csv("effect.txt",header = F,skip = 1)
df2 <- lapply(df, gsub, pattern="1:", replacement= "")

Answer 1

This is a roundabout approach, but it works.

#Read the data set
df <- read.table(text = "
                 'Q1        1     1: 0.009110   2:-0.002122   3:-0.005770   4:-0.016751   5: 0.003284   6:-0.082381'              
                 'Q2        1     1: 0.018065   2:-0.033954   3:-0.033954   4: 0.005826   5:-0.033918   6:-0.034069   7:-0.030281'   
                 'Q3        1     1: 0.058728   2: 0.003693   3:-0.008006   4: 0.035635   5: 0.039816   6: 0.040578 '             
                 ",header=F)

library(tidyr)
library(dplyr)  # provides mutate_if() used below
df[,1] <- gsub("[1-9]:",";",df[,1])  # replace any single digit [1-9] followed by ':' with ';'
df[,1] <- gsub("Q[1-9]        1     ;","",df[,1])   # remove the leading prefix "Q<digit>        1     ;", e.g. "Q1        1     ;", "Q2        1     ;", "Q3        1     ;"

max.length <- max(sapply(strsplit(df[,1],";"),length))   # largest number of fields in any row, i.e. the number of columns `separate` has to create
df_clean <- separate(df,1, paste0("a",1:max.length),sep = ";",fill = "right")

df_clean %>% mutate_if(is.character,as.numeric) # convert all character columns to numeric

        a1        a2        a3        a4        a5        a6        a7
1 0.009110 -0.002122 -0.005770 -0.016751  0.003284 -0.082381        NA
2 0.018065 -0.033954 -0.033954  0.005826 -0.033918 -0.034069 -0.030281
3 0.058728  0.003693 -0.008006  0.035635  0.039816  0.040578        NA

Update

gsub("Q\\d{1,3}\\s+\\d{1,2}\\s+;","","Q300      29       ;")
[1] ""

Q\\d{1,3} matches Q followed by a number of one to three digits, e.g. Q1, Q12 or Q123.

\\s+ matches one or more whitespace characters.
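To see how the pieces of the pattern behave, here is a minimal sketch that applies it to a few prefixes. The strings in x are hypothetical examples made up for illustration (only the last one appears in the answer above); the final one, with a four-digit Q number, is included to show where \\d{1,3} stops matching.

```r
# Hypothetical sample prefixes (not from the original data set)
x <- c("Q1        1     ;", "Q12   7 ;", "Q300      29       ;")
pattern <- "Q\\d{1,3}\\s+\\d{1,2}\\s+;"

# Does each prefix match the pattern?
matched <- grepl(pattern, x)

# Remove the matched prefix; nothing is left because each string is only a prefix
stripped <- gsub(pattern, "", x)

# A four-digit Q number falls outside \\d{1,3}, so the string is left unchanged
too_long <- gsub(pattern, "", "Q1234 1 ;")
```

If the Q numbers or the second field can be longer than this, the {1,3} and {1,2} quantifiers would need to be widened accordingly.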

Now you can replace

df[,1] <- gsub("Q[1-9]        1     ;","",df[,1])

with

df[,1] <- gsub("Q\\d{1,3}\\s+\\d{1,2}\\s+;","",df[,1])
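For reference, the whole cleanup can be sketched end to end in base R. This is an illustrative variant, not the answer's tidyr code: it swaps separate() for strsplit() plus NA padding, and it assumes "\\d+:" instead of "[1-9]:" so that indices of 10 and above would also be handled. The input strings are shortened rows built from the question's values, and the "Q12      29" prefix is a hypothetical example.

```r
# Illustrative input: shortened rows; the second prefix is hypothetical
raw <- c(
  "Q1        1     1: 0.009110   2:-0.002122   3:-0.005770",
  "Q12      29     1: 0.018065   2:-0.033954   3:-0.033954   4: 0.005826"
)

# Step 1: turn every "N:" index into ";" (\\d+ is an assumption that also covers N >= 10)
x <- gsub("\\d+:", ";", raw)

# Step 2: drop the leading "Q<number> <number> ;" prefix with the generalised pattern
x <- gsub("Q\\d{1,3}\\s+\\d{1,2}\\s+;", "", x)

# Step 3: split on ";" and pad shorter rows with NA up to the longest row
parts <- strsplit(x, ";")
max.length <- max(lengths(parts))
mat <- t(sapply(parts, function(p) as.numeric(p)[seq_len(max.length)]))

# Step 4: name the columns a1, a2, ... as in the answer above
df_clean <- as.data.frame(mat)
names(df_clean) <- paste0("a", seq_len(max.length))
```

Indexing past the end of a vector yields NA in R, which plays the role of fill = "right" in separate(); as.numeric() ignores the leading and trailing spaces left around each value.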
