Using lapply to recode variables and paste an index number

Time: 2016-03-19 23:06:41

Tags: r

I'm trying to work out how to use recode to recode several variables at once while pasting the last part of each variable name into the recoded string.

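Building on this post, I know I can recode several variables at once:

d2 <- lapply(d1, FUN=function(X) recode(X, "'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'"))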

However, what I need is slightly different. Suppose my data frame has sequentially labelled variables such as var_1, var_2, var_3, and looks like this:

                 var_1               var_2               var_3               var_4
1:
2: Somewhat interested Somewhat interested Somewhat interested      Not interested
3: Somewhat interested Somewhat interested Somewhat interested      Not interested
4:      Not interested Somewhat interested Somewhat interested Somewhat interested

I would like to recode the variables and append the sequential identifier from the column name:

           var_1              var_2              var_3             var_4
1:
2:         Somewhat 1         Somewhat 2         Somewhat 3               No 4
3:         Somewhat 1         Somewhat 2         Somewhat 3               No 4
4:               No 1         Somewhat 2         Somewhat 3         Somewhat 4

Any ideas on how to combine paste and recode?

2 answers:

Answer 0 (score: 1)

You can use the column names themselves with sapply() (instead of lapply()). I had to recreate the data by hand, so this works for my version.

So

d2 <- lapply(d1, FUN=function(X) recode(X, "'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'"))

becomes

d2 <- sapply(colnames(d1), FUN=function(X) recode(d1[,X], "'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'"))

where d1[,X] selects the column the function is applied to.
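To illustrate why iterating over the names helps, here is a minimal sketch on a hypothetical two-column stand-in for d1 (a plain data.frame; the values are made up for illustration). Inside the function, X is the column name, so both the name and the column's contents are available:

```r
# Hypothetical stand-in for the question's data, as a plain data.frame
d1 <- data.frame(
  var_1 = c("Somewhat interested", "Not interested"),
  var_2 = c("Not interested", "Somewhat interested"),
  stringsAsFactors = FALSE
)

# X is the column name; d1[, X] is the column it refers to
res <- sapply(colnames(d1), function(X) {
  paste(X, "has", length(d1[, X]), "values")
})
res
```

If the data are actually a data.table, as the 1:, 2: row labels in the question suggest, d1[[X]] may be the safer way to pull out a column by name, since it works for both data.frame and data.table.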

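Now, to append the column suffix we can use paste0(). The recode string

"'Somewhat interested' ='Somewhat'; 'Not interested' = 'No'"

is replaced by

paste0("'Somewhat interested' ='Somewhat ",X ,"'; 'Not interested' = 'No ", X,"'")

But this still doesn't do quite what you want, because X is the whole column name, so you get the prefix (var_) as well as the suffix.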
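To see the problem concretely, this small sketch builds the recode string for one example column name (X is set by hand here purely for illustration):

```r
X <- "var_1"  # example column name, set by hand for illustration

# Build the recode specification with the column name pasted in
spec <- paste0("'Somewhat interested' ='Somewhat ", X,
               "'; 'Not interested' = 'No ", X, "'")
spec
```

The resulting string maps to 'Somewhat var_1' and 'No var_1', not to 'Somewhat 1' and 'No 1', which is why the prefix still has to be removed.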

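This means we need to strip the prefix, which we can do with substr():

substr(X, 5, nchar(X))

Now all together:

d2 <- sapply(colnames(d1), FUN=function(X) recode(d1[,X], paste0("'Somewhat interested' ='Somewhat ", substr(X, 5, nchar(X)), "'; 'Not interested' = 'No ", substr(X, 5, nchar(X)), "'")))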
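For readers who want to try the idea without installing anything, here is a self-contained sketch in base R. It is not the answer's code: recode() with this string syntax is presumably car::recode, and the sketch swaps it for a named lookup vector. The data frame is a hypothetical reconstruction of the question's data (plain data.frame, empty first row omitted), and the name short is made up for illustration.

```r
# Hypothetical stand-in for the question's data (plain data.frame)
d1 <- data.frame(
  var_1 = c("Somewhat interested", "Somewhat interested", "Not interested"),
  var_2 = c("Somewhat interested", "Somewhat interested", "Somewhat interested"),
  var_3 = c("Somewhat interested", "Somewhat interested", "Somewhat interested"),
  var_4 = c("Not interested", "Not interested", "Somewhat interested"),
  stringsAsFactors = FALSE
)

# Base-R substitute for recode(): a named lookup vector (an assumption,
# not what the answer uses)
short <- c("Somewhat interested" = "Somewhat", "Not interested" = "No")

d2 <- sapply(colnames(d1), function(X) {
  idx <- substr(X, 5, nchar(X))        # "var_1" -> "1"
  paste(unname(short[d1[[X]]]), idx)   # "Somewhat" -> "Somewhat 1"
})
d2
```

The result is a character matrix with one column per variable. Note that substr(X, 5, nchar(X)) relies on every name starting with a four-character prefix such as var_; sub("^var_", "", X) would be an alternative that states the prefix explicitly.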

Answer 1 (score: 0)

You can use a regular expression:

mtx1 <- sapply(seq_along(df), function(x){gsub('interested', x, df[,x])})
mtx1
#      [,1]         [,2]         [,3]         [,4]        
# [1,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "Not 4"     
# [2,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "Not 4"     
# [3,] "Not 1"      "Somewhat 2" "Somewhat 3" "Somewhat 4"

Admittedly, this leaves "Not" instead of "No", but you could use a more complicated regex, or just change it separately:

apply(mtx1, 2, function(x){gsub('Not', 'No', x)})
#      [,1]         [,2]         [,3]         [,4]        
# [1,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "No 4"      
# [2,] "Somewhat 1" "Somewhat 2" "Somewhat 3" "No 4"      
# [3,] "No 1"       "Somewhat 2" "Somewhat 3" "Somewhat 4"

If you need a data.frame instead of a matrix, wrap the result in as.data.frame (or your preferred equivalent).
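Put together, the regex route might look like the following sketch. The data frame df is a hypothetical two-column example, and the names mtx2 and out are made up for illustration. Because sapply() over seq_along() does not carry the column names through, the sketch restores them by hand.

```r
# Hypothetical example data (plain data.frame with character columns)
df <- data.frame(
  var_1 = c("Somewhat interested", "Not interested"),
  var_2 = c("Not interested", "Somewhat interested"),
  stringsAsFactors = FALSE
)

# First pass: swap "interested" for the column position
mtx1 <- sapply(seq_along(df), function(x) gsub("interested", x, df[, x]))

# Second pass: turn a leading "Not" into "No"
mtx2 <- apply(mtx1, 2, function(x) sub("^Not", "No", x))

# Convert back to a data.frame and restore the original column names
out <- as.data.frame(mtx2, stringsAsFactors = FALSE)
names(out) <- names(df)
out
```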

Note that if your data are factors, it would be more efficient to run the same regex on the levels rather than on the actual data.
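A minimal sketch of the factor variant, assuming a single factor column whose index is 1 (the vector f and the hard-coded " 1" are illustrative only). The substitution touches only the distinct levels, however long the vector is:

```r
# Hypothetical factor column; its column index is assumed to be 1
f <- factor(c("Somewhat interested", "Not interested", "Somewhat interested"))

# Rewrite the levels instead of every element
levels(f) <- sub("^Not", "No", sub(" interested$", " 1", levels(f)))
f
```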