Chef client does not resume after reboot

Asked: 2016-04-17 01:58:40

Tags: chef chef-windows

I ran the following recipe:

O:\chef\cookbooks\wincfg>chef-client -L C:\chef\rds_deployment.log -l info -z -o wincfg::rds_deployment

After the Windows feature is installed, the server reboots as expected.

The last few lines of my log file read:

[2016-04-17T01:43:51+00:00] INFO: powershell_script[Desktop-Experience] ran successfully
[2016-04-17T01:43:51+00:00] INFO: powershell_script[Desktop-Experience] sending reboot_now action to reboot[reboot] (immediate)
[2016-04-17T01:43:51+00:00] INFO: Processing reboot[reboot] action reboot_now (wincfg::rds_deployment line 6)
[2016-04-17T01:43:51+00:00] WARN: Rebooting system immediately, requested by 'reboot'
[2016-04-17T01:43:51+00:00] INFO: Changing reboot status from {} to {:delay_mins=>0, :reason=>"There is a pending reboot.", :timestamp=>2016-04-17 01:43:51 +0000, :requested_by=>"reboot"}
[2016-04-17T01:43:51+00:00] WARN: Skipping final node save because override_runlist was given
[2016-04-17T01:43:51+00:00] INFO: Chef Run complete in 90.479509 seconds
[2016-04-17T01:43:51+00:00] INFO: Skipping removal of unused files from the cache
[2016-04-17T01:43:51+00:00] INFO: Running report handlers
[2016-04-17T01:43:51+00:00] INFO: Report handlers complete
[2016-04-17T01:43:51+00:00] WARN: Rebooting server at a recipe's request. Details: {:delay_mins=>0, :reason=>"There is a pending reboot.", :timestamp=>2016-04-17 01:43:51 +0000, :requested_by=>"reboot"}

The relevant part of the recipe is:

reboot "reboot" do
  action :nothing
  reason 'There is a pending reboot.'
  only_if { reboot_pending? }
end

%w{ Desktop-Experience 
  Remote-Desktop-Services 
  RDS-RD-Server 
  RDS-Connection-Broker 
  RDS-Web-Access 
  RDS-Licensing 
  RDS-Gateway }.each do |feature|
  powershell_script "#{feature}" do
    code <<-EOH
    Import-Module ServerManager
    Add-WindowsFeature #{feature}
    EOH
    not_if "Import-Module ServerManager; (Get-WindowsFeature -Name #{feature}).Installed -eq $true"
    notifies :reboot_now, 'reboot[reboot]', :immediately
  end
end

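I expect each feature in the recipe to be installed with Add-WindowsFeature (if it is not already installed), followed by an immediate reboot if reboot_pending? is true.

The reboot does happen, but afterwards the recipe does not continue with the next feature (the one after Desktop-Experience).

Update: Here is how I installed Chef (on a fresh, out-of-the-box EC2 image running Server 2012 R2 Base), the Chef Windows service, and the Chef DK: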

powershell -NoProfile -ExecutionPolicy Bypass ". { iwr -useb https://omnitruck.chef.io/install.ps1 } | iex; install; cd C:\opscode\chef\bin\; cmd /c chef-service-manager -a install; cmd /c chef-service-manager -a start"

powershell -NoProfile -ExecutionPolicy Bypass ". { iwr -useb https://omnitruck.chef.io/install.ps1 } | iex; install -project chefdk"

Immediately after installing, I ran:

net use O: \\fileserver\share
O:
cd chef\cookbooks\wincfg
berks vendor ..\..\cookbooks
chef-client -L C:\chef\rds_deployment.log -l info -z -o wincfg::rds_deployment

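Update 2:

I saw

[2016-04-17T01:43:51+00:00] WARN: Skipping final node save because override_runlist was given

in the log, so instead of specifying the run list with -o, I now specify it with -r. The warning no longer appears in the log (and I see a TON more information in nodes\thehost.json)... but the run still does not resume after the reboot :(

After the reboot, I see the following in the Application Event Viewer: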

Failed Chef Client run UNKNOWN in UNKNOWN seconds.
 Exception type: Chef::Exceptions::PrivateKeyMissing
 Exception message: I cannot read C:\chef\validation.pem, which you told me to use to sign requests!
 Exception backtrace: C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/http/authenticator.rb:86:in `rescue in load_signing_key'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/http/authenticator.rb:76:in `load_signing_key'

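I do enjoy a good adventure through (a lack of) documentation.

I almost have it working, by:

  • Making sure the chef_repo path is always available (not a network drive)
  • Creating a client.rb file in C:\chef\ that tells chef-client to always run in local (chef-zero) mode, not just when I invoke it manually from the command line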

So, my new artifacts look like this:

C:\chef\client.rb:

log_level :info
log_location 'C:\chef\client.log'
chef_server_url 'https://localhost:4000'
validation_client_name 'chef-validator'
chef_zero.enabled true
chef_zero.port 4000
local_mode true
cookbook_path ['C:\chef_repo\cookbooks']

\\ops01\OPS\chef\bootstrap.bat:

mklink C:\chef_repo %~dp0 /d
powershell -NoProfile -ExecutionPolicy Bypass ". { iwr -useb https://omnitruck.chef.io/install.ps1 } | iex; install"
C:
cd \opscode\chef\bin\
copy %~dp0client.rb C:\chef\ /y
call chef-service-manager -a install
call chef-service-manager -a start

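The key parts are bootstrapping client.rb and making sure the link is always available, because client.rb does not support UNC/SMB paths.

The chef-client Windows service now seems to run automatically and correctly after a reboot... but when it does, it does not trigger the reboot itself. Instead it logs: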

[2016-04-18T02:38:24+00:00] INFO: Changing reboot status from {} to {:delay_mins=>0, :reason=>"There is a pending reboot for \#{pack}.", :timestamp=>2016-04-18 02:38:24 +0000, :requested_by=>"googlechrome_reboot"}
[2016-04-18T02:38:24+00:00] INFO: HTTP Request Returned 500 Internal Server Error: error
[2016-04-18T02:38:24+00:00] ERROR: Running exception handlers
[2016-04-18T02:38:24+00:00] ERROR: Exception handlers complete
[2016-04-18T02:38:24+00:00] FATAL: Stacktrace dumped to c:/chef/local-mode-cache/cache/chef-stacktrace.out
[2016-04-18T02:38:24+00:00] FATAL: Please provide the contents of the stacktrace.out file if you file a bug report
[2016-04-18T02:38:24+00:00] FATAL: Net::HTTPFatalError: 500 "Internal Server Error"
[2016-04-18T02:38:37+00:00] INFO: Child process exited (pid: 692)
[2016-04-18T02:38:38+00:00] INFO: Next chef-client run will happen in 1800.8035677517687 seconds

So... it looks like the chef-zero server is returning an HTTP 500 error. The Event Viewer Application log shows:

Failed Chef Client run af972109-32ca-4089-97ef-789b7b5d8d07 in 133.762612 seconds.
 Exception type: Net::HTTPFatalError
 Exception message: 500 "Internal Server Error"
 Exception backtrace: C:/opscode/chef/embedded/lib/ruby/2.1.0/net/http/response.rb:119:in `error!'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/http.rb:146:in `request'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/http.rb:119:in `put'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/node.rb:620:in `save'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/client.rb:542:in `save_updated_node'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/client.rb:704:in `converge_and_save'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/client.rb:281:in `run'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/application.rb:267:in `run_with_graceful_exit_option'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/application.rb:243:in `block in run_chef_client'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/local_mode.rb:44:in `with_server_connectivity'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/application.rb:226:in `run_chef_client'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/application/client.rb:419:in `run_application'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/lib/chef/application.rb:58:in `run'
C:/opscode/chef/embedded/lib/ruby/gems/2.1.0/gems/chef-12.9.38-universal-mingw32/bin/chef-client:26:in `<top (required)>'
C:/opscode/chef/bin/chef-client:61:in `load'
C:/opscode/chef/bin/chef-client:61:in `<main>'

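This doesn't really tell me anything...

But if I go to the command line and just run chef-client (from any directory, with no arguments), it immediately recognizes that a reboot is needed and performs it.

Any ideas on how to resolve this? Much appreciated.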

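1 Answer:

Answer 0 (score: 1)

Unless you set Chef up to run as a service, or via something like a scheduled task, it cannot re-run itself after a reboot. Also, Chef doesn't "pick up where it left off" as such; rather, it is generally idempotent and only changes the things that need changing. The not_if guard on each resource is the idempotence check for that thing. Is there a reason you aren't using the windows_feature resource?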
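To illustrate the point about idempotence, here is a minimal plain-Ruby sketch of why a guarded run list converges across reboots without any notion of "resuming". It is a toy model, not Chef's API: converge, run_until_converged, RebootRequested, and the shortened feature list are all hypothetical names made up for illustration, and raising an exception merely stands in for the machine going down.

```ruby
# Toy model of a guarded, idempotent run. Nothing here is Chef's real API.
FEATURES = %w[Desktop-Experience Remote-Desktop-Services RDS-RD-Server].freeze

# Raised to model the machine rebooting in the middle of a run.
class RebootRequested < StandardError; end

# One converge pass: walk the full list from the top every time,
# skipping anything already installed.
def converge(installed, features)
  features.each do |feature|
    next if installed.include?(feature) # plays the role of the not_if guard
    installed << feature                # plays the role of Add-WindowsFeature
    raise RebootRequested, feature      # the rest of this run is abandoned
  end
  :converged
end

# Plays the role of the service or scheduled task that starts a fresh run
# after every boot. Returns the number of runs it took to converge.
def run_until_converged(installed, features)
  runs = 0
  loop do
    runs += 1
    begin
      return runs if converge(installed, features) == :converged
    rescue RebootRequested
      # The machine "rebooted"; the next loop iteration is the next run.
    end
  end
end

installed = []
runs = run_until_converged(installed, FEATURES)
puts "converged after #{runs} runs: #{installed.join(', ')}"
```

Each run starts from the first feature, and the guard skips whatever is already in place, so the only thing that has to survive a reboot is the mechanism that starts the next run.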