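I have a very simple DataFrame. It has 2 columns: day_created (int, could be changed to datetime) and suspended (int, could be changed to boolean). I can change the data if that makes it easier to work with.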
   Day created  Suspended
0           12          0
1            6          1
2           24          0
3            8          0
4          100          1
5           30          0
6            1          1
7            6          0
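The day_created column is the integer day on which the account was created, counted from a start date, starting at 1 and increasing. The suspended column is 1 for a suspended account and 0 for an account that is not suspended.
What I'd like to do is bin these accounts into groups of 30 days, or months, and for each bin get the total number of accounts created in that period and the number of those accounts that were suspended. I then plan to create a bar chart with 2 bars for each month.
How should I go about this? I don't use pandas often. I assume I need some tricks with resampling and counting.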
Answer 0 (score: 1)
Use
    df.index = start_date + pd.to_timedelta(df['Day created'], unit='D')
to give the DataFrame an index of timestamps representing when each account was created.
Then you can use
    result = df.groupby(pd.TimeGrouper(freq='M')).agg(['count', 'sum'])
to group the rows of the DataFrame by month, according to the timestamps in the index. (pd.TimeGrouper has since been removed from pandas; pd.Grouper accepts the same freq argument.)
    .agg(['count', 'sum'])
computes the number of accounts (the count) and the number of suspended accounts (the sum) for each group.
Then
    result.plot(kind='bar')
plots the bar chart.
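As a minimal end-to-end sketch of the steps above, using the sample data from the question: start_date is a hypothetical value (the question does not give one), pd.Grouper(freq='MS') stands in for pd.TimeGrouper, and only the Suspended column is aggregated so that each group yields exactly two numbers. The 30-day variant and the plot_counts helper are illustrative additions, not part of the original answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Day created': [12, 6, 24, 8, 100, 30, 1, 6],
    'Suspended': [0, 1, 0, 0, 1, 0, 1, 0],
})

# Fixed 30-day bins: days 1-30 -> bin 0, days 31-60 -> bin 1, and so on.
df['bin'] = (df['Day created'] - 1) // 30

# Assumption: day 1 corresponds to 2016-01-01 (hypothetical start date).
start_date = pd.Timestamp('2015-12-31')
df.index = pd.DatetimeIndex(
    start_date + pd.to_timedelta(df['Day created'], unit='D'),
    name='created_at',
)

# Calendar months: one row per month, labelled by the month's first day.
# 'count' is the number of accounts, 'sum' the number of suspended ones.
monthly = df.groupby(pd.Grouper(freq='MS'))['Suspended'].agg(['count', 'sum'])
monthly.columns = ['created', 'suspended']

# Same aggregation on the 30-day bins; reindex so empty bins show up as 0.
by_30_days = df.groupby('bin')['Suspended'].agg(['count', 'sum'])
by_30_days = by_30_days.reindex(range(by_30_days.index.max() + 1), fill_value=0)
by_30_days.columns = ['created', 'suspended']


def plot_counts(table):
    """Draw two bars per group (requires matplotlib to be installed)."""
    ax = table.plot(kind='bar', rot=0)
    ax.set_ylabel('accounts')
    return ax
```

With this start date the sample accounts fall in January and April 2016, so both tables have four rows, two of them empty. Calling plot_counts(monthly) or plot_counts(by_30_days) draws the two bars per group that the question asks for.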