我有这个数据集代表有关火星陨石坑的信息。我只对纬度,经度和可见图层列数感兴趣。我试图将纬度分为10度组:-90到-80,-80到-70等,同时获取图层列值(0,1,2,3,4,5)并将它们转换为列本身获取每个10度组的每个图层列值的value_counts表。
把头发拉出来,我似乎尝试了一切我能理解的东西。
答案 0 :(得分:3)
这对你有用吗?
import pandas as pd
import random
# generate random data
N = 100
longitudes = [random.randint(-20, 89) for _ in xrange(N)]
layers = [random.randint(0, 5) for _ in xrange(N)]
data = pd.DataFrame({'LONGITUDE_CIRCLE_IMAGE': longitudes, 'NUMBER_LAYERS': layers})
def get_degree_group(longitude, mn=-20, mx=90, delta_deg=10):
"""calculate the layer from the given longitude"""
return (longitude - mn) / delta_deg
def make_table(df):
# make a new column by calculating the degree group from longitude column
df['degree_group'] = df.LONGITUDE_CIRCLE_IMAGE.apply(get_degree_group)
# count the number of craters with properties (deg_grp, num_lyr)
s = df.groupby(['degree_group', 'NUMBER_LAYERS']).size()
# s is a pandas Series where the index is in the form: (deg_grp, num_lyr)
# and the values are the counts of crates in that category
#
# We want to convert the series into a table where num_lyr values are columns
# This task is done with unstack method
table = s.unstack('NUMBER_LAYERS')
# there are some (deg_grp, num_lyr) cases for which there are no existing craters
# Pandas put NaN for those cases. It might be better to put 0 into those cells
table.fillna(0, inplace = True)
return table
make_table(data)
答案 1 :(得分:1)
使用pd.cut
制作小组,pivot_table
计算。
数据样本:
lat=rand(3000)*180-90
layers=randint(0,6,3000)
data=pd.DataFrame({'lat':lat,'layers':layers})
组成18个小组:
data['groups'] = pd.cut(lat,linspace(-90,90,19))
还有一张桌子:
data.pivot_table(index='groups',columns='layers',aggfunc='count',fill_value=0)
lat
layers 0 1 2 3 4 5
groups
(-90, -80] 4 1 2 1 1 0
(-80, -70] 1 0 0 2 2 3
(-70, -60] 4 3 2 4 3 4
(-60, -50] 6 2 1 1 2 3
(-50, -40] 2 3 4 2 2 4
(-40, -30] 4 3 4 2 4 4
(-30, -20] 2 5 2 2 3 2
(-20, -10] 4 2 6 3 5 2
(-10, 0] 3 4 2 3 2 1
(0, 10] 5 3 4 3 4 7
(10, 20] 3 3 2 2 2 3
(20, 30] 2 1 1 4 3 5
(30, 40] 1 2 0 2 2 3
(40, 50] 1 3 3 2 3 4
(50, 60] 6 0 2 4 1 6
(60, 70] 3 3 2 5 1 5
(70, 80] 1 4 5 3 2 2
(80, 90] 1 7 3 2 4 2