Question

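I have been looking for a reference on how to create an additional column that categorizes the values of another column. I have already tried the pandas categorical documentation, and Stack Overflow does not seem to cover this, although I think it must, so maybe I am using the wrong search tags?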

For example:

Size    Size_cat
10      0-50
50      0-50
150     50-500
450     50-500
5000    1000-9000
10000   >9000

Notice that the size category 500-1000 is missing (no number falls into it).

The problem is that I later create a pandas crosstab like this:

summary_table = pd.crosstab(index=[res_sum["Type"], res_sum["Size"]], columns=[res_sum["Found"]], margins=True)
summary_table = summary_table.div(summary_table["All"] / 100, axis=0)

After some editing of this table I get the following result:

Found                     Exact       Near         No
Type        Size
DEL         50               80         20          0
            100              60         40          0
            500              80         20          0
            1000             60         40          0
            5000             40         60          0
            10000            20         80          0
DEL_Total             56.666667  43.333333          0
DUP         50                0          0        100
            100               0          0        100
            500               0        100          0
            1000              0        100          0
            5000              0        100          0
            10000            20         80          0
DUP_Total              3.333333  63.333333  33.333333

The problem is that Size now simply lists the raw sizes, so the table can vary in size. If 5000-DEL is missing from the data, that entry also disappears, and then DUP has 6 categories while DEL has 5. In addition, if I add more sizes the table becomes very large. So I want to bin the sizes into categories and always keep the same categories, even if some of them are empty.

I hope this is clear, because it is hard to explain. This is what I have tried so far:

import math

# Round the largest size up to the next multiple of 100, then split into 10 equal steps
highest_size = res['Size'].max()
categories = int(math.ceil(highest_size / 100.0) * 100.0)
categories = int(categories / 10)

labels = ["{0} - {1}".format(i, i + categories) for i in range(0, highest_size, categories)]
print(highest_size)  # 10000
print(categories)    # 1000
print(labels)
# ['0 - 1000', '1000 - 2000', '2000 - 3000', '3000 - 4000', '4000 - 5000', '5000 - 6000', '6000 - 7000', '7000 - 8000', '8000 - 9000', '9000 - 10000']

I get the numeric categories, but of course they now depend on the highest number, so the categories change with the data. I also still need to link them to the 'Size' column in pandas, and that part does not work.

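If possible, I would like to define my own categories, as in the first example above, instead of using range to get equal steps. (Otherwise it takes too long to reach 10000 in steps of 100, and steps of 1000 lose a lot of detail in the smaller sizes.)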

Answer 1

See the mock-up below to help you follow the logic. Basically, you bin Score into custom groups by passing each value to a mapping function with map (cut or even a lambda would also work). Let me know if it works.
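A minimal sketch of that idea, applied to the Size column from the question rather than Score. The helper name `group_mapping`, the DataFrame `res`, and the bin edges are assumptions chosen to reproduce the question's first example; they are not taken from the answer's original mock-up.

```python
import pandas as pd


def group_mapping(size):
    """Return a hand-picked size label (hypothetical helper, edges assumed)."""
    if size <= 50:
        return "0-50"
    if size <= 500:
        return "50-500"
    if size <= 1000:
        return "500-1000"
    if size <= 9000:
        return "1000-9000"
    return ">9000"


# Sample data taken from the question's first example
res = pd.DataFrame({"Size": [10, 50, 150, 450, 5000, 10000]})

# Series.map calls group_mapping once per value and collects the labels
res["Size_cat"] = res["Size"].map(group_mapping)
print(res)
```

Because the edges are written out by hand, the labels no longer depend on the largest value in the data.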

GroupMapping
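Since the answer also mentions cut, here is a hedged sketch using `pd.cut` with hand-picked edges. The edges, labels, and sample rows are assumptions for illustration. `pd.cut` returns a categorical column, and `pd.crosstab` with `dropna=False` keeps categories that have no rows, which is one way to get a table of constant shape.

```python
import numpy as np
import pandas as pd

# Hand-picked bin edges and labels (assumed, matching the question's example)
edges = [0, 50, 500, 1000, 9000, np.inf]
labels = ["0-50", "50-500", "500-1000", "1000-9000", ">9000"]

# Small illustrative data set using the question's column names
res_sum = pd.DataFrame({
    "Type": ["DEL", "DEL", "DEL", "DUP", "DUP", "DUP"],
    "Size": [10, 150, 5000, 50, 450, 10000],
    "Found": ["Exact", "Near", "Exact", "No", "Near", "Near"],
})

# Intervals are right-closed by default, so 50 falls into "0-50"
res_sum["Size_cat"] = pd.cut(res_sum["Size"], bins=edges, labels=labels)

# dropna=False keeps every Type / Size_cat combination, even with zero rows
counts = pd.crosstab(
    index=[res_sum["Type"], res_sum["Size_cat"]],
    columns=res_sum["Found"],
    dropna=False,
)
print(counts)
```

Rows with no observations show zero counts, so dividing by the row total to get percentages would yield NaN for those rows and may need a `fillna(0)`.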

Categorizing numerical data in pandas
