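Groupby with a greater-than comparison is very slow in pandas

Date: 2018-10-06 07:39:36

Tags: python pandas

I have the table below as a pandas DataFrame. For each part, I need to count the rows per Part Number where Net Sales is greater than Recommended Price.

Input: above_master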

  Short Number  Net Sales    Part Number   Recommended Price
0       MU2146     413.25      MU2146      385.949155
1       MU2146     433.12      MU2146      385.949155
2       MU2146     498.12      MU2146      385.949155
3       MU1609     146.07      MU1609      149.138978
4       MU1609     246.17      MU1609      149.138978

Required output

Part Number count
MU2146       3 
MU1609       1

Code used

for number in range(len(above_master.index)):
    cal_s1 = above_master[above_master['Net Sales'] > above_master.iloc[number]['Recommended Price'] ].groupby('Part Number')['Recommended Price'].count()
    cal_s2 = cal_s1.to_frame().reset_index()
    cal_s3 = cal_s2.loc[cal_s2['Part Number'] == above_master.iloc[number]['Part Number']]
    cal_s4 = cal_s4.append(cal_s3, ignore_index=True)

This works, but it takes a very long time.

3 answers:

Answer 0 (score: 4)

Use loc together with size:

df.loc[df['Recommended Price'].lt(df['Net Sales'])].groupby('Part Number').size()

Part Number
MU1609    1
MU2146    3
dtype: int64
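
For anyone who wants to run this answer end to end, here is a minimal sketch. It rebuilds the sample table from the question under the name `df` (the answer's name for the question's `above_master`) and adds an optional `reset_index` step, which is not part of the original answer, to get the two-column layout the question asks for.

```python
import pandas as pd

# Sample data copied from the question; `df` stands in for `above_master`.
df = pd.DataFrame({
    "Short Number": ["MU2146", "MU2146", "MU2146", "MU1609", "MU1609"],
    "Net Sales": [413.25, 433.12, 498.12, 146.07, 246.17],
    "Part Number": ["MU2146", "MU2146", "MU2146", "MU1609", "MU1609"],
    "Recommended Price": [385.949155] * 3 + [149.138978] * 2,
})

# One vectorized comparison over the whole frame replaces the per-row loop.
mask = df["Recommended Price"].lt(df["Net Sales"])

# Keep the qualifying rows, then count them per part.
counts = df.loc[mask].groupby("Part Number").size()

# Optional (not in the original answer): turn the Series into a
# two-column DataFrame with a "count" column.
result = counts.reset_index(name="count")
print(result)
```

Because the comparison runs once over the whole column instead of once per row, the work no longer grows quadratically with the number of rows.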

Answer 1 (score: 2)

First do a logical comparison with gt (greater than) and assign the result to a count column, then group by Part Number with as_index=False and take the sum of count:

df['count'] = df['Net Sales'].gt(df['Recommended Price'])
df.groupby(['Part Number'], as_index=False)['count'].sum()

  Part Number  count
0      MU1609    1.0
1      MU2146    3.0
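
A self-contained sketch of the boolean-sum approach follows. Two things in it are assumptions for illustration and not part of the original answer: the extra row for part `MU9999` is hypothetical, added only to show that a part with no qualifying rows is reported with a count of 0 instead of being dropped, and the final `astype(int)` is there because the summed column may come back as float in some pandas versions.

```python
import pandas as pd

# Sample data from the question, plus one HYPOTHETICAL row (MU9999)
# whose Net Sales is below its Recommended Price.
df = pd.DataFrame({
    "Net Sales": [413.25, 433.12, 498.12, 146.07, 246.17, 10.0],
    "Part Number": ["MU2146", "MU2146", "MU2146", "MU1609", "MU1609", "MU9999"],
    "Recommended Price": [385.949155] * 3 + [149.138978] * 2 + [20.0],
})

# Boolean Series: True where the sale is above the recommended price.
above = df["Net Sales"].gt(df["Recommended Price"])

# Summing booleans counts the True values within each group.
# assign() returns a copy, so the original frame is left unchanged.
result = (
    df.assign(count=above)
      .groupby("Part Number", as_index=False)["count"]
      .sum()
)

# Make sure the counts are integers regardless of pandas version.
result["count"] = result["count"].astype(int)
print(result)
```

Compared with filtering first, this variant keeps every part in the result, which may or may not be what you want.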

Answer 2 (score: 0)

Here is another way, using the pandas Series method value_counts:

df['Part Number'][df['Recommended Price'] < df['Net Sales']].value_counts()
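
As a runnable sketch of the same idea, the snippet below rebuilds the question's sample data as `df` and selects the column with a single `.loc` call. That is an equivalent spelling of the indexing above, chosen here as a stylistic assumption. The conversion to a two-column DataFrame at the end is also an addition for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Net Sales": [413.25, 433.12, 498.12, 146.07, 246.17],
    "Part Number": ["MU2146", "MU2146", "MU2146", "MU1609", "MU1609"],
    "Recommended Price": [385.949155] * 3 + [149.138978] * 2,
})

# Select Part Number for the qualifying rows, then count occurrences.
mask = df["Recommended Price"] < df["Net Sales"]
counts = df.loc[mask, "Part Number"].value_counts()

# value_counts orders by count, largest first, not by part number.
# Optional: reshape into a DataFrame with explicit column names.
result = counts.rename_axis("Part Number").reset_index(name="count")
print(result)
```

Note that the result is ordered by count, so call `sort_index()` on it if you need the parts in alphabetical order.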