How do I create a new column by dividing two columns in a Graphlab SFrame?

Time: 2015-11-18 23:53:30

Tags: python lambda dataframe calculated-columns graphlab

Given a Graphlab SFrame:

+-------+------------+---------+-----------+
| Store |    Date    |  Sales  | Customers |
+-------+------------+---------+-----------+
|   1   | 2015-07-31 |  5263.0 |   555.0   |
|   2   | 2015-07-31 |  6064.0 |   625.0   |
|   3   | 2015-07-31 |  8314.0 |   821.0   |
|   4   | 2015-07-31 | 13995.0 |   1498.0  |
|   3   | 2015-07-20 |  4822.0 |   559.0   |
|   2   | 2015-07-10 |  5651.0 |   589.0   |
|   4   | 2015-07-11 | 15344.0 |   1414.0  |
|   5   | 2015-07-23 |  8492.0 |   833.0   |
|   2   | 2015-07-19 |  8565.0 |   687.0   |
|   10  | 2015-07-09 |  7185.0 |   681.0   |
+-------+------------+---------+-----------+
[986159 rows x 4 columns]

How can I add a "sales per customer" column by dividing Sales by Customers for each row?

I tried the following, but it doesn't work (sf is my SFrame):

sf['salespercustomer'] = sf.apply(lambda x: sf['Sales']/sf['Customers'])

Interestingly, this gives me an SArray as output:

sf['Sales'] / sf['Customers']

But that doesn't really help with adding the column back to sf, so this doesn't work either =( :

sf['salescustomer'] = sf['Sales'] / sf['Customers']

2 answers:

Answer 0: (score: 1)

The last line of your code does the trick, but you said your SFrame is called sf, not train. When I try it with sf, it works fine.

Answer 1: (score: 1)

This is how I did it:

sf['salespercustomer'] = sf.apply(lambda x: x['Sales'] / x['Customers'])

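FWIW, your example hands sf to the apply lambda as the argument x (one row at a time), but the lambda body then uses sf instead. My understanding is that sf isn't known inside the lambda function, but its alias x is.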
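To make the point about x concrete, here is a minimal plain-Python sketch of a row-wise apply. It does not use GraphLab at all: rows and apply_rows are hypothetical stand-ins, and it only assumes that each row reaches the lambda as a dict keyed by column name, which is how SFrame.apply hands rows to the function.

```python
# Plain-Python stand-in for a row-wise apply (not the GraphLab API).
# `rows` and `apply_rows` are hypothetical names used only for illustration.
rows = [
    {"Store": 1, "Sales": 5263.0, "Customers": 555.0},
    {"Store": 2, "Sales": 6064.0, "Customers": 625.0},
]


def apply_rows(table, fn):
    """Call fn once per row, passing that row as a dict."""
    return [fn(row) for row in table]


# The lambda works on its argument x (a single row), not on the whole table.
sales_per_customer = apply_rows(rows, lambda x: x["Sales"] / x["Customers"])
```

The lambda only ever sees one row at a time, so indexing x by column name yields two scalars and the division produces one number per row.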

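FWIW, you can also restrict apply to just the columns you need, like this:

sf['salespercustomer'] = sf[['Sales', 'Customers']].apply(lambda row: row['Sales'] / row['Customers'])

If you select only one column instead (e.g. sf['Sales']), apply receives each value directly, so there is no need to specify the column inside the lambda function.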