pandas - include all columns and row-pair values

Time: 2018-02-02 22:35:38

Tags: python pandas dataframe pivot-table

I have a dataset with a lot of missing data. Sample data file:

a,b,c,w
a1,,,
a2,b1,c1,
a2,b1,c2,
a2,,,
a3,b2,c3,
a4,,,
a5,b1,c1,100
a6,b2,c4,
a7,b1,c2,214.285714285714
a7,b1,c2,245.454545454545
a7,b1,c2,292.105263157895
a7,b1,c2,
a8,b1,c2,
a9,b2,c3,
,b3,,
,,c4,
,,c5,

I am trying to create a pivot table that looks like this:

         w
      mean
a       a1  a2  a3  a4     a5  a6          a7  a8  a9
b  c
       NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
b1 c1  NaN NaN NaN NaN  100.0 NaN         NaN NaN NaN
b1 c2  NaN NaN NaN NaN    NaN NaN  250.615174 NaN NaN
b2 c3  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
b2 c4  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
b3     NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
   c4  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
   c5  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN

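I don't care whether the blanks end up at the top or at the bottom. What matters is that every a value appears as a column, and that the rows show only the b, c pairs that actually exist in the data.

The following code: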

import pandas as pd

dataframe = pd.read_csv('test/data/sparse.csv')
pd.set_option('display.width', 1000)
print(dataframe)
col_names = ['a']
row_names = ['b', 'c']
value_names = ['w']
aggregates = {'w': ['mean']}

pivot = pd.pivot_table(
    dataframe,
    index=row_names,
    columns=col_names,
    values=value_names,
    aggfunc=aggregates
)

creates a pivot table like:

           w
        mean
a         a5          a7
b  c
b1 c1  100.0         NaN
   c2    NaN  250.615174
b2 c3    NaN         NaN
   c4    NaN         NaN

If I set all None values to blanks, via:

for c in dataframe:
    if str(dataframe[c].dtype) in ('object', 'string_', 'unicode_'):
        # Assign the result back; an in-place fillna on a selected column may not update the frame.
        dataframe[c] = dataframe[c].fillna(value='')

then I get

           w            
        mean            
a         a5          a7
b  c                    
         NaN         NaN
   c4    NaN         NaN
   c5    NaN         NaN
b1 c1  100.0         NaN
   c2    NaN  250.615174
b2 c3    NaN         NaN
   c4    NaN         NaN
b3       NaN         NaN

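That gets me my rows but not my columns. If I add dropna=False to the pivot_table call, then I get all the columns, but I also get row pairs that do not exist in the original dataset.

Any suggestions?

Thanks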
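To make the dropna=False behaviour concrete, here is a minimal sketch on a four-row subset of the sample data. The subset and the names csv_text, existing, and extra are assumptions chosen for illustration. With dropna=False, the row index appears to be expanded to every combination of the observed b and c values, which is where the non-existent pairs come from.

```python
import io

import pandas as pd

# Hypothetical four-row subset of the sample file, inlined so the sketch runs on its own.
csv_text = """a,b,c,w
a5,b1,c1,100
a6,b2,c4,
a7,b1,c2,214.285714285714
a9,b2,c3,
"""
df = pd.read_csv(io.StringIO(csv_text))

# dropna=False keeps the all-NaN columns, but also expands the rows.
table = df.pivot_table(index=['b', 'c'], columns='a', values='w',
                       aggfunc='mean', dropna=False)

# (b, c) pairs that actually occur in the data.
existing = set(map(tuple, df[['b', 'c']].drop_duplicates().to_numpy()))

# Row labels that are in the pivot table but never occur in the data.
extra = [pair for pair in table.index if pair not in existing]

print(table)
print(extra)
```

Comparing table.index against existing is one way to see exactly which rows would need to be removed again.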

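2 answers:

Answer 0: (score: 2)

If you are fine with nan instead of blanks, then groupby + unstack works here. First, convert columns a, b, and c to strings using astype(str). This causes groupby to no longer ignore NaNs when grouping the data.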

cols = ['a', 'b', 'c']
df[cols] = df[cols].astype(str)

df.groupby(cols)\
  .w.mean()\
  .unstack(0)\
  .drop('nan', axis=1)  # drop the column for rows where a was missing

a        a1  a2  a3  a4     a5  a6          a7  a8  a9
b   c                                                 
b1  c1  NaN NaN NaN NaN  100.0 NaN         NaN NaN NaN
    c2  NaN NaN NaN NaN    NaN NaN  250.615174 NaN NaN
b2  c3  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
    c4  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
b3  nan NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
nan c4  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
    c5  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
    nan NaN NaN NaN NaN    NaN NaN         NaN NaN NaN

Answer 1: (score: 1)

One way to reach your target output is to collect all the unique b-c pairs as tuples:

tups = df[['b', 'c']].drop_duplicates().apply(tuple, axis=1)
# 0     (nan, nan)
# 1       (b1, c1)
# 2       (b1, c2)
# 4       (b2, c3)
# 7       (b2, c4)
# 14     (b3, nan)
# 15     (nan, c4)
# 16     (nan, c5)

...then call .pivot_table with dropna=False, and immediately reindex with your b-c tuples:
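A minimal sketch of the pivot-then-reindex idea, not necessarily the answerer's exact code. It assumes the missing a, b, and c keys are first filled with '' (as in the question) so that no row is dropped while grouping, and it uses dropna=False because the question reports that this setting keeps every a column. The names csv_text, filled, wanted_rows, and wanted_cols are hypothetical.

```python
import io

import pandas as pd

# The sample file from the question, inlined so the sketch is self-contained.
csv_text = """a,b,c,w
a1,,,
a2,b1,c1,
a2,b1,c2,
a2,,,
a3,b2,c3,
a4,,,
a5,b1,c1,100
a6,b2,c4,
a7,b1,c2,214.285714285714
a7,b1,c2,245.454545454545
a7,b1,c2,292.105263157895
a7,b1,c2,
a8,b1,c2,
a9,b2,c3,
,b3,,
,,c4,
,,c5,
"""
df = pd.read_csv(io.StringIO(csv_text))

# Assumption: blank out missing keys so they survive the grouping step.
keys = ['a', 'b', 'c']
filled = df.copy()
filled[keys] = filled[keys].fillna('')

# Only the (b, c) pairs that occur in the data, in order of first appearance.
wanted_rows = pd.MultiIndex.from_frame(filled[['b', 'c']].drop_duplicates())

# Every non-blank a value becomes a column.
wanted_cols = sorted(v for v in filled['a'].unique() if v != '')

result = (
    filled.pivot_table(index=['b', 'c'], columns='a', values='w',
                       aggfunc='mean', dropna=False)
          # Discard the row pairs that dropna=False introduced.
          .reindex(index=wanted_rows, columns=wanted_cols)
)
print(result)
```

Reindexing on both axes keeps the selection explicit: the rows come from the data's own b-c pairs, and the columns from its a values.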