I have a dataset with a lot of missing data. Sample data file:
a,b,c,w
a1,,,
a2,b1,c1,
a2,b1,c2,
a2,,,
a3,b2,c3,
a4,,,
a5,b1,c1,100
a6,b2,c4,
a7,b1,c2,214.285714285714
a7,b1,c2,245.454545454545
a7,b1,c2,292.105263157895
a7,b1,c2,
a8,b1,c2,
a9,b2,c3,
,b3,,
,,c4,
,,c5,
I'm trying to create a pivot table that looks like this:
          w
       mean
a        a1  a2  a3  a4     a5  a6          a7  a8  a9
b  c
        NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
b1 c1   NaN NaN NaN NaN  100.0 NaN         NaN NaN NaN
b1 c2   NaN NaN NaN NaN    NaN NaN  250.615174 NaN NaN
b2 c3   NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
b2 c4   NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
b3      NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
   c4   NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
   c5   NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
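I don't care whether the blanks end up at the top or the bottom. What matters is that every a value appears as a column and that, for the rows, only (b, c) pairs that actually exist in the data are shown.
The following code: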
import pandas as pd

dataframe = pd.read_csv('test/data/sparse.csv')
pd.set_option('display.width', 1000)
print(dataframe)

col_names = ['a']
row_names = ['b', 'c']
value_names = ['w']
aggregates = {'w': ['mean']}

pivot = pd.pivot_table(
    dataframe,
    index=row_names,
    columns=col_names,
    values=value_names,
    aggfunc=aggregates
)
creates a pivot table like:
           w
        mean
a         a5          a7
b  c
b1 c1  100.0         NaN
   c2    NaN  250.615174
b2 c3    NaN         NaN
   c4    NaN         NaN
If I set all the None values to blanks with:
for c in dataframe:
    # Only touch text columns; the numeric column w keeps its NaNs.
    if str(dataframe[c].dtype) in ('object', 'string_', 'unicode_'):
        dataframe[c] = dataframe[c].fillna('')
then I get:
           w
        mean
a         a5          a7
b  c
         NaN         NaN
   c4    NaN         NaN
   c5    NaN         NaN
b1 c1  100.0         NaN
   c2    NaN  250.615174
b2 c3    NaN         NaN
   c4    NaN         NaN
b3       NaN         NaN
That gets me my rows but not my columns. If I add dropna=False to the pivot_table call, I get all the columns, but I also get row pairs that don't exist in the original dataset.
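To make the "row pairs that don't exist" problem concrete, here is a minimal sketch that compares the (b, c) pairs actually present in the sample data with every possible combination of a known b value and a known c value. It assumes, as the behaviour described above suggests, that the extra rows come from pairing up b and c values that never occur together; the names observed, product and extra are illustrative only.

```python
import io
import itertools

import pandas as pd

# The sample file from the question, inlined so the sketch is self-contained.
CSV = """a,b,c,w
a1,,,
a2,b1,c1,
a2,b1,c2,
a2,,,
a3,b2,c3,
a4,,,
a5,b1,c1,100
a6,b2,c4,
a7,b1,c2,214.285714285714
a7,b1,c2,245.454545454545
a7,b1,c2,292.105263157895
a7,b1,c2,
a8,b1,c2,
a9,b2,c3,
,b3,,
,,c4,
,,c5,
"""

df = pd.read_csv(io.StringIO(CSV))

# (b, c) pairs that occur in the data with both values present.
observed = set(df[['b', 'c']].dropna().itertuples(index=False, name=None))

# Every combination of a known b value with a known c value.
b_values = df['b'].dropna().unique()
c_values = df['c'].dropna().unique()
product = set(itertools.product(b_values, c_values))

# Combinations that never appear together in the dataset.
extra = sorted(product - observed)
print(len(observed), len(product), len(extra))
print(extra)
```

With this sample, only 4 of the 15 possible combinations are real pairs, which is why padding the row index with every combination adds so many unwanted rows.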
Any suggestions?
Thanks
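Answer 0 (score: 2)
If you use NaN rather than blanks, then groupby + unstack can work here. First, convert columns a, b, and c to strings with astype(str). Each NaN then becomes the literal string 'nan', so groupby no longer ignores those rows when grouping the data.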
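cols = ['a', 'b', 'c']
df[cols] = df[cols].astype(str)

# Group on all three keys, move a into the columns,
# then drop the column that stands for a missing a value.
df.groupby(cols)\
  .w.mean()\
  .unstack(0)\
  .drop('nan', axis=1)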
a        a1  a2  a3  a4     a5  a6          a7  a8  a9
b   c
b1  c1  NaN NaN NaN NaN  100.0 NaN         NaN NaN NaN
    c2  NaN NaN NaN NaN    NaN NaN  250.615174 NaN NaN
b2  c3  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
    c4  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
b3  nan NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
nan c4  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
    c5  NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
    nan NaN NaN NaN NaN    NaN NaN         NaN NaN NaN
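As a self-contained variant of the same idea, the sketch below fills the missing keys with an explicit placeholder string instead of relying on astype(str) turning NaN into 'nan'. This is a substitution, not the code from this answer: the placeholder MISSING and the names filled and table are assumptions made for illustration.

```python
import io

import pandas as pd

# The sample file from the question, inlined so the sketch is self-contained.
CSV = """a,b,c,w
a1,,,
a2,b1,c1,
a2,b1,c2,
a2,,,
a3,b2,c3,
a4,,,
a5,b1,c1,100
a6,b2,c4,
a7,b1,c2,214.285714285714
a7,b1,c2,245.454545454545
a7,b1,c2,292.105263157895
a7,b1,c2,
a8,b1,c2,
a9,b2,c3,
,b3,,
,,c4,
,,c5,
"""

df = pd.read_csv(io.StringIO(CSV))

keys = ['a', 'b', 'c']
MISSING = '(missing)'  # hypothetical placeholder for an absent key

# Replace missing keys so groupby keeps those rows as ordinary groups.
filled = df.copy()
filled[keys] = filled[keys].fillna(MISSING)

table = (
    filled.groupby(keys)['w'].mean()  # mean of w per (a, b, c) group
    .unstack('a')                     # move a into the columns
    .drop(columns=MISSING)            # discard the "no a value" column
)
print(table)
```

Every a value from a1 to a9 becomes a column, and the rows are limited to the (b, c) combinations that occur in the data, with the placeholder standing in for a blank b or c.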
Answer 1 (score: 1)
One way to reach your target output is to collect all the unique b and c pairs as tuples:
tups = df[['b', 'c']].drop_duplicates().apply(tuple, axis=1)
# 0     (nan, nan)
# 1       (b1, c1)
# 2       (b1, c2)
# 4       (b2, c3)
# 7       (b2, c4)
# 14     (b3, nan)
# 15     (nan, c4)
# 16     (nan, c5)
...then call .pivot_table with dropna=True, and immediately reindex with your b-c tuples: