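In a Pandas df, I am trying to drop duplicates across multiple columns. Much of the data in each row is NaN.
This is just an example; the data is a mixed bag, so many different combinations exist.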
df.drop_duplicates()
       IDnum      name formNumber
1        NaN  AP GROUP  028-11964
2  1364615.0  AP GROUP        NaN
3        NaN  AP GROUP        NaN
Desired output:
       IDnum      name formNumber
1  1364615.0  AP GROUP  028-11964
Edit:
If the result of df.drop_duplicates() looked like this instead, would the solution change?
df.drop_duplicates()
       IDnum      name formNumber
0        NaN  AP GROUP  028-11964
1  1364615.0  AP GROUP  028-11964
2  1364615.0  AP GROUP        NaN
3        NaN  AP GROUP        NaN
Answer 0 (score: 2)
You can use groupby + first.
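The answer's original code did not survive, so the following is only a minimal sketch of what groupby + first might look like here. It assumes that "name" is the column identifying duplicates, which is a guess based on the sample data; the frame is rebuilt by hand from the question.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (index 1..3 as shown there).
df = pd.DataFrame(
    {
        "IDnum": [np.nan, 1364615.0, np.nan],
        "name": ["AP GROUP", "AP GROUP", "AP GROUP"],
        "formNumber": ["028-11964", np.nan, np.nan],
    },
    index=[1, 2, 3],
)

# Assumption: rows sharing the same "name" describe the same entity.
# GroupBy.first() returns the first non-null value of each column per group.
result = df.groupby("name", as_index=False).first()

# Restore the original column order (the grouping key is moved to the front).
result = result[df.columns]
print(result)
```

The same call should also collapse the frame from the edit into a single row, because first() skips NaN column by column. If a group held two different non-null values in one column, the first would win silently, so it may be worth checking for such conflicts beforehand.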
Answer 1 (score: 1)
You need:
df.bfill().ffill().drop_duplicates()
Output:
       IDnum      name formNumber
0  1364615.0  AP GROUP  028-11964
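One thing to watch with this approach: bfill and ffill run over the whole frame, so with more than one name, values can spill from one entity into another. The sketch below illustrates this on a hypothetical extra row ("OTHER CO" and its form number are invented for the example) and shows a per-group variant, again assuming "name" is the grouping key.

```python
import numpy as np
import pandas as pd

# The question's rows plus one invented row for a second entity.
df = pd.DataFrame(
    {
        "IDnum": [np.nan, 1364615.0, np.nan, np.nan],
        "name": ["AP GROUP", "AP GROUP", "AP GROUP", "OTHER CO"],
        "formNumber": ["028-11964", np.nan, np.nan, "111-22222"],
    }
)

# Frame-wide fill: values cross the boundary between the two names.
global_fill = df.bfill().ffill().drop_duplicates()

# Per-group fill: restrict bfill/ffill to rows sharing the same name.
value_cols = ["IDnum", "formNumber"]
filled = df.copy()
filled[value_cols] = df.groupby("name")[value_cols].transform(
    lambda s: s.bfill().ffill()
)
per_group = filled.drop_duplicates()

print(global_fill)
print(per_group)
```

In the frame-wide version, "OTHER CO" picks up the IDnum of "AP GROUP"; in the per-group version it keeps NaN, and each name collapses to a single row.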