Given the following DataFrame:
import pandas as pd
import numpy as np
df = pd.DataFrame({'A':[1,1,np.nan],
'B':[2.2,np.nan,2.2]})
df
A B
0 1.0 2.2
1 1.0 NaN
2 NaN 2.2
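If I want to replace the NaN values in column A with the value that is repeated in that column (1), and do the same for column B, which kind of fillna() do I need to use?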
Desired output:
A B
0 1.0 2.2
1 1.0 2.2
2 1.0 2.2
I am looking for a general solution, since I actually have thousands of rows. Thanks in advance!
Answer 0 (score: 2)
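fillna accepts a dictionary whose keys are column names.
Assuming you want to fill each column with its most frequent value, you can build that dictionary from the most frequent value of each column and pass it to fillna.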
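The code that originally accompanied this answer did not survive, so the following is only a minimal sketch of the dictionary approach, not the answerer's exact code. It assumes "most repeated value" means the column mode; the names fill_values and filled are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 1, np.nan],
                   'B': [2.2, np.nan, 2.2]})

# Map each column name to its most frequent non-NaN value.
# Series.mode() skips NaN by default and returns the modes sorted,
# so .iloc[0] picks the smallest value when several are tied.
fill_values = {col: df[col].mode().iloc[0] for col in df.columns}

# fillna with a dict fills each column with its own value.
filled = df.fillna(fill_values)
print(filled)
```

A column that contains only NaN has an empty mode, so .iloc[0] would raise an IndexError; such columns would need to be skipped or handled separately.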
Answer 1 (score: 2)
Why not simply forward-fill:
df.ffill()  # same as the older df.fillna(method='ffill'), deprecated since pandas 2.1
# df = pd.DataFrame({'A': [1, 1, np.nan, 2], 'B': [2.2, np.nan, 2.2, 1.9]})
# df.ffill()
#      A    B
# 0  1.0  2.2
# 1  1.0  2.2
# 2  1.0  2.2
# 3  2.0  1.9
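Note that forward filling copies the previous valid value downward, which matches the most frequent value only when that value happens to sit directly above the gap. A small sketch with a hypothetical column (not taken from the question) shows where the two approaches diverge:

```python
import numpy as np
import pandas as pd

# Hypothetical column: the last valid value (2.0) is not the most frequent one (1.0).
s = pd.Series([1.0, 1.0, 2.0, np.nan])

# Forward fill: the NaN takes the value just above it.
forward_filled = s.ffill()

# Mode fill: the NaN takes the most frequent non-NaN value.
mode_filled = s.fillna(s.mode().iloc[0])

print(forward_filled.tolist())
print(mode_filled.tolist())
```

If the goal is strictly "most repeated value", the mode-based fill is the safer choice; forward fill is appropriate when neighbouring rows are expected to share a value.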
Answer 2 (score: 0)
import itertools
import operator

def most_common(L):
    # get an iterable of (item, iterable) pairs
    SL = sorted((x, i) for i, x in enumerate(L))
    # print 'SL:', SL
    groups = itertools.groupby(SL, key=operator.itemgetter(0))
    # auxiliary function to get "quality" for an item
    def _auxfun(g):
        item, iterable = g
        count = 0
        min_index = len(L)
        for _, where in iterable:
            count += 1
            min_index = min(min_index, where)
        # print 'item %r, count %r, minind %r' % (item, count, min_index)
        return count, -min_index
    # pick the highest-count/earliest item
    return max(groups, key=_auxfun)[0]
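Then simply call it on the column. Drop the NaN values first so they do not interfere with sorting and grouping, and assign the result back, because fillna returns a new Series:
df['A'] = df['A'].fillna(most_common(df['A'].dropna().tolist()))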
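To cover every column rather than only A, the same idea can be looped over df.columns. The sketch below swaps in collections.Counter from the standard library as a shorter stand-in for the helper above; most_common_value is a hypothetical name, and it assumes Python 3.7 or later, where Counter.most_common orders equal counts by first occurrence.

```python
from collections import Counter

import numpy as np
import pandas as pd

def most_common_value(series):
    """Return the most frequent non-NaN value; ties go to the value seen first."""
    counts = Counter(series.dropna().tolist())
    return counts.most_common(1)[0][0]

df = pd.DataFrame({'A': [1, 1, np.nan],
                   'B': [2.2, np.nan, 2.2]})

# Fill each column with its own most frequent value.
for col in df.columns:
    df[col] = df[col].fillna(most_common_value(df[col]))

print(df)
```

As with the helper above, a column holding only NaN has no most common value, so most_common(1) returns an empty list and the indexing fails; guard for that case if it can occur in your data.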