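How can I eliminate empty cells from a CSV dataset using Python?

Asked: 2018-10-04 06:10:00

Tags: python pandas csv null

Each row represents a person (315 in total) and each column represents a choice option (16 in total). Each person gave responses to 4 consecutive choice options, chosen at random. I want four consecutive columns that hold each person's answers, with all blank cells removed. (Image of the Excel sheet)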

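import pandas as pd

df = pd.read_csv(r"C:\Users\Admin\Desktop\Book2.csv")
# For each row, find the first non-empty cell and print the four consecutive answers
for r in range(len(df)):
    for c in range(df.shape[1] - 3):
        if pd.notna(df.iat[r, c]):
            for i in range(4):
                print(str(df.iat[r, c + i]))
            break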

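Update: I managed to read the data row by row into a list and split it into groups of 4 (as required). Now, how do I keep only the elements whose value is something other than ''?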

import csv
rowdata = []
with open(r'C:\Users\ARPLAB31\Desktop\SPdata.csv') as inputfile:
    reader = csv.reader(inputfile)
    rowdata = list(reader)
r = []
for i in range(1, 718):
    for j in range(28):
        if len(rowdata[i][j]) != 0:
            r.append(rowdata[i][j])
# cardref contains the partitioned data.
cardref = [r[x:x+4] for x in range(0, len(r), 4)]
print(cardref)

Output:

[['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['BB', 'BB', 'CC', 'CC'], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['CC', 'BB', 'CC', 'CC'], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['CC', 'CC', 'CC', 'CC'], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['CC', 'CC', 'AA', 'CC'], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['CC', 'BB', 'CC', 'CC'], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', ''], ['', '', '', '']]

3 answers:

Answer 0 (score: 0)

Use df.isnull() (where df is a pandas DataFrame).

A good resource on finding null values in a pandas DataFrame:

https://dzone.com/articles/pandas-find-rows-where-columnfield-is-null
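
As a rough illustration of how df.isnull() and its counterpart dropna() could be applied to the layout described in the question, here is a minimal sketch. The sample data, the column names o1..o8 and r1..r4, and the assumption that every row holds exactly four answers are all made up for the example; they are not taken from the asker's file.

```python
import io

import pandas as pd

# Hypothetical sample: 3 people, 8 option columns, 4 consecutive answers each.
sample = io.StringIO(
    "o1,o2,o3,o4,o5,o6,o7,o8\n"
    "AA,BB,CC,AA,,,,\n"
    ",,BB,BB,CC,CC,,\n"
    ",,,,CC,BB,CC,CC\n"
)
df = pd.read_csv(sample)  # blank fields are read as NaN

# isnull() marks blank cells as True; summing per row counts the blanks.
blanks_per_row = df.isnull().sum(axis=1).tolist()
print(blanks_per_row)

# Drop the blanks row by row and rebuild a compact 4-column frame.
# This assumes each row has exactly four non-empty cells.
compact = pd.DataFrame(
    [row.dropna().tolist() for _, row in df.iterrows()],
    columns=["r1", "r2", "r3", "r4"],
)
print(compact)
```

If some rows contain fewer or more than four answers, the fixed column list would no longer fit, so it would need to be dropped or adjusted.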

Answer 1 (score: 0)

You can read each row, keep the non-empty fields, and create a new CSV from them.

For example:

data = ",,2,2,2,2,,,"
arr = filter(None, data.split(","))  # drops the empty fields
print(",".join(arr))  # 2,2,2,2

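The same idea can be extended to a whole file with the csv module. The sketch below is only an illustration: the input is a made-up in-memory string standing in for the real file, and the result is written to memory rather than to disk.

```python
import csv
import io

# Hypothetical input standing in for the real file; blank cells are empty strings.
source = io.StringIO(",,2,2,2,2,,,\n1,1,1,1,,,,,\n")

cleaned_rows = []
for row in csv.reader(source):
    # Keep only the fields that are not empty.
    cleaned_rows.append([field for field in row if field])

# Write the compacted rows as a new CSV (kept in memory for this example).
output = io.StringIO()
csv.writer(output, lineterminator="\n").writerows(cleaned_rows)
print(output.getvalue())
```

To work with real files, the two StringIO objects would be replaced by open(path, newline="") for reading and open(path, "w", newline="") for writing.
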
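Answer 2 (score: 0)

Thanks, everyone. With the help of all the comments above, I managed to solve my problem. Please mention any changes to the code in the comments.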

import pandas
import csv

# Read the CSV into a list of rows

with open('FILE.csv') as inputfile:
    reader = csv.reader(inputfile)
    rowdata = list(reader)

# Record the column position of every non-empty cell

r = []
for i in range(1, 718):
    for j in range(28):
        if len(rowdata[i][j]) != 0:
            r.append(j)

# Regroup the flat list into sub-lists of four

resp_index = [r[x:x+4] for x in range(0, len(r),4)]
print(resp_index)
print(len(resp_index))

# Collect the non-empty values themselves into a new list

s = []
for i in range(1,718,1):
    for j in range(28):
        if len(rowdata[i][j])!=0:
            s.append(rowdata[i][j])

# Regroup the flat list into sub-lists of four

resp_main = [s[x:x+4] for x in range(0, len(s),4)]
print(resp_main)
print(len(resp_main))
index_df = pandas.DataFrame(resp_index)
response_df = pandas.DataFrame(resp_main)

# Save to CSV files

index_df.to_csv('INDEX.csv')
response_df.to_csv('RESPONSE.csv')
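
The two loops above walk the data twice and hard-code the row and column counts. One possible variation, sketched here under the assumption that the first row is a header and that the non-empty cells come in runs of four, collects positions and values in a single pass. The helper names collect_non_empty and chunk, and the sample data, are hypothetical and only serve the illustration.

```python
import csv
import io


def collect_non_empty(rows):
    """Return the column positions and the values of all non-empty cells."""
    positions, values = [], []
    for row in rows:
        for j, cell in enumerate(row):
            if cell:  # an empty string is falsy, so blank cells are skipped
                positions.append(j)
                values.append(cell)
    return positions, values


def chunk(items, size=4):
    """Split a flat list into consecutive sub-lists of the given size."""
    return [items[x:x + size] for x in range(0, len(items), size)]


# Hypothetical sample standing in for FILE.csv.
sample = io.StringIO("h1,h2,h3,h4,h5,h6\nAA,BB,CC,AA,,\n,,BB,BB,CC,CC\n")
reader = csv.reader(sample)
next(reader)  # skip the header row, as range(1, ...) does in the code above

positions, values = collect_non_empty(reader)
resp_index = chunk(positions)
resp_main = chunk(values)
print(resp_index)
print(resp_main)
```

Because enumerate follows the actual length of each row, the sketch does not depend on the file having exactly 28 columns or 717 data rows.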