Merging an empty pandas DF with rows from a separate DF from Google Sheets

Time: 2019-08-23 16:28:43

Tags: python pandas

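I am working with this Google Sheets worksheet (https://docs.google.com/spreadsheets/d/1I2VIGfJOyod-13Fke8Prn8IkhpgZWbirPBbosm8EFCc/edit?usp=sharing) and I want to create a similar dataframe that consists only of the cells containing "OOO" at the end (I have highlighted them in yellow for clarity). For example, here is a small snippet of what I want to get out of it: (https://docs.google.com/spreadsheets/d/1rRWgESE7kPTvchOL0RxEcqjEnY9oUsiMnov-qagHg7I/edit?usp=sharing)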

Basically, I want to create my own "schedule" here.

import os
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import pandas as pd
from googleapiclient import discovery


DATA_DIR = '/path/here/'
scope = ['https://spreadsheets.google.com/feeds', 'https://www.googleapis.com/auth/drive',
         'https://www.googleapis.com/auth/spreadsheets']
path = os.path.join(DATA_DIR, 'client_secret.json')
credentials = ServiceAccountCredentials.from_json_keyfile_name(path, scope)
client = gspread.authorize(credentials)
service = discovery.build('sheets', 'v4', credentials=credentials)
spreadsheet_id = 'Dcon19'

debug = False

spreadsheet = client.open(spreadsheet_id).sheet1
data = spreadsheet.get_all_values()
index = str(data[0][0])
headers = data.pop(0)
df_index = []

def conv_pd_df():

    df = pd.DataFrame(data, columns=headers, index=None)
    df = df.set_index(index)
    df_index.append(df.index.values)

    mask = df.applymap(lambda x: key in str(x))
    df1 = df[mask.any(axis=1)]

    return df1


def highlight(df1):
    df2 = pd.DataFrame(columns=headers[1:], index=df_index) # blank dataframe
    df2 = df2.fillna('none', inplace=True)
    for col in df1: 
        update_row = df1[df1[col].str.contains("OOO")]
        if not update_row.empty:
            try:
                df2.update(update_row, overwrite=True)
            except AttributeError as e:
                print(f'Error {e}')
    df2.to_csv('/path/dcon.csv', header=True)


if __name__ == '__main__':
    if not debug:
        df1 = conv_pd_df()
        highlight(df1)

Right now the only thing I get back for df2 is the blank dataframe, because trying to save the resulting df2 fails with the error AttributeError: 'NoneType' object has no attribute 'to_csv'

Does anyone know how to make this work, or a more efficient way to accomplish this?

This is my first real personal project, so any help would be appreciated!

1 Answer:

Answer 0: (score: 0)

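The error you quote is caused by the way you are using fillna. df2.fillna('none', inplace=True) returns None, so assigning that result back to df2 replaces the DataFrame with None, and that is the error you see when you then call df2.to_csv...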
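As a minimal sketch of the safe pattern: either call fillna with inplace=True and ignore its return value, or drop inplace and assign the result. The example below uses the second form. The column and index labels are made up for illustration and are not taken from your sheet.

```python
import pandas as pd

# Hypothetical labels standing in for the sheet's headers and index values.
blank = pd.DataFrame(columns=["Room A", "Room B"], index=["9:00", "10:00"])

# Without inplace=True, fillna returns a new DataFrame, so assigning it is safe.
filled = blank.fillna("none")

print(type(filled).__name__)
print(filled)
```

The point is to avoid combining an assignment with inplace=True on the same line, since that is what leaves df2 without a DataFrame to call to_csv on.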

Try something like this for your highlight function.

def highlight(df1):
    df2 = pd.DataFrame(columns=headers[1:], index=df_index) # blank dataframe
    df2.fillna('none', inplace=True)
    for col in df1: 
        update_row = df1[df1[col].str.contains("OOO")]
        if not update_row.empty:
            try:
                df2.update(update_row, overwrite=True)
            except AttributeError as e:
                print(f'Error {e}')
    df2.to_csv('/path/dcon.csv', header=True)
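
On the question of a more efficient way: one possible alternative, sketched here under assumptions, is to build a boolean mask over the whole frame and use DataFrame.where to blank out every cell that does not contain "OOO", which avoids the blank frame and the loop entirely. The function name keep_matching_cells, its parameters, and the toy schedule data are hypothetical and only illustrate the idea.

```python
import pandas as pd


def keep_matching_cells(df, marker="OOO", filler="none"):
    """Return a copy of df where cells lacking `marker` are replaced by `filler`."""
    # Treat every cell as text and flag the cells that contain the marker literally.
    mask = df.astype(str).apply(lambda col: col.str.contains(marker, regex=False))
    # where() keeps cells where the mask is True and substitutes filler elsewhere.
    return df.where(mask, filler)


# Made-up data in the same shape as a schedule: times as index, rooms as columns.
schedule = pd.DataFrame(
    {"Room A": ["Talk 1 OOO", "Talk 2"], "Room B": ["Talk 3", "Talk 4 OOO"]},
    index=pd.Index(["9:00", "10:00"], name="Time"),
)

result = keep_matching_cells(schedule)
print(result)
```

With your data you would pass the frame built from the sheet (after set_index) instead of the toy schedule, then call to_csv on the result. If only cells that end with "OOO" should match, str.endswith(marker) could be swapped in for str.contains.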