Google Cloud Function throwing a strange error

Time: 2019-01-17 21:43:53

Tags: python python-3.x google-cloud-platform google-cloud-functions

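Is anyone here familiar with Google Cloud Functions? I read their documentation and, based on it, adapted my script to try to get it working in their hosted environment.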

https://cloud.google.com/functions/docs/concepts/python-runtime

So, my Python script looks like this.

def main():

    import requests
    import numpy
    import pandas
    import datetime
    import requests
    import pandas_gbq
    import xml.etree.ElementTree


    # authentication: working....
    login = 'my_email' 
    password = 'my_password'


    AsOfDate = datetime.datetime.today().strftime('%m-%d-%Y')

    #step into URL
    REQUEST_URL = 'https://www.business.com/report-api/device=779142&rdate=Yesterday'
    response = requests.get(REQUEST_URL, auth=(login, password))
    xml_data = response.text.encode('utf-8', 'ignore') 

    #tree = etree.parse(xml_data)
    root = xml.etree.ElementTree.fromstring(xml_data)

    # start collecting root elements and headers for data frame 1
    desc = root.get("Description")
    frm = root.get("From")
    thru = root.get("Thru")
    loc = root.get("locations")
    loc = loc[:-1]
    df1 = pandas.DataFrame([['From:',frm],['Through:',thru],['Location:',loc]])
    df1.columns = ['S','Analytics']
    #print(df1)

    # start getting the analytics for data frame 2
    data=[['Goal:',root[0][0].text],['Actual:',root[0][1].text],['Compliant:',root[0][2].text],['Errors:',root[0][3].text],['Checks:',root[0][4].text]]
    df2 = pandas.DataFrame(data)
    df2.columns = ['S','Analytics']
    #print(df2)

    # merge data frame 1 with data frame 2
    df3 = df1.append(df2, ignore_index=True)
    #print(df3)

    # append description and today's date onto data frame
    df3['Description'] = desc
    df3['AsOfDate'] = AsOfDate


    # push from data frame, where data has been transformed, into Google BQ
    pandas_gbq.to_gbq(df3, 'Metrics', 'analytics', chunksize=None, reauth=False, if_exists='append', private_key=None, auth_local_webserver=False, table_schema=None, location=None, progress_bar=True, verbose=None)
    print('Execute Query, Done!!')

main()

if __name__ == '__main__':
    main()  

Also, my requirements.txt looks like this.

requests
numpy
pandas
datetime
requests
pandas_gbq
xml.etree.ElementTree

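The script has run fine for the past two months or more, but I have to run it on my laptop every day. To get rid of this manual process, I am trying to run it in the cloud. The problem is that I keep getting an error message that says: TypeError: main() takes 0 positional arguments but 1 was given

As far as I can tell, no arguments are given and none are expected, yet Google is somehow passing one argument. Can I modify the code slightly to make it work, or somehow get around this seemingly benign error? Thanks.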

2 answers:

Answer 0: (score: 1)

The following will take your code and change it to run in Google Cloud Functions with an HTTP trigger. You can then use Google Cloud Scheduler to call your function on a schedule. You will also need to create a requirements.txt listing the modules that you need to import. See this document for more information.
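The answer's own code is not reproduced here, so what follows is only a minimal sketch of the shape it describes, under stated assumptions. The helper name run_report is hypothetical and stands in for the body of the original script (fetching the XML, building the data frame, calling pandas_gbq.to_gbq). The one essential change is that the entry point accepts a single parameter, because the Python runtime passes the incoming HTTP request (a Flask request object) to it.

```python
def run_report():
    """Hypothetical placeholder for the original script body.

    In the real function this is where the XML would be fetched, the
    data frame built, and pandas_gbq.to_gbq called.
    """
    return 'Execute Query, Done!!'


def main(request):
    """HTTP entry point.

    Cloud Functions calls this with one argument, the incoming request,
    so the signature must accept it even if the body never uses it.
    """
    message = run_report()
    # The returned value is sent back as the HTTP response body.
    return message
```

Note that this sketch has no module-level main() call: the platform invokes the entry point itself, once per request. As for requirements.txt, it would presumably need only the third-party packages (requests, pandas, pandas_gbq), since datetime and xml.etree.ElementTree ship with Python's standard library.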

Answer 1: (score: 0)

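You have misunderstood how Cloud Functions works. It does not let you simply run an arbitrary script. You write a trigger that responds to an HTTP request, or to a change in your Cloud project. That does not seem to be what you are doing here. A Cloud Functions deployment does not use main().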

You may want to read the overview documentation to understand what Cloud Functions is for.

If you want to run something periodically, consider writing an HTTP trigger and having a cron-like service invoke it at the rate you need.
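To see where the TypeError in the question comes from, it may help to imitate the platform's call locally. The sketch below is an illustration only: invoke is a hypothetical stand-in for the runtime, and the fake request is a plain object rather than a real Flask request. It calls an entry point with one positional argument, which is what happens to an HTTP-triggered function.

```python
def main():
    """Zero-argument function, as in the question."""
    return 'done'


def main_with_request(request):
    """Same function, but accepting the request the platform passes in."""
    return 'done'


def invoke(entry_point):
    """Hypothetical stand-in for the runtime: call the entry point with
    one positional argument and report a TypeError instead of raising."""
    fake_request = object()  # placeholder for the real request object
    try:
        return entry_point(fake_request)
    except TypeError as exc:
        return 'TypeError: {}'.format(exc)


print(invoke(main))               # reports the "takes 0 positional arguments" error
print(invoke(main_with_request))  # returns normally
```

The first call fails for the same reason as the deployed function: one argument is supplied to a function that declares none. Adding a parameter to the signature is enough to make the call succeed.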