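Is anyone here familiar with Google Cloud Functions? I read their documentation and adapted my script based on it to try to get it working in their hosted environment.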
https://cloud.google.com/functions/docs/concepts/python-runtime
So, my Python script looks like this.
def main():
    import requests
    import numpy
    import pandas
    import datetime
    import pandas_gbq
    import xml.etree.ElementTree

    # authentication: working....
    login = 'my_email'
    password = 'my_password'

    AsOfDate = datetime.datetime.today().strftime('%m-%d-%Y')

    # step into URL
    REQUEST_URL = 'https://www.business.com/report-api/device=779142&rdate=Yesterday'
    response = requests.get(REQUEST_URL, auth=(login, password))
    xml_data = response.text.encode('utf-8', 'ignore')

    #tree = etree.parse(xml_data)
    root = xml.etree.ElementTree.fromstring(xml_data)

    # start collecting root elements and headers for data frame 1
    desc = root.get("Description")
    frm = root.get("From")
    thru = root.get("Thru")
    loc = root.get("locations")
    loc = loc[:-1]
    df1 = pandas.DataFrame([['From:', frm], ['Through:', thru], ['Location:', loc]])
    df1.columns = ['S', 'Analytics']
    #print(df1)

    # start getting the analytics for data frame 2
    data = [['Goal:', root[0][0].text], ['Actual:', root[0][1].text], ['Compliant:', root[0][2].text], ['Errors:', root[0][3].text], ['Checks:', root[0][4].text]]
    df2 = pandas.DataFrame(data)
    df2.columns = ['S', 'Analytics']
    #print(df2)

    # merge data frame 1 with data frame 2
    df3 = df1.append(df2, ignore_index=True)
    #print(df3)

    # append description and today's date onto data frame
    df3['Description'] = desc
    df3['AsOfDate'] = AsOfDate

    # push from data frame, where data has been transformed, into Google BQ
    pandas_gbq.to_gbq(df3, 'Metrics', 'analytics', chunksize=None, reauth=False, if_exists='append', private_key=None, auth_local_webserver=False, table_schema=None, location=None, progress_bar=True, verbose=None)
    print('Execute Query, Done!!')

main()

if __name__ == '__main__':
    main()
In addition, my requirements.txt looks like this.
requests
numpy
pandas
datetime
requests
pandas_gbq
xml.etree.ElementTree
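My script has been running fine for over two months, but I have to run it on my laptop every day. To get rid of this manual process, I am trying to run it in the cloud. The problem is that I keep getting an error message that reads: TypeError: main() takes 0 positional arguments but 1 was given
As far as I can tell, no arguments are given and none are expected, yet Google somehow says 1 argument was given. Can I modify my code slightly to get it working, or otherwise work around this seemingly benign error? Thanks.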
Answer 0 (score: 1)
Change your code so that it runs in Google Cloud Functions using an HTTP trigger. You can then use Google Cloud Scheduler to call the function on a schedule. You will also need to create a requirements.txt listing the modules you need to import. See this document for more information.
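As a rough illustration of what an HTTP-triggered version might look like, here is a minimal sketch based on the script in the question. The helper names fetch_report_xml and build_rows and the fetch parameter are hypothetical, introduced only to keep the example small and testable; the XML layout is assumed from the attribute and index lookups in the question, and the BigQuery upload is left as a comment. The one essential change is that the entry point accepts the request object the runtime passes in.

```python
import datetime
import xml.etree.ElementTree as ET

REQUEST_URL = 'https://www.business.com/report-api/device=779142&rdate=Yesterday'


def fetch_report_xml():
    """Download the report XML, as the script in the question does (hypothetical helper)."""
    import requests  # imported here so the parsing code can be tried without it
    response = requests.get(REQUEST_URL, auth=('my_email', 'my_password'))
    return response.text.encode('utf-8', 'ignore')


def build_rows(xml_data, as_of_date):
    """Flatten the report into dicts shaped like the rows of df3 (hypothetical helper)."""
    root = ET.fromstring(xml_data)
    desc = root.get('Description')
    pairs = [
        ('From:', root.get('From')),
        ('Through:', root.get('Thru')),
        ('Location:', root.get('locations')[:-1]),  # drop the trailing character, as in the question
    ]
    labels = ['Goal:', 'Actual:', 'Compliant:', 'Errors:', 'Checks:']
    pairs += [(label, root[0][i].text) for i, label in enumerate(labels)]
    return [
        {'S': s, 'Analytics': value, 'Description': desc, 'AsOfDate': as_of_date}
        for s, value in pairs
    ]


def main(request, fetch=fetch_report_xml):
    """HTTP entry point; `request` is the single argument the runtime passes in."""
    as_of_date = datetime.datetime.today().strftime('%m-%d-%Y')
    rows = build_rows(fetch(), as_of_date)
    # The pandas_gbq.to_gbq(...) call from the question would go here,
    # for example on pandas.DataFrame(rows).
    return 'Loaded {} rows'.format(len(rows))
```

The optional fetch argument is only there so the function can be tried locally with canned XML; when deployed, the runtime supplies just the request and the default download is used.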
Answer 1 (score: 0)
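You have misunderstood how Cloud Functions works. It does not let you simply run an arbitrary script. You write triggers that respond to HTTP requests, or to changes that occur in your Cloud project. That does not appear to be what you are doing here. Cloud Functions deployments don't use main() as a script entry point.
You may want to read the overview documentation to understand what Cloud Functions is for.
If you want to run something periodically, consider writing an HTTP trigger and then having a cron-like service invoke it at the rate you need.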
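To see where the TypeError in the question comes from, here is a small sketch that imitates the call locally. The invoke function is a hypothetical stand-in for the runtime, not part of any Google library; it only mirrors the fact that an HTTP-triggered function is called with one request argument.

```python
def main():
    """Entry point as written in the question: accepts no arguments."""
    return 'done'


def main_fixed(request):
    """Same entry point, but accepting the request object it is handed."""
    return 'done'


def invoke(entry_point):
    """Hypothetical stand-in for the runtime: calls the entry point with one request object."""
    fake_request = object()
    return entry_point(fake_request)


try:
    invoke(main)
except TypeError as exc:
    # Same complaint as in the question: 0 positional arguments expected, 1 given
    error_message = str(exc)
else:
    error_message = None

result = invoke(main_fixed)
```

Accepting the argument is enough to clear the error, even if the function never reads anything from the request.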