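What I need to do is refactor the script, turning it into a method that can be added to a larger process.
STEP A
The gist is this: I'm working on a data-processing task, and the first step is to create data of the following form: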
properties.proprietor_id
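STEP B
Step B then operates on the result of the step above (described by PROCESS below) and outputs data like this (the final data representation).
That means PROCESS creates the data shown in STEP A.
The final data representation looks like this: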
3|Victoria|[51.503378, -0.139134]|2673
52|Cubitt Town|[51.505199, -0.018848]|23
5|United Kingdom|[54.75844, -2.69531]|459
6|London|[51.50853, -0.12574]|346
296|Bucharest|[44.43225, 26.10626]|9
352|Vilich-Müldorf|[50.75024, 7.15283]|4
48|Gut Scheibenhardt|[49.001249, 8.412378]|3
314|Westerham|[48.0601, 11.62219]|9
45|Honartsdeich|[53.557429, 9.987297]|34
9779|Martinsried|[48.137418, 11.555737]|11
343|Brussels|[50.85045, 4.34878]|27
563|Russell Square|[51.519403, -0.133906]|20
2|Germany|[51.5, 10.5]|20
11|Farringdon|[51.51807, -0.10852]|154
609|Fröttmaning|[48.16652, 11.59038]|3
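For illustration, each record above is pipe-delimited with the four fields that the script under PROCESS unpacks (identifier, name, coords, number_of_jobs). A minimal sketch of parsing one such line into typed values follows; `LocationRecord` and `parse_record` are hypothetical names introduced here, not part of the original script.

```python
from typing import NamedTuple, Tuple


class LocationRecord(NamedTuple):
    # Hypothetical container; field names mirror the script's variables.
    identifier: int
    name: str
    coords: Tuple[float, float]
    number_of_jobs: int


def parse_record(line: str) -> LocationRecord:
    """Parse one 'id|name|[lat, lng]|jobs' line into typed fields."""
    identifier, name, coords, number_of_jobs = line.strip().split("|")
    # Drop the surrounding brackets, then convert both halves to floats.
    lat, lng = (float(part) for part in coords.strip("[]").split(","))
    return LocationRecord(int(identifier), name, (lat, lng), int(number_of_jobs))


record = parse_record("3|Victoria|[51.503378, -0.139134]|2673")
print(record.name, record.coords, record.number_of_jobs)
```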
The script that performs the magic (creating the final output data shown above) displays its results like this:
code: GB-ENG, jobs: 2673
code: GB-ENG, jobs: 23
code: GB-ENG, jobs: 459
code: GB-ENG, jobs: 346
code: RO-B, jobs: 9
code: DE-NW, jobs: 4
code: DE-BW, jobs: 3
code: DE-BY, jobs: 9
code: DE-HH, jobs: 34
code: DE-BY, jobs: 11
code: BE-BRU, jobs: 27
code: GB-ENG, jobs: 20
code: DE-TH, jobs: 20
code: GB-ENG, jobs: 154
code: DE-BY, jobs: 3
But what I need to do is turn it into a method, so that I can fold it into this process:
PROCESS
import json
import requests
from collections import defaultdict
from pprint import pprint


def hasNumbers(inputString):
    # True if the string contains at least one digit
    return any(char.isdigit() for char in inputString)


# open up the output of 'data-processing.py'
with open('job-numbers-by-location.txt') as data_file:

    # print the output to a file
    with open('phase_ii_output.txt', 'w') as output_file_:
        for line in data_file:
            identifier, name, coords, number_of_jobs = line.split("|")
            coords = coords[1:-1]  # drop the surrounding [ and ]
            # strip the space that follows the comma in "[lat, lng]"
            lat, lng = (part.strip() for part in coords.split(","))
            # print("lat: " + lat, "lng: " + lng)
            response = requests.get("http://api.geonames.org/countrySubdivisionJSON?lat="+lat+"&lng="+lng+"&username=s.matthew.english").json()

            codes = response.get('codes', [])
            for code in codes:
                if code.get('type') == 'ISO3166-2':
                    country_code = '{}-{}'.format(response.get('countryCode', 'UNKNOWN'), code.get('code', 'UNKNOWN'))
                    if not hasNumbers(country_code):
                        # print("code: " + country_code + ", jobs: " + number_of_jobs)
                        # number_of_jobs still ends with the line's newline
                        output_file_.write("code: " + country_code + ", jobs: " + number_of_jobs)
# both files are closed automatically when the with blocks exit
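The problem is that I have been struggling to turn this script into a method.
How can I refactor the script into an isolated method that can be included in the process?
All the data related to this task can be found here on my GitHub page.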
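One possible shape for such a method is sketched below, as a starting point rather than a definitive answer. It assumes the same input format and GeoNames endpoint as the script above. The names `fetch_subdivision`, `iter_subdivision_codes`, `jobs_by_subdivision` and the `fetch` parameter are hypothetical, introduced only for this sketch. The network call is passed in as an argument so that the parsing and filtering logic can be exercised without hitting the API.

```python
def has_numbers(text):
    """True if the string contains at least one digit."""
    return any(char.isdigit() for char in text)


def fetch_subdivision(lat, lng, username):
    """Query GeoNames for the subdivision at (lat, lng); return the parsed JSON."""
    import requests  # imported lazily so the other functions work without it

    response = requests.get(
        "http://api.geonames.org/countrySubdivisionJSON",
        params={"lat": lat, "lng": lng, "username": username},
    )
    return response.json()


def iter_subdivision_codes(response):
    """Yield 'COUNTRY-SUBDIVISION' for each digit-free ISO3166-2 entry."""
    for code in response.get("codes", []):
        if code.get("type") == "ISO3166-2":
            country_code = "{}-{}".format(
                response.get("countryCode", "UNKNOWN"),
                code.get("code", "UNKNOWN"),
            )
            if not has_numbers(country_code):
                yield country_code


def jobs_by_subdivision(lines, username, fetch=fetch_subdivision):
    """Yield one 'code: X, jobs: N' string per matching code for each input line.

    `lines` is any iterable of 'id|name|[lat, lng]|jobs' strings, for example
    an open file. `fetch` is a hypothetical hook that defaults to the live
    GeoNames lookup.
    """
    for line in lines:
        identifier, name, coords, number_of_jobs = line.strip().split("|")
        lat, lng = (part.strip() for part in coords[1:-1].split(","))
        for country_code in iter_subdivision_codes(fetch(lat, lng, username)):
            yield "code: {}, jobs: {}".format(country_code, number_of_jobs)


# Possible usage inside the larger process (placeholder username):
#
# with open("job-numbers-by-location.txt") as data_file, \
#         open("phase_ii_output.txt", "w") as output_file:
#     for row in jobs_by_subdivision(data_file, "your_geonames_username"):
#         output_file.write(row + "\n")
```

Because `jobs_by_subdivision` is a generator over any iterable of lines, the caller decides where the input comes from and where the rows go (a file, a list, or the next step of the pipeline).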