I am trying to copy multiple JSON files, uploaded from a Windows file system, into my Snowflake table, but the COPY INTO command fails with a Windows OS error.
I stage multiple JSON files from the local Windows file system like this:
cursor.execute("put file://C:\\Users\\nrajora\\data\\*.json @my_json_stage "
               "auto_compress=true;")
Listing the staged files shows that they are ready to be copied:
cs.execute("list @my_json_stage")
all_rows = cs.fetchall()
for row in all_rows:
    print("row: "+str(row))
Output:
...
('my_json_stage/xbg.json.gz', 40480, '07790f0478b333041e57435733a6d550', 'Wed, 12 Dec 2018 19:38:03 GMT')
('my_json_stage/xbu.json.gz', 108544, 'c7e164e041a459a3c2e28d6f73c14bc5', 'Wed, 12 Dec 2018 19:38:03 GMT')
('my_json_stage/xcd.json.gz', 60096, '6ce8cbb867f17077969a3110bfa51da9', 'Wed, 12 Dec 2018 19:38:03 GMT')
('my_json_stage/xgh.json.gz', 31264, 'e46a75c0640fd59c256b654e02bf844a', 'Wed, 12 Dec 2018 19:38:03 GMT')
('my_json_stage/xgo.json.gz', 42752, 'aef9b6d6e536f794ce7f7e9429c46ff8', 'Wed, 12 Dec 2018 19:38:03 GMT')
Here is the copy command:
cs.execute("copy into UCLAIM_XML_JSON from @my_json_stage"
           "pattern = '.*.json'")
It looks as though Snowflake uploads the files through the Windows %TEMP% directory and then cannot clean that directory up, because it is shared with every other running program.
Edit: I tried a workaround that seems to work for up to 500 JSON files (roughly 150 MB of data), but it fails with the same error once the data files total more than 180 MB.
Here is the error:
python.exe C:\Users\nrajora\PycharmProjects\sf_poc_1\load_json_into_table_from_localfs.py
Traceback (most recent call last):
File "C:\Users\nrajora\PycharmProjects\sf_poc_1\load_json_into_table_from_localfs.py", line 25, in <module>
cs.execute("put file://C:\\Users\\nrajora\\Downloads\\OneClaimdata\\*.json @my_json_stage "
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\site-packages\snowflake\connector\cursor.py", line 519, in execute
sf_file_transfer_agent.execute()
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\site-packages\snowflake\connector\file_transfer_agent.py", line 194, in execute
self.upload(large_file_metas, small_file_metas)
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\site-packages\snowflake\connector\file_transfer_agent.py", line 215, in upload
self._upload_files_in_parallel(small_file_metas)
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\site-packages\snowflake\connector\file_transfer_agent.py", line 264, in _upload_files_in_parallel
target_meta)
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 290, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 683, in get
raise self._value
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\pool.py", line 44, in mapstar
return list(map(*args))
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\site-packages\snowflake\connector\file_transfer_agent.py", line 371, in upload_one_file
shutil.rmtree(tmp_dir)
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\shutil.py", line 507, in rmtree
return _rmtree_unsafe(path, onerror)
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\shutil.py", line 395, in _rmtree_unsafe
onerror(os.rmdir, path, sys.exc_info())
File "C:\Users\nrajora\AppData\Local\Programs\Python\Python37-32\lib\shutil.py", line 393, in _rmtree_unsafe
os.rmdir(path)
OSError: [WinError 145] The directory is not empty: 'C:\\Users\\nrajora\\AppData\\Local\\Temp\\tmp8iw2hs7i'
Process finished with exit code 1
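The traceback shows the failure inside `upload_one_file`, at `shutil.rmtree(tmp_dir)`, which is reached through `_upload_files_in_parallel` and a worker pool. One thing that could be tried is to issue a separate PUT for each file instead of a single wildcard PUT, with `parallel=1`. The sketch below only builds those statements. It is an untested assumption that this avoids the cleanup problem, not a confirmed fix; `build_put_statements` is a hypothetical helper, and the stage name is the one used above.

```python
import glob
import os


def build_put_statements(directory, stage, pattern="*.json"):
    """Return one PUT statement per matching file (hypothetical helper).

    Assumes the file paths contain no spaces; paths with spaces would
    need the URI wrapped in single quotes.
    """
    statements = []
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        # Use forward slashes so the URI looks the same on every platform.
        uri = "file://" + os.path.abspath(path).replace("\\", "/")
        # parallel=1 asks PUT to use a single upload thread (assumed helpful here).
        statements.append(f"put {uri} @{stage} auto_compress=true parallel=1")
    return statements
```

With an open cursor `cs`, each statement would then be run in turn, for example `for stmt in build_put_statements(r"C:\Users\nrajora\data", "my_json_stage"): cs.execute(stmt)`.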
This is really strange. Is there a way to work around this problem? Any help is appreciated!