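I have a file on a remote server that I can access over FTP. I need to read this text file line by line and, depending on a few conditions, process each line and insert the processed data into a PostgreSQL table.
My sample code is below. I'm new to Python, so any guidance would be very helpful. I have searched Google and Stack Overflow for a while without finding a solution, so I would appreciate it if someone could point me in the right direction.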
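import ftplib  # the code below calls ftplib.FTP(...), so import the module itself
import sys     # needed for sys.exit()
import time

import psycopg2

from utils.config import Configuration as Config
from utils.utils import get_global_config
# NOTE: get_connection() is called below but never imported;
# import it from whichever project module defines it.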
start_time = time.perf_counter()

# PostgreSQL connection
try:
    cnx_psql = get_connection(get_global_config(), 'pg_dwh')
    print("DWH Connected")
except psycopg2.Error as e:
    # format() must be called on the string, not on the return value of print()
    print('PSQL: Unable to connect!\n{0}'.format(e))
    sys.exit(1)

# Cursor initialization
cur_psql = cnx_psql.cursor()
ftp = None  # lets the finally block tell whether the FTP connection was opened
try:
    # filePath = '/Users/linu/Downloads/log'
    filePath = '/cmd/log/stk/log.txt'
    table = 'staging.stock_dump'
    # Recreate the staging table on every run.
    SQL = ("DROP TABLE IF EXISTS " + table + "; "
           "CREATE TABLE IF NOT EXISTS " + table + " "
           "(created_date TEXT, product_sku TEXT, previous_stock TEXT, current_stock TEXT);")
    cur_psql.execute(SQL)
    cnx_psql.commit()

    # This is where I have to read the text file from the server that I can reach over FTP.
    ftp = ftplib.FTP('my_ftp_host', 'my_ftp_username', 'my_ftp_password')
    ftp.dir()  # prints the directory listing itself and returns None
    # open() only reads local files; retrlines() fetches the remote file
    # and calls the callback once per line, with the line ending stripped.
    remote_file = []
    ftp.retrlines('RETR ' + filePath, remote_file.append)
    print(remote_file[0])
    print(remote_file[1])

    # Below is the logic I want to achieve: after reading the remote file, check each
    # line for the word "Stock:". If it is present, split the line on " " (space)
    # and insert the split values into the destination table.
    for line in remote_file:
        if 'Stock:' in line:
            fields = line.split(" ")
            date_part1 = fields[0]
            date_part2 = fields[1][:-1]
            sku = fields[3]
            prev_stock = fields[5]
            current_stock = fields[7]
            if prev_stock.strip() == current_stock.strip():
                continue  # stock did not change, nothing to record
            # Pass the values as query parameters instead of concatenating them into the SQL.
            cur_psql.execute(
                "INSERT INTO " + table
                + " (created_date, product_sku, previous_stock, current_stock)"
                + " VALUES (%s, %s, %s, %s);",
                (date_part1 + " " + date_part2, sku, prev_stock, current_stock))
    cnx_psql.commit()
    print("Data loaded to DWH from text file")
    print("Data porting took %s seconds to finish---" % (time.perf_counter() - start_time))
except (Exception, psycopg2.Error) as error:
    print("Error while fetching data from PostgreSQL", error)
    print("Error adding information.")
    raise SystemExit(1)  # the finally block below still runs
finally:
    # In a try statement, finally must come after except, not before it.
    if ftp is not None:
        ftp.quit()
    cur_psql.close()
    cnx_psql.close()
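To make the intended logic easier to check in isolation, here is a minimal sketch of the three steps described above: read the remote file over FTP, keep only the "Stock:" lines whose values changed, and insert them with query parameters. The function names (`parse_stock_line`, `read_remote_lines`, `load_stock_rows`) are made up for illustration, and the log line format is only inferred from the field indexes used in the code above, so the real file may well differ.

```python
def parse_stock_line(line):
    """Return (created_date, sku, previous_stock, current_stock), or None to skip the line.

    Assumes the same field positions as the question's code (0, 1, 3, 5, 7).
    """
    if 'Stock:' not in line:
        return None
    fields = line.split(" ")
    if len(fields) < 8:
        return None  # too short to contain all the expected fields
    created_date = fields[0] + " " + fields[1][:-1]  # drop the last character of the time field
    sku, prev_stock, current_stock = fields[3], fields[5], fields[7]
    if prev_stock.strip() == current_stock.strip():
        return None  # stock unchanged, nothing to load
    return (created_date, sku, prev_stock, current_stock)


def read_remote_lines(ftp, remote_path):
    """Fetch a remote text file through an ftplib.FTP-like object, one list item per line."""
    lines = []
    # retrlines() calls the callback once per line, without the trailing line ending.
    ftp.retrlines('RETR ' + remote_path, lines.append)
    return lines


def load_stock_rows(cursor, table, lines):
    """Insert every parsable line through a DB-API cursor; return the number of rows sent."""
    sql = ("INSERT INTO " + table
           + " (created_date, product_sku, previous_stock, current_stock)"
           + " VALUES (%s, %s, %s, %s)")
    rows = [row for row in map(parse_stock_line, lines) if row is not None]
    if rows:
        # The values travel as parameters, so quotes inside a line cannot break the SQL.
        cursor.executemany(sql, rows)
    return len(rows)
```

With the objects from the question, this would be called roughly as `load_stock_rows(cur_psql, table, read_remote_lines(ftp, filePath))`, followed by `cnx_psql.commit()`. Keeping the parsing in its own function means it can be tried on a few sample lines without touching the FTP server or the database.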