Generalizing a Python data-porting script

Date: 2019-07-15 13:14:55

Tags: python

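I want to port data from a text file located on a remote server to another remote server. At the moment I have to set the file name by hand with the statement filepath = filePath = '''/Users/linu/Downloads/log''', so the script fails whenever the file name is different. Is there a way to generalize this script so that the file name does not have to be specified?

I tried using a wildcard, filepath = filePath = '''/Users/linu/Downloads/*.txt''', but got the following error: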

Error while fetching data from PostgreSQL [Errno 2] No such file or directory: '/Users/linu/Downloads/*.txt'
Error adding  information.

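Is this the right approach? If not, how can I achieve this? Is there anything else that could be generalized?

Note: I am on a Mac, and the file's properties show it as a "TextEdit Document"; I could not find the file's actual type (sorry, I am new to the Mac environment).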

import psycopg2
import time

start_time = time.perf_counter()
try:
  conn = psycopg2.connect(host="localhost", database="postgres", user="postgres",
                         password="postgres", port="5432")
  print('DB connected')

except (Exception, psycopg2.Error) as error:
        # Confirm unsuccessful connection and stop program execution.
        print ("Error while fetching data from PostgreSQL", error)
        print("Database connection unsuccessful.")
        quit()        
try:

    filepath = filePath='''/Users/linu/Downloads/log''' 

    table='staging.stock_dump'

    SQL="""DROP TABLE IF EXISTS """+  table + """;CREATE TABLE IF NOT EXISTS """+ table + """
      (created_date TEXT, product_sku TEXT, previous_stock TEXT, current_stock TEXT );"""

    cursor = conn.cursor()
    cursor.execute(SQL)
    conn.commit()
    with open(filePath, 'r') as file:
     for line in file:
        if 'Stock:' in line:
            fields=line.split(" ")
            date_part1=fields[0]
            date_part2=fields[1][:-1]
            sku=fields[3]
            prev_stock=fields[5]
            current_stock=fields[7]
            if prev_stock.strip()==current_stock.strip():
                continue
            else:
               #print("insert into " + table+"(created_date, product_sku, previous_stock , current_stock)" + " select CAST('" + date_part1+ " "+ date_part2 + "' AS TEXT)" +", CAST('"+sku+"' AS TEXT),CAST('" + prev_stock +"' AS TEXT),CAST('" +current_stock  + "' AS TEXT) ;")
               cursor.execute("insert into " + table+"(created_date, product_sku, previous_stock , current_stock)" + " select CAST('" + date_part1+ " "+ date_part2 + "' AS TEXT)" +", CAST('"+sku+"' AS TEXT),CAST('" + prev_stock +"' AS TEXT),CAST('" +current_stock  + "' AS TEXT);")

    conn.commit()       
    cursor.close()
    conn.close()
    print("Data loaded to DWH from text file")
    print("Data porting took %s seconds to finish---" % (time.perf_counter() - start_time))

except (Exception, psycopg2.Error) as error:
        print ("Error while fetching data from PostgreSQL", error)
        print("Error adding  information.")
        quit()

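1 Answer:

Answer 0 (score: 0)

You can use the os.listdir() function together with os.path.splitext() to determine which files in a directory are text files. You can then pick the file to transfer based on whatever criteria you need, or transfer all of the files in a for loop: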

import os

directory = '/Users/linu/Downloads'
# os.path.splitext() returns a (root, extension) tuple, so compare element [1]
possible_files = [f for f in os.listdir(directory) if os.path.splitext(f)[1] == ".txt"]
# possible_files now contains the names of all .txt files in /Users/linu/Downloads
# you could search through that list somehow to find the one you wanted, or just
#   take possible_files[0] if you wanted (the first .txt file in the list),
#   or do your code for all of them, as I've demonstrated below
for possible_file in possible_files:
    filePath = os.path.join(directory, possible_file)
    # and now the rest of the code you already wrote
    table = 'staging.stock_dump'
    ...
    # if the connection is opened once before the loop, close it after the loop instead
    conn.close()
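
The wildcard attempt in the question failed because open() treats its argument as a literal path and does not expand patterns such as *.txt. As an alternative to os.listdir(), the standard-library glob module can do that expansion. The sketch below is only illustrative: the helper name find_matching_files and its default pattern are assumptions, not part of the original script.

```python
import glob
import os


def find_matching_files(directory, pattern="*.txt"):
    """Return the sorted full paths of regular files in `directory` matching `pattern`.

    `find_matching_files` is a hypothetical helper name used for illustration.
    """
    matches = glob.glob(os.path.join(directory, pattern))
    # A pattern can also match directories, so keep regular files only.
    return sorted(path for path in matches if os.path.isfile(path))


if __name__ == "__main__":
    # Each returned path can be passed to open() in place of the hard-coded filePath.
    for file_path in find_matching_files("/Users/linu/Downloads"):
        print(file_path)
```

Because the function returns full paths, no os.path.join() call is needed inside the loop.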

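It occurs to me that, given your situation, your file might not actually be a .txt file; it may just be a plain file with no file extension. In that case, you might want to use the condition if os.path.splitext(f)[1] == '' instead.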
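
One caveat when filtering on an empty extension: os.path.splitext() treats a leading dot as part of the file name, so hidden files such as .DS_Store (common in Mac folders) and subdirectories would also pass that test. A minimal sketch that handles both cases follows; the function name files_with_extension is an assumption made for illustration.

```python
import os


def files_with_extension(directory, extension=""):
    """List full paths of regular, non-hidden files in `directory` with the given extension.

    Pass "" to select files without an extension, or e.g. ".txt".
    `files_with_extension` is a hypothetical helper name.
    """
    selected = []
    for name in sorted(os.listdir(directory)):
        if name.startswith("."):
            # Skip hidden files: splitext(".DS_Store") yields an empty extension.
            continue
        full_path = os.path.join(directory, name)
        # splitext() returns (root, extension); compare only the extension part.
        if os.path.isfile(full_path) and os.path.splitext(name)[1] == extension:
            selected.append(full_path)
    return selected


if __name__ == "__main__":
    for file_path in files_with_extension("/Users/linu/Downloads"):
        print(file_path)
```

Calling files_with_extension(directory, ".txt") gives the same selection as the list comprehension above, restricted to regular files.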