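I want to specify my train and test sets explicitly in the terminal, rather than hard-coding them in the script when I run the .ipynb file from the terminal. This is what I am doing at the moment: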
import os
import pandas as pd

# FOR TRAINING DATA
# LIST ALL FILES PRESENT IN THE FOLDER PATH
path = "C:/Users/****/****/Latest_Datasets/base_out"
files = os.listdir(path)
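# READ EVERY FILE IN THE FOLDER AND COMBINE THEM INTO ONE DATAFRAME
frames = []
for f in files:
    # os.listdir returns bare file names, so join each one with the folder path
    data = pd.read_csv(os.path.join(path, f), delimiter='\t', usecols=['details','amount','category'], encoding='utf-8')
    frames.append(data)
# DataFrame.append was removed in pandas 2.0; pd.concat builds the combined frame
df = pd.concat(frames, ignore_index=True)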
df['index1'] = df.index
df=df[['index1','amount','details','category']]
# FOR TEST DATA
test_data = pd.read_csv('testfile.csv', delimiter='\t', usecols=['xn_details','xn_amount','category'], encoding='utf-8')
x_train, y_train = (df.details, df.category)
# the test file's text column is loaded as 'xn_details', so that is the name to use here
x_test, y_test = (test_data.xn_details, test_data.category)
# After this I apply my model and get my classifications for my test.details
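I want to supply the training data and the test data as arguments in the terminal instead of specifying them in the script. How can I do that? Thanks in advance.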
Answer 0 (score: 0)
You can import the sys module and then read the arguments passed on the command line from the sys.argv list.
import sys
#everything else remains the same
.
.
.
test_data=pd.read_csv(sys.argv[1],
delimiter='\t',usecols=['xn_details','xn_amount','category'],encoding='utf-8')
sys.argv[0]  # the first element holds the script name, e.g. "test.py"
sys.argv[1]  # the second element holds the first command-line argument, here the CSV file path that is handed to pd.read_csv()
So, on the command line, you would run the following:
C:\>python test.py testfile.csv #test.py is the name of your python file *.py
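The question asks for both the training folder and the test file to come from the terminal, so the same idea can be extended to two arguments. Below is a minimal sketch using the standard-library argparse module instead of raw sys.argv. The argument names train_dir and test_file and the helper names parse_args and list_training_files are hypothetical, chosen only for illustration, and the sketch only handles the paths; the pd.read_csv calls from the question would stay as they are.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse the training folder and the test file from the command line."""
    parser = argparse.ArgumentParser(
        description="Train on every file in a folder, then classify a test file."
    )
    # Both argument names are hypothetical; rename them to suit your script.
    parser.add_argument("train_dir", help="folder containing the training files")
    parser.add_argument("test_file", help="path to the test CSV file")
    # argv=None makes argparse read sys.argv[1:], i.e. the real command line.
    return parser.parse_args(argv)


def list_training_files(train_dir):
    """Return the full paths of the regular files in train_dir, sorted by name."""
    paths = (os.path.join(train_dir, name) for name in sorted(os.listdir(train_dir)))
    return [p for p in paths if os.path.isfile(p)]


# Typical use near the top of the script:
#   args = parse_args()
#   train_files = list_training_files(args.train_dir)
#   ...pass each entry of train_files, and args.test_file, to pd.read_csv
```

With this in place the script would be started as python test.py train_folder testfile.csv, where train_folder stands in for your own training directory. Compared with indexing sys.argv directly, argparse prints a usage message and exits when an argument is missing, rather than failing later with an IndexError.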