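I am trying to create multiple dataframes, each of which lives in a different folder and .csv file. The problem is that when I build a dictionary to hold the data, Python shows my dataframes like this: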
Printing Fruits Dictionairy
{'0': Name Date Time Open High Low Close Volume VWAP Trades
0 Orange 20200430 15:30:00 5.70 5.97 5.65 5.75 1000 5.60 55
1 Orange 20200430 17:00:00 5.65 5.95 5.50 5.80 1200 5.65 68
2 Orange 20200430 20:00:00 5.50 5.83 5.44 5.60 1300 5.73 71
3 Orange 20200430 22:00:00 5.55 5.58 5.35 5.57 1400 5.78 81
4 Orange 20200501 15:30:00 5.50 5.85 5.45 5.70 1500 5.73 95
5 Orange 20200501 17:00:00 5.65 5.70 5.50 5.60 1600 5.65 54
6 Orange 20200501 20:00:00 5.80 5.85 5.45 5.81 1700 5.73 41
7 Orange 20200501 22:00:00 5.60 5.84 5.45 5.65 1800 5.75 62
8 Orange 20200504 15:30:00 5.40 5.87 5.45 5.75 1900 5.83 84
9 Orange 20200504 17:00:00 5.50 5.75 5.40 5.60 2000 5.72 94
10 Orange 20200504 20:00:00 5.80 5.83 5.44 5.50 2100 5.40 55
11 Orange 20200504 22:00:00 5.40 5.58 5.37 5.80 2200 5.35 87, '1': Name Date Time Open High Low Close Volume VWAP Trades
0 Apple 20200504 10:00:00 3.70 3.97 3.65 3.75 1000 3.60 55
1 Apple 20200504 12:00:00 3.65 3.95 3.50 3.80 1200 3.65 68
2 Apple 20200504 14:00:00 3.50 3.83 3.44 3.60 1300 3.73 71
3 Apple 20200504 16:00:00 3.55 3.58 3.35 3.57 1400 3.78 81
4 Apple 20200505 10:00:00 3.50 3.85 3.45 3.70 1500 3.73 95
5 Apple 20200505 12:00:00 3.65 3.70 3.50 3.60 1600 3.65 54
6 Apple 20200505 14:00:00 3.80 3.85 3.45 3.81 1700 3.73 41
7 Apple 20200505 16:00:00 3.60 3.84 3.45 3.65 1800 3.75 62
8 Apple 20200506 10:00:00 3.40 3.87 3.45 3.75 1900 3.83 84
9 Apple 20200506 12:00:00 3.50 3.75 3.40 3.60 2000 3.72 94
10 Apple 20200506 14:00:00 3.80 3.83 3.44 3.50 2100 3.40 55
11 Apple 20200506 16:00:00 3.40 3.58 3.37 3.80 2200 3.35 87}
So all the data for each dataframe ends up stored in a single "cell" (rows and columns together), instead of being parsed like this:
df(idx) = {{Name: Orange, Orange, Orange}, {Date: 20200430, 20200430, 20200430}, {Time: 15:30:00, 17:00:00, 20:00:00}}
df(idx) = {{Name: Apple, Apple, Apple}, {Date: 20200430, 20200430, 20200430}, {Time: 15:30:00, 17:00:00, 20:00:00}}
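For reference, the column-to-values layout described above can be sketched with `DataFrame.to_dict(orient="list")`. This is only an illustration under assumptions: the CSV text below is a hypothetical in-memory stand-in for one `TwoHours.csv` file, built from the first three Orange rows and trimmed to four columns.

```python
import io

import pandas as pd

# Hypothetical stand-in for one semicolon-separated 'TwoHours.csv' file,
# using the first three Orange rows from the printed output.
csv_text = """Name;Date;Time;Open
Orange;20200430;15:30:00;5.70
Orange;20200430;17:00:00;5.65
Orange;20200430;20:00:00;5.50
"""

df = pd.read_csv(io.StringIO(csv_text), sep=";")

# orient="list" maps each column name to a plain Python list of its values.
as_lists = df.to_dict(orient="list")

print(as_lists["Name"])
print(as_lists["Date"])
print(as_lists["Time"])
```

Note that `Date` is read as an integer column here because no dtype is specified.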
I created a sample file to illustrate my problem; it is also where the examples above come from.
import os
import pandas as pd
# Open 'TEST Tracker.xlsx' to find the entities to load
TEST = pd.ExcelFile(os.path.join("Trackers", "TEST Tracker.xlsx"))
df1 = TEST.parse("Entries")
values1 = df1[['Name', 'Location', 'Date', 'TimeO', 'TimeC', 'Check_2',
               'Open', 'High', 'Low', 'Close', 'Volume', 'VWAP', '$Volume', 'Trades']]
# Keep every row whose 'Check_2' column contains 'X' (na=False treats empty cells as no match)
rdf1 = values1[values1.Check_2.str.contains("X", na=False)]
# Print the dataframe to check it
print("First Dataframe")
print(rdf1)
# Dictionary that will hold one dataframe per selected row
Fruits = {}
# Columns to read from each .csv file
col_list = ['Name', 'Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume', 'VWAP', 'Trades']
# Load one dataframe for each row in rdf1, keyed by the row index as a string
for idx, row in rdf1.iterrows():
    fle = os.path.join('Entities', row['Location'], row['Name'], 'TwoHours.csv')
    df3 = pd.read_csv(fle, usecols=col_list, sep=";")
    Fruits[str(idx)] = df3[col_list]
print("Printing Fruits Dictionairy")
print(Fruits)
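As a point of comparison, here is a minimal sketch of what a dictionary of dataframes holds. It assumes two hypothetical in-memory stand-ins for the `TwoHours.csv` files (only a few columns and rows) rather than the real folder layout. Each dictionary value is a complete DataFrame, and `print` on the dictionary only shows the text form of each one.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for two semicolon-separated 'TwoHours.csv' files.
files = {
    "0": "Name;Date;Time;Close\nOrange;20200430;15:30:00;5.75\nOrange;20200430;17:00:00;5.80\n",
    "1": "Name;Date;Time;Close\nApple;20200504;10:00:00;3.75\nApple;20200504;12:00:00;3.80\n",
}

# One DataFrame per key, mirroring the Fruits dictionary above.
fruits = {key: pd.read_csv(io.StringIO(text), sep=";") for key, text in files.items()}

# Each value is a regular DataFrame, so columns can be selected as usual.
orange = fruits["0"]
print(type(orange).__name__)
print(orange["Close"].tolist())

# pd.concat on a dict uses the dict keys as the outer level of the row index.
combined = pd.concat(fruits)
print(combined.loc["1"]["Name"].tolist())
```

Whether to keep the separate dictionary entries or the single `combined` frame depends on how the data is used afterwards; both hold the same rows.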
The first dataframe, from which the file locations are retrieved, looks like this:
First Dataframe
Name Location Date TimeO ... Volume VWAP $Volume Trades
0 Orange New York 20200501 15:30:00 ... NaN NaN NaN NaN
1 Apple Minsk 20200505 15:30:00 ... NaN NaN NaN NaN
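I include it in case anyone is wondering.
I hope someone here can help me, because I have been struggling with this for a long time and everything I have tried so far has failed.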