I have a .csv file with more than 100 columns and several thousand rows:
> Datetime A B C D E ... FA FB
> 01.01.2014 00:00 15,15 15,15 32,43 15,15 33,27 82,59 1,38
> 01.01.2014 01:00 12,96 12,96 32,49 12,96 30,07 82,59 1,38
> 01.01.2014 02:00 12,09 12,09 28,43 12,09 23,01 82,59 1,38
> 01.01.2014 03:00 11,7 11,7 27,63 11,7 11,04 82,59 1,38
> 01.01.2014 04:00 11,66 11,66 25,99 11,66 9,09 82,59 1,38
> ... ... ... ... ... ... ... ...
> 01.10.2018 23:00 9,85 9,85 17,2 9,85 10,44 92,15 1,09
Now I need to extract this data column by column and export it to a sqlite3 database, one pair at a time:
Datetime and A
Datetime and B
Datetime and C
...
Datetime and FB
The goal is a database table that looks like this:
Datetime Value ID
> 01.01.2014 00:00 15,15 A
> 01.01.2014 01:00 12,96 A
> 01.01.2014 02:00 12,09 A
> ... ... ...
> 01.01.2014 00:00 1,38 FB
> 01.01.2014 01:00 1,38 FB
> 01.01.2014 02:00 1,38 FB
I managed to write some data using the following code:
import pandas as pd
from sqlalchemy import Column, DateTime, Numeric, String, create_engine
from sqlalchemy.orm import declarative_base

Base = declarative_base()

# Declaration of the class in order to write into the database.
# This structure is standard and should align with SQLAlchemy's doc.
class Values_1(Base):
    __tablename__ = 'Timeseries_Values'
    ID = Column(String, primary_key=True)  # series label such as 'A', so a string rather than an integer
    Date = Column(DateTime, primary_key=True)
    Value = Column(Numeric)

def main(fileToRead=r'data.csv'):
    # Set up of the table in db and the file to import
    tableToWriteTo = Values_1.__table__
    # Read only the first two columns ("Datetime and A") for now
    df = pd.read_csv(fileToRead, sep=';', decimal=',', usecols=[0, 1], parse_dates=[0], dayfirst=True)
    df.columns = ['Date', 'Value']
    df['ID'] = 'A'
    listToWrite = df.to_dict(orient='records')
    engine = create_engine('sqlite:///data.db')
    Base.metadata.create_all(engine)  # create the table if it does not exist yet
    # Insert all rows in one transaction
    with engine.begin() as conn:
        conn.execute(tableToWriteTo.insert(), listToWrite)

if __name__ == '__main__':
    main()
So this works for a single combination ("Datetime and A"), but how can I add all the other combinations automatically?
Thanks a lot in advance.
Answer 0 (score: 0)
This is only a partial answer, but the crux of the problem seems to be that you need to melt the DataFrame:
df
A B C D E
Datetime
2014-01-01 00:00:00 15,15 15,15 32,43 15,15 33,27
2014-01-01 01:00:00 12,96 12,96 32,49 12,96 30,07
2014-01-01 02:00:00 12,09 12,09 28,43 12,09 23,01
2014-01-01 03:00:00 11,7 11,7 27,63 11,7 11,04
2014-01-01 04:00:00 11,66 11,66 25,99 11,66 9,09
Reset the index and melt:
df1 = df.reset_index().melt('Datetime', var_name='ID', value_name='Value' )
Datetime ID Value
0 2014-01-01 00:00:00 A 15,15
1 2014-01-01 01:00:00 A 12,96
2 2014-01-01 02:00:00 A 12,09
3 2014-01-01 03:00:00 A 11,7
4 2014-01-01 04:00:00 A 11,66
5 2014-01-01 00:00:00 B 15,15
6 2014-01-01 01:00:00 B 12,96
7 2014-01-01 02:00:00 B 12,09
8 2014-01-01 03:00:00 B 11,7
9 2014-01-01 04:00:00 B 11,66
10 2014-01-01 00:00:00 C 32,43
11 2014-01-01 01:00:00 C 32,49
12 2014-01-01 02:00:00 C 28,43
13 2014-01-01 03:00:00 C 27,63
14 2014-01-01 04:00:00 C 25,99
15 2014-01-01 00:00:00 D 15,15
16 2014-01-01 01:00:00 D 12,96
17 2014-01-01 02:00:00 D 12,09
18 2014-01-01 03:00:00 D 11,7
19 2014-01-01 04:00:00 D 11,66
20 2014-01-01 00:00:00 E 33,27
21 2014-01-01 01:00:00 E 30,07
22 2014-01-01 02:00:00 E 23,01
23 2014-01-01 03:00:00 E 11,04
24 2014-01-01 04:00:00 E 9,09
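To round this off, here is a minimal sketch of how the melted frame could be written to SQLite in one go. It is not the asker's code: it assumes the file uses ';' as separator, ',' as decimal mark and day-first timestamps (as in the question's read_csv call), and the sample text, the helper names wide_csv_to_long and write_long_table, and the in-memory database are made up for illustration. Passing decimal=',' also means the values arrive as floats rather than strings like '15,15'.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical sample in the layout the question describes
# (';' separator, ',' decimal mark, day-first timestamps).
CSV_TEXT = """Datetime;A;B;FB
01.01.2014 00:00;15,15;15,15;1,38
01.01.2014 01:00;12,96;12,96;1,38
01.01.2014 02:00;12,09;12,09;1,38
"""


def wide_csv_to_long(source):
    """Read the wide CSV and return one row per (Datetime, ID) pair."""
    df = pd.read_csv(source, sep=";", decimal=",")
    df["Datetime"] = pd.to_datetime(df["Datetime"], format="%d.%m.%Y %H:%M")
    # Every column except Datetime becomes a value of the new ID column.
    return df.melt(id_vars="Datetime", var_name="ID", value_name="Value")


def write_long_table(long_df, conn, table="Timeseries_Values"):
    """Append the long-format rows to an SQLite table."""
    out = long_df.copy()
    # Store timestamps as ISO 8601 text, which sorts chronologically in SQLite.
    out["Datetime"] = out["Datetime"].dt.strftime("%Y-%m-%d %H:%M:%S")
    out.to_sql(table, conn, if_exists="append", index=False)


long_df = wide_csv_to_long(io.StringIO(CSV_TEXT))
conn = sqlite3.connect(":memory:")  # use a file path such as "data.db" in practice
write_long_table(long_df, conn)

# Read back one series to check the result.
rows = conn.execute(
    "SELECT Datetime, Value, ID FROM Timeseries_Values "
    "WHERE ID = 'FB' ORDER BY Datetime"
).fetchall()
print(rows)
```

Because melt handles every non-Datetime column at once, no loop over A ... FB is needed. For a real file, pass the path to wide_csv_to_long instead of the StringIO object.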