When I try to turn the result of a SQL query from pd.read_sql_query into a data frame with pd.DataFrame, my string values are converted to NaN.
I tried using dtypes to set the type of each column.
SQL query:
SQL_Query = pd.read_sql_query('''SELECT [CircuitID], [Status],
[LatestJiraTicket], [MrcNew]
FROM CircuitInfoTable
WHERE ([Status] = 'Active')
OR ([Status] = 'Pending')
OR ([Status] = 'Planned')''', conn)
# print(SQL_Query)
cdf = pd.DataFrame(SQL_Query, columns=['CID', 'Status', 'JiraTicket', 'MrcNew'])
DataFrame output:
0 OH1004-01 ... NaN
1 OH1004-02 ... NaN
2 OH1005-01 ... NaN
3 OH1005-02 ... NaN
4 AL1001-01 ... NaN
5 AL1001-02 ... NaN
6 AL1007-01 ... NaN
7 AL1007-02 ... NaN
8 NC1001-01 ... NaN
9 NC1001-02 ... NaN
10 NC1001-03 ... NaN
11 NC1001-04 ... NaN
12 NC1001-05 ... NaN
13 NC1001-06 ... NaN
14 (ommited on purpose) ... 5200.0
15 MO001-02 ... NaN
16 OR020-01 ... 8000.0
17 MA004-01 ... 6500.0
18 MA004-02 ... 6500.0
19 OR004-01 ... 10500.0
20 (ommited on purpose) ... 3975.0
21 OR007-01 ... 2500.0
22 (ommited on purpose) ... 9200.0
23 (ommited on purpose) ... 15000.0
24 (ommited on purpose) ... 5750.0
25 CA1005-02 ... 47400.0
26 CA1005-03 ... 47400.0
27 CA1005-04 ... 47400.0
28 CA1005-05 ... 47400.0
29 CA1006-01 ... 0.0
Answer 0 (score: 1)
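Basically, you are misusing the columns argument of pandas.DataFrame. That argument specifies which columns to select in the resulting output; it does not rename them. Your query returns no columns named CID or JiraTicket, so those two come back filled entirely with missing values.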
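A minimal sketch of this behavior, using a small hand-built frame in place of the real query result (the sample values below are made up for illustration and are not from the original table):

```python
import pandas as pd

# Hypothetical stand-in for what pd.read_sql_query returns; values are invented.
sql_result = pd.DataFrame({
    "CircuitID": ["X-01", "X-02"],
    "Status": ["Active", "Pending"],
    "LatestJiraTicket": ["T-1", "T-2"],
    "MrcNew": [100.0, 200.0],
})

# columns= selects by label. 'CID' and 'JiraTicket' are not labels in
# sql_result, so they are created as new all-missing columns.
cdf = pd.DataFrame(sql_result, columns=["CID", "Status", "JiraTicket", "MrcNew"])

print(cdf["CID"].isna().all())         # the unmatched label holds only missing values
print(cdf["JiraTicket"].isna().all())  # same for the other unmatched label
print(cdf["Status"].tolist())          # a matching label keeps its data
```

Columns whose labels match the query (Status, MrcNew) keep their data, which is consistent with the output in the question where only some columns are NaN.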
You probably meant to rename the columns. Consider renaming in SQL with column aliases, or renaming in pandas with rename or set_axis:
SQL
SELECT [CircuitID] AS [CID],
[Status],
[LatestJiraTicket] AS JiraTicket,
[MrcNew]
FROM CircuitInfoTable
WHERE ([Status] = 'Active')
OR ([Status] = 'Pending')
OR ([Status] = 'Planned')
Pandas
cdf = (pd.read_sql_query(...original query...)
.rename(columns={'CircuitID': 'CID', 'LatestJiraTicket': 'JiraTicket'})
)
cdf = (pd.read_sql_query(...original query...)
.set_axis(['CID', 'Status', 'JiraTicket', 'MrcNew'], axis='columns')  # returns a new frame; the inplace argument was removed in pandas 2.0
)
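To check that the three approaches give the same result, here is a self-contained sketch. It assumes an in-memory SQLite database standing in for the real connection, with invented rows; the table and column names follow the question, and everything else is hypothetical:

```python
import sqlite3
import pandas as pd

# In-memory SQLite stand-in for the real connection (an assumption; rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CircuitInfoTable "
    "(CircuitID TEXT, Status TEXT, LatestJiraTicket TEXT, MrcNew REAL)"
)
conn.executemany(
    "INSERT INTO CircuitInfoTable VALUES (?, ?, ?, ?)",
    [
        ("X-01", "Active", "T-1", 100.0),
        ("X-02", "Retired", "T-2", 200.0),
        ("X-03", "Planned", "T-3", 300.0),
    ],
)

# IN (...) is equivalent to the chain of OR conditions in the question.
where = "WHERE Status IN ('Active', 'Pending', 'Planned') ORDER BY CircuitID"
base = f"SELECT CircuitID, Status, LatestJiraTicket, MrcNew FROM CircuitInfoTable {where}"
aliased = (
    "SELECT CircuitID AS CID, Status, LatestJiraTicket AS JiraTicket, MrcNew "
    f"FROM CircuitInfoTable {where}"
)

new_names = ["CID", "Status", "JiraTicket", "MrcNew"]

# Option 1: rename in SQL with column aliases.
via_sql = pd.read_sql_query(aliased, conn)
# Option 2: rename selected columns in pandas.
via_rename = pd.read_sql_query(base, conn).rename(
    columns={"CircuitID": "CID", "LatestJiraTicket": "JiraTicket"}
)
# Option 3: replace all column labels positionally.
via_set_axis = pd.read_sql_query(base, conn).set_axis(new_names, axis="columns")

print(via_sql.equals(via_rename) and via_rename.equals(via_set_axis))
```

rename is the safer choice when only a few labels change, since set_axis relies on the column order and count matching the query exactly.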