Question

 c1,  c2,  c3,  c4,  c5,  ID     seq
2020 2020 2020 2020 2020 1212     1
2021 2020 2021 2020 2021 1212     2
2022 2020 2022 2020 2022 1212     3
2023 2020 2023 2020 2023 1313     1
2024 2020 2024 2020 2024 1313     2
2025 2020 2025 2020 2025 1313     3
2026 2020 2026 2020 2026 1313     4
2026 2020 2026 2020 2026 1313     5

The script that imports the data:

# Python code to demonstrate SQL to fetch data.

# importing the module
import sqlite3
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
from scipy.stats import chisquare

# connect to the database file
connection = sqlite3.connect(r"C:\Users\Aidan\Desktop\cep_db.db")

# cursor object
crsr = connection.cursor()


dog= crsr.execute("Select s, ei, ki FROM cep_db_lite1_vc WHERE s IN ('d')")
ans= crsr.fetchall() 

# convert the fetched rows to a NumPy array and slice off the first column
dogData = np.array(ans)
FdogData = dogData[:, 1:]
FdogData.astype(float)
x, y =FdogData[:,0], FdogData[:,1]

# Reshaping
x, y = x.reshape(-1,1), y.reshape(-1, 1)

# Linear Regression Object 
lin_regression = LinearRegression()

# Fitting linear model to the data
lin_regression.fit(x,y)

# Get slope of fitted line
m = lin_regression.coef_

# Get y-Intercept of the Line
b = lin_regression.intercept_

# Get Predictions for original x values
# you can also get predictions for new data
predictions = lin_regression.predict(x)
chi= chisquare(predictions, y)

# following slope intercept form 
print ("formula: y = {0}x + {1}".format(m, b)) 
print(chi)

# Plot the Original Model (Black) and Predictions (Blue)
plt.scatter(x, y,  color='black')
plt.plot(x, predictions, color='blue',linewidth=3)
plt.show()

This should be easy to fix, but I have no idea how.

I get an error. The fetched rows look like this:

('d', '-72.70', '3.20')
('d', '-74.81', '2.00')
('d', '-87.60', '5.50')
('d', '-91.38', '2.00')
('d', '-71.80', '2.00')
('d', '-73.10', '2.00')
('d', '-81.20', '2.00')
('d', '-81.40', '2.00')
('d', '-75.70', '5.70')
('d', '-83.50', '5.10')
('d', '-73.90', '2.00')
('d', '-82.60', '2.00')
('d', '-77.30', '2.00')
('d', '-85.10', '2.00')
('d', '-79.70', '2.00')
('d', '-78.70', '2.00')
('d', '-77.90', '2.00')
('d', '-76.80', '2.00')
('d', '-83.80', '2.00')
('d', '-83.90', '2.00')
('d', '-82.00', '4.90')
('d', '-80.00', '4.80')

As far as I can tell, the script cannot convert the letter "d" to a float, because it is a letter rather than a number.
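A small sketch of what may be going on, assuming the rows come back as tuples of strings exactly as printed above (the two sample rows are copied from that output). It shows two separate things: converting the whole array fails on the "d" column, and `astype` returns a new array rather than changing the existing one, so a call whose result is not assigned leaves the data as strings.

```python
import numpy as np

# Two rows copied from the fetched output above.
rows = [("d", "-72.70", "3.20"), ("d", "-74.81", "2.00")]
data = np.array(rows)  # every element is stored as a string

# Converting the whole array raises ValueError because of the 'd' column.
try:
    data.astype(float)
    whole_array_ok = True
except ValueError:
    whole_array_ok = False

# astype() returns a new array; the original slice is left unchanged.
sliced = data[:, 1:]
sliced.astype(float)  # result is discarded here
still_strings = sliced.dtype.kind == "U"

print(whole_array_ok, still_strings)
```

If this matches the real data, the line `FdogData.astype(float)` in the script would be the place to look, since its result is never stored.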

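How can I ignore the first column of the data after importing it? I am fairly sure I have already sliced it off. I just want to build an array from columns 2 and 3 and use it for data analysis and plotting.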
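One possible way to get a numeric array from columns 2 and 3, sketched under the assumption that `fetchall()` returns tuples of strings like the rows listed above. The three sample rows are copied from that output; the variable names `numeric`, `x` and `y` are only illustrative.

```python
import numpy as np

# Sample rows copied from the fetched output; fetchall() returns a list like this.
rows = [
    ("d", "-72.70", "3.20"),
    ("d", "-74.81", "2.00"),
    ("d", "-87.60", "5.50"),
]

data = np.array(rows)  # array of strings

# Drop the first column, then convert. The converted array must be
# assigned, because astype() does not modify the array in place.
numeric = data[:, 1:].astype(float)

# Column vectors in the shape the regression code in the script expects.
x = numeric[:, 0].reshape(-1, 1)
y = numeric[:, 1].reshape(-1, 1)

print(numeric.dtype, x.shape, y.shape)
```

Another option would be to leave the `s` column out of the `SELECT` list, so the letter never reaches NumPy in the first place; the strings would still need converting to float either way.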

Conversion error in my Python project

0 answers: