PostgreSQL: Rows out of order in a table

Time: 2016-12-15 15:14:48

Tags: sql windows postgresql dataset sql-order-by

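I am using PostgreSQL 9.6.1 on a Windows 7 laptop to compile and analyze large datasets from different sources. One of my customers noticed that, in the final report I provided to them, some people from her state were listed among the records of other states.

For this report, I create the final table with the following command: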

CREATE UNLOGGED TABLE LPIS_IssuanceDetail (
  ID SERIAL PRIMARY KEY,
  Zone TEXT DEFAULT NULL,
  State TEXT DEFAULT NULL,
  LastName TEXT DEFAULT NULL,
  FirstName TEXT DEFAULT NULL,
  Email TEXT DEFAULT NULL,
  UPN TEXT DEFAULT NULL,
  LincPassUsed TEXT DEFAULT NULL,
  EmployeeID TEXT DEFAULT NULL,
  EmploymentType TEXT DEFAULT NULL,
  NonEmployeeCategory TEXT DEFAULT NULL,
  EmploymentStatus TEXT DEFAULT NULL,
  ISAComplete TEXT DEFAULT NULL,
  ISACompletionDate TIMESTAMP WITHOUT TIME ZONE,
  LincPassStatus TEXT DEFAULT NULL,
  ERO TEXT DEFAULT NULL,
  Sponsored TEXT DEFAULT NULL,
  Enrolled TEXT DEFAULT NULL,
  Adjudicated TEXT DEFAULT NULL,
  ShipToSite TEXT DEFAULT NULL,
  ValidSite TEXT DEFAULT NULL,
  CardExpiration DATE,
  CertExpiration DATE,
  LastEnrollment DATE,
  EnrollmentExpiration DATE,
  NewEnrollment TEXT DEFAULT NULL,
  Sponsor TEXT DEFAULT NULL,
  ContractEnd DATE,
  ContractID TEXT DEFAULT NULL,
  ContractPOC TEXT DEFAULT NULL
);

I then populate this table with data from the master data table:

INSERT INTO LPIS_IssuanceDetail (
  Zone, State, LastName, FirstName, Email, UPN, LincPassUsed, EmployeeID,
  EmploymentType, NonEmployeeCategory, EmploymentStatus, ISAComplete,
  ISACompletionDate, LincPassStatus, ERO, Sponsored, Enrolled, Adjudicated,
  ShipToSite, ValidSite, CertExpiration, LastEnrollment, EnrollmentExpiration,
  CardExpiration, NewEnrollment, Sponsor, ContractEnd, ContractID, ContractPOC
)
SELECT
  Zone, StateName, MAS_LastName, MAS_FirstName, MAS_Email, MAS_UPN,
  LincPassUsed, MAS_EmployeeID, MAS_Category, MAS_OrgRelType,
  MAS_EmploymentStatus, ISAComplete, ISA_CompletionDate, MAS_IssuanceStatus,
  MAS_FedEmerResponse, Sponsored, Enrolled, Adjudicated, MAS_ShipToCityState,
  MAS_ValidShipToSite, MAS_CertExpireDate, MAS_LastEnrollmentDate, MAS_EnrollExpireDate,
  MAS_CardExpireDate, MAS_NewEnrollment, MAS_Sponsor, MAS_PeriodofPerformanceEndDate,
  MAS_ContractID, MAS_ContractPOC
FROM LPIS_MasterData
ORDER BY Zone, StateName, MAS_LastName, MAS_FirstName;

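Sure enough, when I scrolled down through the table, I found individual records interspersed out of sequence, as in this sample, where one record from Maine is out of place: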

id     | zone | state   | lastname | firstname
11849  | 3    | Georgia | Roberts  | George
11850  | 3    | Georgia | Smith    | Dan
11922  | 3    | Maine   | Edwards  | John
11851  | 3    | Georgia | Snowden  | Ed
11852  | 3    | Georgia | Williams | Casey

As a test, I transferred only the first four columns into a separate table, as follows:

CREATE UNLOGGED TABLE LPIS_DetailTest (
  ID SERIAL PRIMARY KEY,
  Zone TEXT DEFAULT NULL,
  State TEXT DEFAULT NULL,
  LastName TEXT DEFAULT NULL,
  FirstName TEXT DEFAULT NULL
);

INSERT INTO LPIS_DetailTest (
  Zone, State, LastName, FirstName
)
SELECT
  Zone, State, LastName, FirstName
FROM LPIS_IssuanceDetail
ORDER BY Zone, State, LastName, FirstName;

All rows appear in the expected order:

id     | zone | state   | lastname | firstname
11849  | 3    | Georgia | Roberts  | George
11850  | 3    | Georgia | Smith    | Dan
11851  | 3    | Georgia | Snowden  | Ed
11852  | 3    | Georgia | Williams | Casey
11853  | 3    | Georgia | Spaid    | Dennis

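Why is this smaller table ordered correctly, when the larger table, populated with the same ORDER BY clause, has some rows out of order?

The database and all tables are set to UTF8.

I have checked everything I can think of and cannot figure out why the ORDER BY clause produces such strange results. What else can I check?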
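One way to narrow this down is to check whether the ID values themselves follow the intended sort order, independently of the order in which rows happen to be displayed. The query below is a minimal sketch, not part of the original post: the aliases expected_pos and id_pos are hypothetical names, and it assumes the State column holds the same values that StateName supplied during the insert.

```sql
-- Sketch: list rows whose position by ID differs from their position
-- in the intended sort order. Ties in the sort key are broken by ID so
-- that duplicate names do not produce false mismatches.
SELECT id, zone, state, lastname, firstname
FROM (
    SELECT id, zone, state, lastname, firstname,
           row_number() OVER (ORDER BY zone, state, lastname, firstname, id) AS expected_pos,
           row_number() OVER (ORDER BY id) AS id_pos
    FROM LPIS_IssuanceDetail
) AS t
WHERE expected_pos <> id_pos
ORDER BY id;
```

If this returns no rows, the IDs were assigned in the intended order and only the display order of an unordered read differs. In the sample above, the Maine record carries ID 11922, higher than the surrounding Georgia IDs, which is consistent with that reading.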

Edit: additional information

In my script, immediately after the INSERT INTO ... SELECT ... statement, I use COPY to dump the data to a CSV file, as follows:

-- Export data to CSV files
COPY LPIS_IssuanceDetail (
    Zone, State, LastName, FirstName, Email, UPN, LincPassUsed, EmployeeID,
    EmploymentType, NonEmployeeCategory, EmploymentStatus, ISAComplete,
    ISACompletionDate, LincPassStatus, ERO, Sponsored, Enrolled, Adjudicated,
    ShipToSite, ValidSite, CertExpiration, LastEnrollment, EnrollmentExpiration,
    CardExpiration, NewEnrollment, Sponsor, ContractEnd, ContractID, ContractPOC
)
TO 'C:/Users/Michael.Sheaver/Documents/LincPass/Datasets/Compiled Reports/LPIS_IssuanceDetail.csv'
WITH (
    FORMAT CSV,
    DELIMITER ',',
    NULL '',
    HEADER TRUE,
    QUOTE '"',
    ENCODING 'UTF8'
);

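Then, when I import this CSV file into a spreadsheet for final presentation, I have to manually sort the data on the ID column and then delete that column.

New question: Is there any option I can use in the INSERT INTO statement to strictly preserve the order of the rows, so that it follows what is specified in the ORDER BY clause?

1 Answer:

Answer 0: (score: 1)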

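A table has no guaranteed row order, so a read without ORDER BY (including COPY of a table) may return rows in any order. If you want the data in the CSV file to be sorted, use COPY with a SELECT query that has its own ORDER BY: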

COPY (
    SELECT Zone, State, LastName, FirstName, Email, UPN, LincPassUsed, EmployeeID,
        EmploymentType, NonEmployeeCategory, EmploymentStatus, ISAComplete,
        ISACompletionDate, LincPassStatus, ERO, Sponsored, Enrolled, Adjudicated,
        ShipToSite, ValidSite, CertExpiration, LastEnrollment, EnrollmentExpiration,
        CardExpiration, NewEnrollment, Sponsor, ContractEnd, ContractID, ContractPOC
    FROM LPIS_IssuanceDetail
    ORDER BY Zone, State, LastName, FirstName
)
TO 'C:/Users/Michael.Sheaver/Documents/LincPass/Datasets/Compiled Reports/LPIS_IssuanceDetail.csv'
WITH (FORMAT CSV, DELIMITER ',',  NULL '', HEADER TRUE, QUOTE '"', ENCODING 'UTF8');
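The same idea applies to any read of the table, not only the export. As a minimal sketch, assuming the serial ID values were assigned in the intended sort order (the sample rows in the question suggest this but do not prove it), ordering by ID at export time would reproduce that order without writing the ID column to the file. Only four columns are listed for brevity, and TO STDOUT stands in for the file path; both are simplifications for illustration.

```sql
-- Sketch: sort at export time. ID drives the order but is not exported,
-- so no manual sort or column deletion is needed in the spreadsheet.
COPY (
    SELECT Zone, State, LastName, FirstName
    FROM LPIS_IssuanceDetail
    ORDER BY ID
)
TO STDOUT
WITH (FORMAT CSV, HEADER TRUE);
```

Sorting on the business columns, as in the statement above, is the more direct choice because it does not depend on how the IDs were assigned.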