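Importing a CSV into a SQLite database from the command line or Python

Asked: 2013-02-18 23:28:43

Tags: python sqlite

At Bernie's request, I have tried to reduce this to a simpler example:

I have a CSV file containing one month of dates, with the days of the week as column headers: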

Monday,Tuesday,Wednesday,Thursday,Friday,Saturday,Sunday
1,2,3,4,5,6,7
8,9,10,11,12,13,14
15,16,17,18,19,20,21
22,23,24,25,26,27,28

At the command line, I created a SQLite table, days:

sqlite> CREATE TABLE days(
   ...> Monday int,
   ...> Tuesday int,
   ...> Wednesday int,
   ...> Thursday int,
   ...> Friday int,
   ...> Saturday int,
   ...> Sunday int
   ...> );

When I try to import the data from the CSV, this is what I get:

sqlite> .import example.csv days
Error: example.csv line 1: expected 7 columns of data but found 1

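How can I import this CSV file into the database so that each new row is recognized? Thanks!

3 answers:

Answer 0 (score: 1)

Before running the .import command, you need to include the following line: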

.separator ,

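This tells the import command which separator to look for (a comma in this case).

You can find more information about the sqlite command-line commands here: http://www.sqlite.org/sqlite.html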
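
Since the question is also tagged python, here is a minimal sketch of the same import using only the standard library csv and sqlite3 modules. The function name import_days and the DAYS list are made up for illustration; the sketch assumes the file is comma-separated, starts with the header row shown in the question, and has exactly seven fields on every data row.

```python
import csv
import sqlite3

# Column names from the question's CSV header (assumed to match the file).
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]


def import_days(csv_path, conn):
    """Load a comma-separated file with a header row into the days table."""
    column_defs = ", ".join(f"{day} int" for day in DAYS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS days({column_defs})")

    with open(csv_path, newline="") as f:
        reader = csv.reader(f, delimiter=",")
        header = next(reader)  # consume the header so it is not inserted as data
        if header != DAYS:
            raise ValueError(f"unexpected header: {header!r}")
        # Skip blank lines, which csv.reader yields as empty lists.
        rows = (row for row in reader if row)
        placeholders = ", ".join("?" for _ in DAYS)
        conn.executemany(f"INSERT INTO days VALUES ({placeholders})", rows)
    conn.commit()
```

Calling import_days("example.csv", sqlite3.connect("example.db")) should then fill the table. The values arrive as strings, but because the columns are declared int, SQLite's type affinity stores them as integers.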

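Answer 1 (score: 0)

The SQLite shell is rather finicky. The current version does not handle header rows, and its behavior diverges between the standard (RFC 4180) and common practice. The upcoming 3.8 release will handle header rows.

Since you are using Python, you may find the APSW shell useful (disclosure: I am the author). You can use it from the command line just like the SQLite shell, and you can also use it programmatically, including adding your own commands.

Notably, it has an autoimport command that works everything out for you, including headers, separators, data types, and so on.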

sqlite> .help autoimport

.autoimport FILENAME ?TABLE?  Imports filename creating a table and
                              automatically working out separators and data
                              types (alternative to .import command)

The import command requires that you precisely pre-setup the table and schema,
and set the data separators (eg commas or tabs).  In many cases this information
can be automatically deduced from the file contents which is what this command
does.  There must be at least two columns and two rows.

If the table is not specified then the basename of the file will be used.

Additionally the type of the contents of each column is also deduced - for
example if it is a number or date.  Empty values are turned into nulls.  Dates
are normalized into YYYY-MM-DD format and DateTime are normalized into ISO8601
format to allow easy sorting and searching.  4 digit years must be used to
detect dates.  US (swapped day and month) versus rest of the world is also
detected providing there is at least one value that resolves the ambiguity.

Care is taken to ensure that columns looking like numbers are only treated as
numbers if they do not have unnecessary leading zeroes or plus signs.  This is
to avoid treating phone numbers and similar number like strings as integers.

This command can take quite some time on large files as they are effectively
imported twice.  The first time is to determine the format and the types for
each column while the second pass actually imports the data.
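
To make the two-pass idea in that help text more concrete, here is a rough standard-library sketch. It is not APSW's implementation: auto_import, guess_type, and quote_ident are hypothetical names, date handling is left out entirely, and the sketch assumes every row has as many fields as the header.

```python
import csv
import re
import sqlite3

# Numbers with leading zeroes or plus signs deliberately do not match,
# so values such as phone numbers stay text.
INT_RE = re.compile(r"-?(0|[1-9]\d*)")
REAL_RE = re.compile(r"-?(0|[1-9]\d*)\.\d+")


def guess_type(values):
    """Guess INTEGER, REAL, or TEXT from a column's non-empty strings."""
    present = [v for v in values if v != ""]
    if present and all(INT_RE.fullmatch(v) for v in present):
        return "INTEGER"
    if present and all(INT_RE.fullmatch(v) or REAL_RE.fullmatch(v)
                       for v in present):
        return "REAL"
    return "TEXT"


def quote_ident(name):
    """Quote a table or column name for use in SQL."""
    return '"' + name.replace('"', '""') + '"'


def auto_import(csv_path, table, conn):
    """Create `table` from the CSV header and load the rows in two passes."""
    # Pass 1: detect the separator, read the header, guess column types.
    with open(csv_path, newline="") as f:
        dialect = csv.Sniffer().sniff(f.read(4096), delimiters=",;\t|")
        f.seek(0)
        reader = csv.reader(f, dialect)
        header = next(reader)
        columns = [[] for _ in header]
        for row in reader:
            for column, value in zip(columns, row):
                column.append(value)
    types = [guess_type(column) for column in columns]

    col_defs = ", ".join(f"{quote_ident(n)} {t}" for n, t in zip(header, types))
    conn.execute(f"CREATE TABLE {quote_ident(table)} ({col_defs})")

    # Pass 2: re-read the file and insert the rows; empty values become NULL.
    placeholders = ", ".join("?" for _ in header)
    with open(csv_path, newline="") as f:
        reader = csv.reader(f, dialect)
        next(reader)  # skip the header row
        rows = ([v if v != "" else None for v in row] for row in reader if row)
        conn.executemany(
            f"INSERT INTO {quote_ident(table)} VALUES ({placeholders})", rows
        )
    conn.commit()
    return dict(zip(header, types))
```

For the file in the question, auto_import("example.csv", "days", conn) would create the days table itself, so the CREATE TABLE step is not needed beforehand.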

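Answer 2 (score: 0)

Take a look at termsql, a tool made for exactly this purpose. Your task is simple with it.

Manual: http://tobimensch.github.io/termsql/

Check the examples at the bottom of the page. There is a CSV import there; also take a look at the different options.

Project: https://github.com/tobimensch/termsql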