Very long string in a Python numpy array

Asked: 2016-11-23 13:28:58

Tags: python arrays csv numpy

I use numpy to read data from a CSV file.

The table looks like this:

Name; Birthdate; Biography
John; 1990; Lorem ipsum dolor sit amet, consectetur adipiscing elit. Hanc ergo intuens debet institutum illud quasi signum absolvere. Scrupulum, inquam, abeunti; Quae diligentissime contra Aristonem dicuntur a Chryippo. Quo tandem modo?

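Python and numpy seem to have a problem with this long string. Any ideas on how to fix it?

4 answers:

Answer 0: (score: 1)

You can use Python's pandas package.

Here is a simple idea of how to use it:

import pandas as pd

data = pd.read_csv("file.csv", delimiter=";")

Hope this is what you want...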

Answer 1: (score: 0)

Please use the pandas package to read the CSV file.

pandas can handle long strings as well.

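Answer 2: (score: 0)

I had no problem reading it, so perhaps your issue is formatting it in a way that is suitable for printing. Here are a couple of options.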

>>> import textwrap
>>> a = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Hanc ergo intuens debet institutum illud quasi signum absolvere. Scrupulum, inquam, abeunti; Quae diligentissime contra Aristonem dicuntur a Chryippo. Quo tandem modo?"
>>> txt = textwrap.wrap(a, width=70)
>>> print(("{}\n"*len(txt)).format(*txt))
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Hanc ergo
intuens debet institutum illud quasi signum absolvere. Scrupulum,
inquam, abeunti; Quae diligentissime contra Aristonem dicuntur a
Chryippo. Quo tandem modo?

Or perhaps this:

>>> txt2 = "\n".join(txt)
>>> print(txt2)
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Hanc ergo
intuens debet institutum illud quasi signum absolvere. Scrupulum,
inquam, abeunti; Quae diligentissime contra Aristonem dicuntur a
Chryippo. Quo tandem modo?
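As a possible shortcut, assuming all that is needed is the wrapped text as a single string, textwrap.fill does the wrap and the join in one call. A minimal sketch using the same sample string:

```python
import textwrap

a = ("Lorem ipsum dolor sit amet, consectetur adipiscing elit. Hanc ergo "
     "intuens debet institutum illud quasi signum absolvere. Scrupulum, "
     "inquam, abeunti; Quae diligentissime contra Aristonem dicuntur a "
     "Chryippo. Quo tandem modo?")

# fill() wraps the text and joins the resulting lines with newlines.
wrapped = textwrap.fill(a, width=70)
print(wrapped)
```

The result is the same string that "\n".join(textwrap.wrap(a, width=70)) builds.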

Answer 3: (score: 0)

The error is:

In [67]: np.recfromtxt('stack40765849.txt', delimiter=';', dtype=str)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-67-eab6d3192d4d> in <module>()
----> 1 np.recfromtxt('stack40765849.txt', delimiter=';', dtype=str)

/usr/lib/python3/dist-packages/numpy/lib/npyio.py in recfromtxt(fname, **kwargs)
   1949     kwargs.setdefault("dtype", None)
   1950     usemask = kwargs.get('usemask', False)
-> 1951     output = genfromtxt(fname, **kwargs)
   1952     if usemask:
   1953         from numpy.ma.mrecords import MaskedRecords
...
ValueError: Some errors were detected !
    Line #2 (got 4 columns instead of 3)

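(Note that recfromtxt uses genfromtxt, which has been discussed a lot here.)

The problem is not the length of the string; it is the number of delimiters. The first line (the header?) has 2, which tells genfromtxt to expect 3 columns or fields. But the second line has 3; the extra one is probably part of the text.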
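A quick way to confirm this is to count the fields each line would yield. This is only a sketch: the `lines` list is a stand-in for the file's contents, with the biography abbreviated.

```python
# Abbreviated stand-in for the file; the biography contains an extra ';'.
lines = [
    "Name; Birthdate; Biography",
    "John; 1990; Lorem ipsum dolor sit amet. Scrupulum, inquam, abeunti; Quae diligentissime.",
]

# Number of fields = number of delimiters + 1.
counts = [line.count(";") + 1 for line in lines]

for lineno, n in enumerate(counts, start=1):
    print("line", lineno, "->", n, "fields")
```

Any data line whose count differs from the header's is one that genfromtxt will reject.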

Treating the first line as field names produces the same error.

np.recfromtxt('stack40765849.txt', delimiter=';', dtype=str, names=True)

pandas loads it as:

In [74]: data=pandas.read_csv('stack40765849.txt',delimiter=';')
In [75]: data
Out[75]: 
      Name                                          Birthdate  \
John  1990   Lorem ipsum dolor sit amet, consectetur adipi...   

                                              Biography  
John   Quae diligentissime contra Aristonem dicuntur...  

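It does not raise an error, but the result is not right: 'John' has become the index, and the biography is split across two columns.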

==================

If I change the ; in the text to a .

In [82]: np.genfromtxt('stack40765849_1.txt', delimiter=';', dtype=None, names=True)
Out[82]: 
array((b'John', 1990, b' Lorem ipsum dolor sit amet, consectetur adipiscing elit. Hanc ergo intuens debet institutum illud quasi signum absolvere. Scrupulum, inquam, abeunti. Quae diligentissime contra Aristonem dicuntur a Chryippo. Quo tandem modo?'), 
      dtype=[('Name', 'S4'), ('Birthdate', '<i4'), ('Biography', 'S225')])

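I get a structured array with 3 fields (almost like a recarray); the last one is very long, the full text. (b'...' denotes a bytestring in Py3; it does not appear in the Py2 display.)

pandas produces something similar: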

In [83]: data=pandas.read_csv('stack40765849_1.txt',delimiter=';')
In [84]: data
Out[84]: 
   Name   Birthdate                                          Biography
0  John        1990   Lorem ipsum dolor sit amet, consectetur adipi...

To load the strings as unicode in Py3:

In [91]: np.recfromtxt('stack40765849_1.txt', delimiter=';', dtype='U4,i,U255', names=True)
Out[91]: 
rec.array(('John', 1990, ' Lorem ipsum dolor sit amet, consectetur adipiscing elit. Hanc ergo intuens debet institutum illud quasi signum absolvere. Scrupulum, inquam, abeunti. Quae diligentissime contra Aristonem dicuntur a Chryippo. Quo tandem modo?'), 
          dtype=[('Name', '<U4'), ('Birthdate', '<i4'), ('Biography', '<U255')])
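If editing the text is not an option, one possible workaround is to split each line on the first two delimiters only, so that any further ; stays inside the last field. This is a sketch under assumptions: the split_row helper, the abbreviated sample rows, and the field widths (U20, U255) are illustrative and not taken from the original file.

```python
import numpy as np

# Abbreviated stand-in for the file; the biography contains an extra ';'.
lines = [
    "Name; Birthdate; Biography",
    "John; 1990; Lorem ipsum dolor sit amet. Scrupulum, inquam, abeunti; Quae diligentissime.",
]

def split_row(line, n_fields=3, delimiter=";"):
    """Hypothetical helper: split at most n_fields - 1 times, strip whitespace."""
    return tuple(part.strip() for part in line.split(delimiter, n_fields - 1))

names = split_row(lines[0])
rows = [split_row(line) for line in lines[1:]]

# Field widths are illustrative guesses; adjust them to the real data.
dtype = [(names[0], "U20"), (names[1], "i4"), (names[2], "U255")]
data = np.array([(name, int(year), bio) for name, year, bio in rows], dtype=dtype)

print(data["Name"][0], data["Birthdate"][0])
print(data["Biography"][0])
```

This only works because the free-text field is the last column; a ; inside an earlier field would still need quoting or a different delimiter.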