Question

I have a whitespace-separated csv file that looks like this:

5.64E-4   0.1259   3.556E-4   300
2.98E-4   4.7E-3   5.322E-4   270

I read it with pandas like this:

df1 = pandas.read_csv(filepath[0], header=None, delim_whitespace=True, lineterminator='\r')

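But I realized that pandas stores the DataFrame values as strings, because it does not know what the E means. Can I somehow import the csv file and have it converted to numbers so that I can plot it?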

Answer 1

Force the values to be parsed as floats on read by passing dtype:

import pandas
import numpy as np

pandas.read_csv(filepath[0], header=None,
                delim_whitespace=True, lineterminator='\r',
                dtype=np.float64)

This works with an uppercase 'E'.

Example

import pandas as pd

pd.DataFrame({'a': ['5.64E-4', '0.1259', '3.556E-4'],
              'b': ['a', 'b', 'c']}).astype({'a': np.float64})

Output

          a  b
0  0.000564  a
1  0.125900  b
2  0.000356  c
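
As a quick check of the dtype approach against the question's own two rows, here is a minimal sketch that reads them from memory rather than from a file. The io.StringIO buffer and the variable names are only for illustration, and sep=r"\s+" is used on the assumption that it behaves like delim_whitespace=True, which newer pandas versions deprecate.

```python
import io

import numpy as np
import pandas as pd

# The two sample rows from the question, held in memory instead of a file.
sample = (
    "5.64E-4   0.1259   3.556E-4   300\n"
    "2.98E-4   4.7E-3   5.322E-4   270\n"
)

# sep=r"\s+" splits on runs of whitespace; dtype forces every column to float64.
df = pd.read_csv(io.StringIO(sample), header=None, sep=r"\s+", dtype=np.float64)

print(df.dtypes)  # every column should report float64
print(df)
```

If a column cannot be converted, read_csv raises an error with dtype set this way, which makes stray non-numeric cells easy to notice.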

Answer 2

In my opinion, the problem is that some of the values are not numeric.

A possible solution is to use to_numeric with errors='coerce', which parses non-numeric values to NaN. Call it through apply, because to_numeric only works on a single column (a Series):

print (df)
         0       1         2    3
0  5.64E-4  0.1259  3.556E-4  300
1  2.98E-4  4.7E-3       AAA  270

df = df.apply(pd.to_numeric, errors='coerce')
print (df)
          0       1         2    3
0  0.000564  0.1259  0.000356  300
1  0.000298  0.0047       NaN  270
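
Since errors='coerce' silently replaces bad cells with NaN, it can help to list which cells failed. The following is a rough sketch of one way to do that, using the same small frame as in this answer; the names converted, failed and bad_cells are made up for the example.

```python
import pandas as pd

# Same frame as in the answer: one cell ("AAA") is not a number.
df = pd.DataFrame({0: ["5.64E-4", "2.98E-4"],
                   1: ["0.1259", "4.7E-3"],
                   2: ["3.556E-4", "AAA"],
                   3: ["300", "270"]})

converted = df.apply(pd.to_numeric, errors="coerce")

# A cell that held a value before conversion but is NaN afterwards failed to parse.
failed = converted.isna() & df.notna()

# Collect (row position, column position, original text) for each failed cell.
rows, cols = failed.to_numpy().nonzero()
bad_cells = [(int(r), int(c), df.iat[r, c]) for r, c in zip(rows, cols)]
print(bad_cells)
```

Looking at the original text of the failed cells usually shows whether the problem is a stray label, a different decimal separator, or something else.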

Answer 3

Because the other methods did not work for me without parsing everything as NaN, I am posting another way to read this variant of scientific notation.

import pandas as pd

# read the file; values in this notation come in as strings
data = pd.read_csv(file_path)
# normalize the notation across the whole dataframe: 'E' -> 'e', decimal ',' -> '.'
data = data.replace('E', 'e', regex=True).replace(',', '.', regex=True)
# convert the strings to numbers; anything that cannot be parsed becomes NaN
data = data.apply(pd.to_numeric, errors='coerce')

This may not be a very pythonic way, but it works for me.
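
For anyone who wants to try this on something small, here is a self-contained sketch of the same idea. The sample text is hypothetical (semicolon-separated, decimal commas, capital E); the actual file behind this answer is not shown. It reads everything as text and replaces only the decimal comma, on the assumption that to_numeric already accepts an uppercase E.

```python
import io

import pandas as pd

# Hypothetical sample: semicolon-separated, decimal commas, capital E.
raw = "x;y\n5,64E-4;300\n2,98E-4;270\n"

# dtype=str keeps every cell as text so the replacement applies uniformly.
data = pd.read_csv(io.StringIO(raw), sep=";", dtype=str)

# Swap the decimal comma for a point, column by column (literal, not regex).
data = data.apply(lambda col: col.str.replace(",", ".", regex=False))

# Convert the cleaned strings to numbers; unparseable cells become NaN.
data = data.apply(pd.to_numeric, errors="coerce")
print(data.dtypes)
print(data)
```

read_csv also accepts a decimal argument, which may make the replacement step unnecessary for files like this one.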

How to read a csv in scientific notation with a capital E in Python?

3 answers: