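Function to check whether the values in an object-dtype column are float or string

Time: 2018-11-21 18:26:09

Tags: python python-3.x python-2.7 user-defined-types isinstance

I am trying to write a function equivalent to Excel's ISNUMBER function applied to a column.

Dataset: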

feature1 feature2 feature3
  123       1.07     1
  231       2.08     3
  122        ab      4
  111       3.04     6
  555        cde     8

feature1: integer dtype
feature2: object dtype
feature3: integer dtype

I tried this code:

for item in df.feature2.iteritems():
    if isinstance(item, float):
       print('yes')
    else:
       print('no')

The result I got was:

 no
 no
 no
 no
 no

But I want the result to be:

yes
yes
no
yes
no

When I check the type of individual feature2 values, this is what I get:

type(df.feature2[0]) = str
type(df.feature2[1]) = str
type(df.feature2[2]) = str
type(df.feature2[3]) = str
type(df.feature2[4]) = str

Clearly, the values at indices 0, 1 and 3 should be shown as float, but they show up as str.

What am I doing wrong?

5 answers:

Answer 0: (score: 1)

iteritems() returns tuples such as ((123, '1.07'), 1.07), and since you want to iterate over each value, try the code below. You just need to remove .iteritems() and it will work like a charm.

df['feature2']=[1.07,2.08,'ab',3.04,'cde']
for item in df.feature2:
    if isinstance(item,float):
       print('yes')
    else:
       print('no')

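Answer 1: (score: 1)

I think you need to consider two things here:

  1. Methods of Dict vs. DataFrame
  2. The difference between dtype (array scalar types) and type (built-in Python types); reference: https://numpy.org/devdocs/reference/arrays.dtypes.html

Point 1:

.iteritems() / .items() are dictionary methods, whereas if you are dealing with dtypes (and judging by the data provided), you are most likely iterating over a DataFrame, and you can iterate over each value without using .iteritems(). Side note: for dictionaries, .iteritems() was removed in Python 3 and replaced by .items() (see the discussion: When should iteritems() be used instead of items()?).

Point 2:

When working with numpy or pandas, the data types of the values loaded into a DataFrame are called dtypes. These need to be distinguished from their direct Python counterparts, which Python calls types. You can use the table under the "Pandas Data Types" heading to map a dtype to a type (reference: https://pbpython.com/pandas_dtypes.html).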
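To make the dtype/type distinction concrete, here is a minimal sketch. The Series `s` is an illustrative stand-in for the feature2 column (it is not taken from the asker's actual DataFrame), and it assumes the column holds real floats mixed with strings:

```python
import pandas as pd

# A column mixing floats and strings; pandas stores it with the object dtype.
s = pd.Series([1.07, 2.08, 'ab', 3.04, 'cde'])

# dtype describes the column as a whole.
column_dtype = str(s.dtype)
print(column_dtype)

# type describes each individual element stored inside the column.
element_types = [type(v).__name__ for v in s]
print(element_types)
```

The column-level dtype says nothing about which individual rows are numeric, which is why the check has to happen per element.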

Now, for your question, the following code should solve your problem:

import pandas as pd

columns = ['feature1', 'feature2', 'feature3']
data = [[123, 1.07, 1],
        [231, 2.08, 3],
        [122, 'ab', 4],
        [111, 3.04, 6],
        [555, 'cde', 8]]

df = pd.DataFrame(data, columns=columns)

for value in df.feature2:
    if isinstance(value,float):
        print('yes')
    else:
        print('no')
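One caveat: the loop above prints yes only when the column actually holds Python float objects. In the question, every feature2 value is a str (for example '1.07' rather than 1.07), so isinstance(value, float) would still print no for every row. A minimal sketch for that case, assuming the values are strings and that "is a number" means "parses with float()"; the helper name `is_number` and the list `feature2` are hypothetical and only illustrate the idea:

```python
def is_number(value):
    """Return True if value can be parsed as a float, else False."""
    try:
        float(value)
    except (TypeError, ValueError):
        return False
    return True


# Stand-in for the string values of the feature2 column in the question.
feature2 = ['1.07', '2.08', 'ab', '3.04', 'cde']

results = ['yes' if is_number(v) else 'no' for v in feature2]
print(results)
```

On a real DataFrame the same helper could be applied with df.feature2.map(is_number), and pd.to_numeric(df.feature2, errors='coerce').notna() is a vectorized alternative. Note that float() also accepts strings such as 'nan' and 'inf', so this may be more permissive than you want.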

Answer 2: (score: 0)

Try this:


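Answer 3: (score: 0)

This is because iteritems() returns a tuple (index, value). So you are checking whether, for example, (0, 1.07) or (1, 2.08) is of type float, and they are not.

If you change df.feature2.iteritems() to df.feature2.values, it should work :)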
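A small sketch of what the iteration actually yields. It uses .items(), since Series.iteritems() was removed in pandas 2.0, and the Series `s` is an illustrative stand-in that assumes the column holds real floats:

```python
import pandas as pd

s = pd.Series([1.07, 2.08, 'ab', 3.04, 'cde'], name='feature2')

# Each item is an (index, value) pair, not the bare value.
pairs = list(s.items())
first = pairs[0]
print(first)

# The pair itself is a tuple, so the float check fails on it...
pair_is_float = isinstance(first, float)
# ...while the value inside the pair passes.
value_is_float = isinstance(first[1], float)
print(pair_is_float, value_is_float)
```

Unpacking the pair (for index, value in s.items()) or iterating over s.values avoids the problem.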

Answer 4: (score: 0)

You can do the following:

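import pandas as pd

columns = ['feature1', 'feature2', 'feature3']
data = [[123, 1.07, 1],
        [231, 2.08, 3],
        [122, 'ab', 4],
        [111, 3.04, 6],
        [555, 'cde', 8]]

df = pd.DataFrame(data, columns=columns)

# For each column, collect the set of Python types found among its values.
types = []
for col in df:
    found = set(type(v) for v in df[col])
    if len(found) > 1:
        # More than one Python type in the column: report it as 'object'.
        types.append({col: 'object'})
    else:
        # Exactly one type: report its name, e.g. 'int'.
        types.append({col: list(found)[0].__name__})

print(types)

Output: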

[{'feature1': 'int'}, {'feature2': 'object'}, {'feature3': 'int'}]