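I'm trying to use `NA` as a result to indicate that the computed value for a given DataFrame "row" is meaningless (or perhaps can't be computed). How do I get a column of computed values that can contain `NA` and still responds to `dropna`?
Example: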
using DataFrames
df = DataFrame(A = 1:4, B = [1, 0, 2, 3], C = [5, 4, 3, 3])
# A value of 0 in column B should yield a foo of NA
function foo(d)
    if d[:B] == 0
        return NA
    end
    return d[:B] ./ d[:C] # vectorized to work with `by`
end
# What I'm looking for is something equivalent to this list
# comprehension, but that returns a DataFrame or DataArray
# since normal Arrays don't respond to `dropna`
comprehension = [foo(frame) for frame in eachrow(df)]
Answer 0 (score: 2)
One option is to extend `Base.convert` and `DataArrays.dropna` so that `dropna` can handle a normal `Vector`:
using DataFrames
function Base.convert{T}(::Type{DataArray}, v::Vector{T})
    da = DataArray(T[], Bool[])
    for val in v
        push!(da, val)
    end
    return da
end
function DataArrays.dropna(v::Vector)
    return dropna(convert(DataArray, v))
end
Now the example should work as expected:
df = DataFrame(A = 1:4, B = [1, 0, 2, 3], C = [5, 4, 3, 3])
# A value of 0 in column B should yield a foo of NA
function foo(d)
    if d[:B] == 0
        return NA
    end
    return d[:B] / d[:C]
end
comprehension = [foo(frame) for frame in eachrow(df)]
dropna(comprehension) #=> Array{Any,1}: [0.2, 0.667, 1.]
Even without the extended `dropna`, the extended `convert` also allows the comprehension to be inserted into the DataFrame as a new DataArray column, preserving `NA` and its proper dropping behavior.
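As a minimal sketch of that last point: the code below assumes the pre-1.0 Julia syntax and the DataArrays-era DataFrames API used throughout this thread, and repeats the `convert` method from above so that it is self-contained. The column name `:foo` is a hypothetical choice made only for illustration.

```julia
using DataFrames

# Same conversion as defined above: values are pushed one at a time so that
# NA entries are flagged as missing rather than stored as ordinary values.
function Base.convert{T}(::Type{DataArray}, v::Vector{T})
    da = DataArray(T[], Bool[])
    for val in v
        push!(da, val)
    end
    return da
end

df = DataFrame(A = 1:4, B = [1, 0, 2, 3], C = [5, 4, 3, 3])

# A value of 0 in column B yields NA
foo(d) = d[:B] == 0 ? NA : d[:B] / d[:C]

comprehension = [foo(frame) for frame in eachrow(df)]

# :foo is a hypothetical column name; the new column keeps its NA entry
df[:foo] = convert(DataArray, comprehension)

# dropna on the column skips the row where B == 0
dropna(df[:foo])
```

Because the column is a DataArray rather than a plain `Vector`, this works with the stock `dropna` and does not need the `Vector` method added above.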
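Answer 1 (score: 1)
This is a bit tricky because DataFrame rows are awkward objects to work with. For example, I would think this is perfectly reasonable: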
using DataFrames
df = DataFrame(A = 1:4, B = [1, 0, 2, 3], C = [5, 4, 3, 3])
# A value of 0 in column B should yield a foo of NA
function foo(d)
    if d[:B] == 0
        return NA
    end
    return d[:B] / d[:C] # vectorized to work with `by`
end
comp = DataArray(Float64,4)
map!(r->foo(r), eachrow(df))
But this results in
`map!` has no method matching map!(::Function, ::DFRowIterator{DataFrame})
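One way around the missing `map!` method, which is not part of the original answer, is to fill a preallocated DataArray with an explicit loop over the rows. This is a sketch only, under the same assumptions as the rest of the thread (pre-1.0 Julia and the DataArrays-era DataFrames API):

```julia
using DataFrames

df = DataFrame(A = 1:4, B = [1, 0, 2, 3], C = [5, 4, 3, 3])

# A value of 0 in column B yields NA
foo(d) = d[:B] == 0 ? NA : d[:B] / d[:C]

# One slot per row of df
comp = DataArray(Float64, nrow(df))

# Assign each row's result; assigning NA marks that slot as missing
for (i, r) in enumerate(eachrow(df))
    comp[i] = foo(r)
end

# comp is a DataArray, so the stock dropna applies
dropna(comp)
```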
However, if you just want to do a `by` that doesn't always return a row, you can do this:
using DataFrames
df = DataFrame(A = 1:4, B = [1, 0, 2, 3], C = [5, 4, 3, 3])
# A value of 0 in column B returns an empty array
function foo(d)
    if d[1,:B] == 0
        return []
    end
    return d[1,:B] / d[1,:C] # Plan on only getting a single row in the by
end
by(df, [:A,:B,:C]) do d
    foo(d)
end
which results in
3x4 DataFrame
| Row | A | B | C | x1       |
|-----|---|---|---|----------|
| 1   | 1 | 1 | 5 | 0.2      |
| 2   | 3 | 2 | 3 | 0.666667 |
| 3   | 4 | 3 | 3 | 1.0      |