How can I replace missing values with 0.0 in a column of a DataFrame?
Answer 0 (score: 3)
using DataFrames
a = @data [1.0,2.0, NA, 4.0] #Make a DataArray with an NA value
df = DataFrame(a=a) #Make a DataFrame from it
df[isna(df[:a]),:a] = 0.0 #Replace NAs in column a with 0.0
Result:
4x1 DataFrames.DataFrame
| Row | a |
|-----|-----|
| 1 | 1.0 |
| 2 | 2.0 |
| 3 | 0.0 |
| 4 | 4.0 |
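The answer above targets the older NA/DataArrays API. On recent DataFrames.jl releases, where missing replaced NA, a rough equivalent might look like the sketch below. It assumes DataFrames.jl 1.x; the broadcast assignment .= is used because the selected rows are overwritten in place with a scalar.

```julia
using DataFrames

# Column with one missing entry, mirroring the example above
df = DataFrame(a = [1.0, 2.0, missing, 4.0])

# Select the rows where a is missing and overwrite them with 0.0
df[ismissing.(df.a), :a] .= 0.0
```

The element type of the column still allows missing after this step; only the values change.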
Answer 1 (score: 2)
Create a df with some NAs:
using DataFrames
df = DataFrame(A = 1.0:10.0, B = 2.0:2.0:20.0)
df[ df[:B] %2 .== 0, :A ] = NA
You will see some NAs in the df. Now we convert them to 0.0:
df[ isna(df[:A]), :A] = 0
EDIT: changed NaN to NA. Thanks @Reza.
Answer 2 (score: 1)
The other answers are pretty good. If you are a real speed junkie, the following might suit you:
# prepare example
using DataFrames
df = DataFrame(A = 1.0:10.0, B = 2.0:2.0:20.0)
df[ df[:A] %2 .== 0, :B ] = NA
df[:B].data[df[:B].na] = 0.0 # put the 0.0 into NAs
df[:B] = df[:B].data # with no NAs might as well use array
Answer 3 (score: 0)
This is a shorter and more up-to-date answer, since Julia recently introduced the missing value.
using DataFrames
df = DataFrame(A=rand(1:50, 5), B=rand(1:50, 5), C=vcat(rand(1:50,3), missing, rand(1:50))) ## Creating random 5 integers within the range of 1:50, while introducing a missing variable in one of the rows
df = DataFrame(replace!(convert(Matrix, df), missing=>0)) ## Converting to matrix first, since replacing values directly within type dataframe is not allowed
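Going through a Matrix does not carry the column names over and forces a single element type across columns. As a hedged alternative, recent DataFrames.jl versions (1.x is assumed here) support broadcasting over the whole data frame, so coalesce can be applied to every cell directly. The column names and values below are made up for illustration.

```julia
using DataFrames

# Small table with missing entries in two columns (illustrative data)
df = DataFrame(A = [1, 2, 3], B = [4, missing, 6], C = [missing, 8, 9])

# Broadcast coalesce over every cell; returns a new DataFrame
# and leaves df itself untouched
filled = coalesce.(df, 0)
```

The column names A, B and C are preserved in filled, which the Matrix round trip would not do.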
Answer 4 (score: 0)
As of Julia 1.1, there are several ways to approach this problem. The basic approach is as follows:
julia> using DataFrames
julia> df = DataFrame(a = [1, missing, missing, 4], b = 5:8)
4×2 DataFrame
│ Row │ a │ b │
│ │ Int64⍰ │ Int64 │
├─────┼─────────┼───────┤
│ 1 │ 1 │ 5 │
│ 2 │ missing │ 6 │
│ 3 │ missing │ 7 │
│ 4 │ 4 │ 8 │
julia> df.a[ismissing.(df.a)] .= 0
2-element view(::Array{Union{Missing, Int64},1}, [2, 3]) with eltype Union{Missing, Int64}:
0
0
julia> df
4×2 DataFrame
│ Row │ a │ b │
│ │ Int64⍰ │ Int64 │
├─────┼────────┼───────┤
│ 1 │ 1 │ 5 │
│ 2 │ 0 │ 6 │
│ 3 │ 0 │ 7 │
│ 4 │ 4 │ 8 │
Note, however, that at this point the type of column a still allows missing values:
julia> typeof(df.a)
Array{Union{Missing, Int64},1}
This is also indicated by the question mark after Int64 in column a when the data frame is printed. You can use disallowmissing! to change this:
julia> disallowmissing!(df, :a)
4×2 DataFrame
│ Row │ a │ b │
│ │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1 │ 1 │ 5 │
│ 2 │ 0 │ 6 │
│ 3 │ 0 │ 7 │
│ 4 │ 4 │ 8 │
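The same two steps (fill the values, then narrow the type) can be sketched on a plain vector using only Base, which may help to see what changes at each step. This is an illustration rather than the DataFrames API: replace! and convert stand in for the indexing assignment and disallowmissing! shown above.

```julia
# A plain vector whose element type allows missing, like column a above
v = Union{Missing, Int}[1, missing, missing, 4]

# Step 1: replace every missing in place; the element type is unchanged
replace!(v, missing => 0)
@show eltype(v)

# Step 2: convert to Vector{Int}, which drops Missing from the element type.
# This would throw an error if any missing were still present.
w = convert(Vector{Int}, v)
@show eltype(w)
```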
Another approach is to use coalesce:
julia> df = DataFrame(a = [1, missing, missing, 4], b = 5:8);
julia> df.a = coalesce.(df.a, 0)
4-element Array{Int64,1}:
1
0
0
4
julia> df
4×2 DataFrame
│ Row │ a │ b │
│ │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1 │ 1 │ 5 │
│ 2 │ 0 │ 6 │
│ 3 │ 0 │ 7 │
│ 4 │ 4 │ 8 │
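For readers new to coalesce: it returns the first of its arguments that is not missing, and broadcasting it over a vector builds a new vector whose element type no longer includes Missing. A minimal Base-only sketch with Float64 data and the 0.0 fill value from the question:

```julia
# coalesce returns its first argument that is not missing
x = coalesce(missing, 0.0)   # 0.0
y = coalesce(3.5, 0.0)       # 3.5

# Broadcasting over a vector that contains missing
col = [1.0, missing, 4.0]
filled = coalesce.(col, 0.0)

@show eltype(col)     # Union{Missing, Float64}
@show eltype(filled)  # Float64
```

Matching the fill value to the column's element type (0.0 for a Float64 column, 0 for an Int column) keeps the result concretely typed.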
A third option is to use Missings.replace from the Missings package:
julia> using Missings
julia> df = DataFrame(a = [1, missing, missing, 4], b = 5:8);
julia> df.a .= Missings.replace(df.a, 0)
4-element Array{Union{Missing, Int64},1}:
1
0
0
4
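Missings.replace returns a lazy iterator rather than a new array. In the example above it is broadcast back into the existing column with .=, so the column's element type still allows missing. A small sketch, assuming the Missings package is installed, of materializing the iterator with collect instead:

```julia
using Missings

a = [1, missing, missing, 4]

# Missings.replace is lazy; collect builds a new vector from it
filled = collect(Missings.replace(a, 0))

@show eltype(filled)
```

Assigning the collected vector with df.a = filled (plain =, not .=) would replace the column and with it the element type.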
For an alternative syntax, you can try the DataFramesMeta package:
julia> using DataFramesMeta
julia> df = DataFrame(a = [1, missing, missing, 4], b = 5:8);
julia> @transform(df, a .= coalesce.(:a, 0))
4×2 DataFrame
│ Row │ a │ b │
│ │ Int64 │ Int64 │
├─────┼───────┼───────┤
│ 1 │ 1 │ 5 │
│ 2 │ 0 │ 6 │
│ 3 │ 0 │ 7 │
│ 4 │ 4 │ 8 │
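If adding another dependency is not desirable, something similar can be written with the transform function that ships with DataFrames.jl itself. The sketch below assumes DataFrames.jl 1.x and uses ByRow to apply coalesce to one value at a time. Like @transform, it returns a new data frame and leaves df unchanged.

```julia
using DataFrames

df = DataFrame(a = [1, missing, missing, 4], b = 5:8)

# source column => row-wise function => destination column
out = transform(df, :a => ByRow(x -> coalesce(x, 0)) => :a)
```

Using transform! instead would modify df in place.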