Question

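I would like to know how to remove the nothing elements from a Julia array (1D) like the one shown below. The array was built by reading a text file in which lines without relevant information are mixed with lines that contain it. nothing is of type Void, and I want to clean all occurrences of it out of the array.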

nothing
nothing
nothing
nothing
nothing
"   -16.3651\t     0.1678\t    -4.6997\t   -14.0152\t    -2.6855\t   -16.0294\t    -7.8049\t   -27.1912\t    -5.0354\t   -14.5187\t\r\n"
"   -16.4490\t    -1.0910\t    -3.6087\t   -12.6724\t    -1.5945\t   -14.7705\t    -7.2174\t   -25.2609\t    -3.7766\t   -14.3509\t\r\n"
"   -16.4490\t    -2.2659\t    -2.4338\t   -10.9100\t    -0.5875\t   -13.6795\t    -6.7139\t   -22.9950\t    -2.9373\t   -14.0991\t\r\n"

Answer 1

testvector[testvector.!=nothing] is also a very readable option.

Benchmarking helps to choose the most efficient code.
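
To make the two remarks above concrete, here is a minimal sketch, assuming Julia 1.x (where the broadcast identity test .!== is available). The sample data, the helper name keep, and the vector size are made up for illustration, and @elapsed gives only a rough single-run timing, not a careful benchmark.

```julia
# Hypothetical sample data: nothing mixed with text lines, as in the question.
testvector = Any[nothing, "a\r\n", nothing, "b\r\n"]

# Build a Boolean mask with the broadcast identity comparison, then index with it.
mask = testvector .!== nothing
kept = testvector[mask]

# Rough timing on a larger vector. Each expression is run once beforehand
# so that compilation time is not included in the measurement.
keep(x) = x !== nothing                      # hypothetical helper name
big = Any[isodd(i) ? nothing : i for i in 1:100_000]
big[big .!== nothing]
filter(keep, big)
t_index  = @elapsed big[big .!== nothing]
t_filter = @elapsed filter(keep, big)
println("indexing: ", t_index, " s, filter: ", t_filter, " s")
```

The timings depend on the machine and the Julia version, so they are best read as a relative comparison only.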

Answer 2

A simple way is to use the filter! function, which updates your vector in place:

testvector=[fill(nothing,10) ; [1,2,3]];
# =>13-element Array{Any,1}:
#    nothing
#    nothing
#    nothing
#    nothing
#    nothing
#    nothing
#    nothing
#    nothing
#    nothing
#    nothing
#    1
#    2
#    3

filter!(x->x!=nothing, testvector)
# => 3-element Array{Any,1}:
#     1
#     2
#     3
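
As a side note, filter! keeps the element type of the vector it mutates, as the Array{Any,1} output above shows. If a more specific element type is wanted, one option is a comprehension, which builds a new vector. The following is a sketch assuming Julia 1.x; the variable name narrowed is hypothetical.

```julia
testvector = Any[nothing, nothing, 1, 2, 3]

# filter! mutates its argument; the element type stays Any.
filter!(x -> x !== nothing, testvector)

# A filtered comprehension allocates a new vector whose element type
# is derived from the values that were actually kept.
narrowed = [x for x in testvector if x !== nothing]

println(typeof(testvector))
println(typeof(narrowed))
```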

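Thanks to @Daniel Arndt

Edit: see this paragraph from the Julia docs:

nothing is a special value that does not print anything at the interactive prompt. Other than not printing, it is a completely normal value and you can test for it programmatically.

I think all of the following conditions give the same result: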

x!=nothing
x!==nothing
!is(x,nothing)
!isa(x,Void)
typeof(x)!=Void
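
The five tests above use names from older Julia versions. On Julia 1.0 and later, Void is called Nothing and is(a, b) is written a === b; isnothing was added in Julia 1.1. A sketch of the equivalent tests under those assumptions follows (the function name checks is hypothetical):

```julia
# Negated tests for nothing, written for Julia 1.1 or later.
checks(x) = (
    x != nothing,          # value comparison
    x !== nothing,         # identity comparison, the modern spelling of !is(x, nothing)
    !isnothing(x),         # convenience function added in Julia 1.1
    !isa(x, Nothing),      # Void was renamed to Nothing
    typeof(x) != Nothing,
)

println(checks(nothing))   # every test is false
println(checks(42))        # every test is true
```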

Answer 3

How are you reading that file?

You can filter nothing out of an array:

filter(x -> !is(nothing, x), [nothing, 42])    # => Any[42]

But you might want to clean up the data first, so that you have a tsv (tab-separated values) file like this:

-16.3651    0.1678  -4.6997 -14.0152    -2.6855 -16.0294    -7.8049 -27.1912    -5.0354 -14.5187
-16.4490    -1.0910 -3.6087 -12.6724    -1.5945 -14.7705    -7.2174 -25.2609    -3.7766 -14.3509
-16.4490    -2.2659 -2.4338 -10.9100    -0.5875 -13.6795    -6.7139 -22.9950    -2.9373 -14.0991

Using readdlm:

julia> readdlm("data.tsv")
3x10 Array{Float64,2}:
 -16.3651   0.1678  -4.6997  -14.0152  …  -27.1912  -5.0354  -14.5187
 -16.449   -1.091   -3.6087  -12.6724     -25.2609  -3.7766  -14.3509
 -16.449   -2.2659  -2.4338  -10.91       -22.995   -2.9373  -14.0991
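
For readers on Julia 1.x, readdlm lives in the DelimitedFiles standard library rather than in Base. The sketch below is self-contained under that assumption: it writes a small temporary file (the contents are a made-up excerpt of the rows above) and reads it back.

```julia
using DelimitedFiles   # standard library on Julia 1.x

# Hypothetical file contents modeled on the first columns of the rows above.
rows = "-16.3651\t0.1678\t-4.6997\n-16.4490\t-1.0910\t-3.6087\n"

# Write the sample to a temporary file so the example can run on its own.
path, io = mktemp()
write(io, rows)
close(io)

# With purely numeric input, readdlm returns a Float64 matrix.
data = readdlm(path, '\t')
rm(path)

println(size(data))
```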

Using DataFrames.readtable:

julia> df = readtable("data.tsv");

julia> names!(df, [symbol(x) for x in 'A':'J'])
2x10 DataFrames.DataFrame
| Row | A       | B       | C       | D        | E       | F        | G       |
|-----|---------|---------|---------|----------|---------|----------|---------|
| 1   | -16.449 | -1.091  | -3.6087 | -12.6724 | -1.5945 | -14.7705 | -7.2174 |
| 2   | -16.449 | -2.2659 | -2.4338 | -10.91   | -0.5875 | -13.6795 | -6.7139 |

| Row | H        | I       | J        |
|-----|----------|---------|----------|
| 1   | -25.2609 | -3.7766 | -14.3509 |
| 2   | -22.995  | -2.9373 | -14.0991 |

Answer 4

Dear all,

In the end, the code became this:

tmpFile=open(fileName)
tmp=readdlm(tmpFile);
# typeof is mapped over the first column; the comparison is false where the entry is a string and true otherwise, which gives a mask of valid rows.
ind=pmap(typeof,tmp[:,1]).!=SubString{ASCIIString};
tmpClean=tmp[ind,:]; # only valid rows are kept

If you have any suggestions for improvement, I would appreciate them. Thank you for your help.
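
One possible variation on the code above, sketched for Julia 1.x under stated assumptions: ASCIIString no longer exists there, so the mask tests for numbers instead of testing for a specific string type, and a plain map replaces pmap. The file contents are hypothetical.

```julia
using DelimitedFiles   # standard library on Julia 1.x

# Hypothetical file: two text rows followed by two numeric rows.
content = "channel\tinfo\nunits\tmV\n-16.3651\t0.1678\n-16.4490\t-1.0910\n"
path, io = mktemp()
write(io, content)
close(io)

tmp = readdlm(path, '\t')                  # mixed text and numbers give a Matrix{Any}
rm(path)

ind = map(x -> x isa Number, tmp[:, 1])    # true where the first column holds a number
tmpClean = Float64.(tmp[ind, :])           # keep the valid rows and convert them to Float64

println(tmpClean)
```

Testing for Number rather than for one concrete string type keeps the mask independent of which string type readdlm happens to return.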

How to eliminate nothing elements in an array (1D) in Julia?
