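I have a very sparse pandas DataFrame with roughly 1,000 rows and roughly 10,000 columns. Most rows contain only 20-100 nonzero values. I now want to pick 10 random nonzero values in each row and set them to 0.
Here is my first attempt (very un-pandas-like):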
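import numpy as np

for i in range(df.shape[0]):
    row = df.iloc[i]
    # positional indices of the nonzero entries in this row
    nonZeros = np.where(row != 0)[0]
    # replace=False so the same column is never picked twice
    rand = np.random.choice(nonZeros, min(10, len(nonZeros)), replace=False)
    for j in rand:
        df.iloc[i, j] = 0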
Answer 0 (score: 0)
Something like this?
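def setrandom(x):
    x = x.copy()
    # never try to zero more values than the row actually has
    counter = min(10, int((x != 0).sum()))
    while counter > 0:
        # draw a position from the whole row, including position 0
        randindex = np.random.randint(0, len(x))
        if x.iloc[randindex] != 0:
            x.iloc[randindex] = 0
            counter -= 1
    return x

df = df.apply(setrandom, axis=1)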
This is not really an optimal way of doing it, especially since your DataFrame is sparse: with only 20-100 nonzero values among 10,000 columns, most random draws land on a zero and are wasted.
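One way to avoid the wasted draws is to sample only from the positions that are actually nonzero. The following is a minimal sketch of that idea, not the answerer's code; the function name `zero_random_nonzeros` and the parameters `k` and `seed` are made up for illustration, and it assumes the DataFrame holds a single numeric dtype.

```python
import numpy as np
import pandas as pd


def zero_random_nonzeros(df, k=10, seed=None):
    """Return a copy of df with up to k random nonzero entries per row set to 0.

    Hypothetical helper for illustration; assumes a single numeric dtype.
    """
    rng = np.random.default_rng(seed)
    values = df.to_numpy(copy=True)
    for row in values:  # each row is a view into `values`
        nonzero_cols = np.flatnonzero(row)
        if nonzero_cols.size == 0:
            continue  # nothing to zero out in this row
        n_pick = min(k, nonzero_cols.size)
        # sample without replacement, only among the nonzero positions
        chosen = rng.choice(nonzero_cols, size=n_pick, replace=False)
        row[chosen] = 0
    return pd.DataFrame(values, index=df.index, columns=df.columns)
```

Called as `df = zero_random_nonzeros(df)`, every draw hits a nonzero cell, so the work per row scales with the number of nonzero values rather than with the 10,000 columns. Rows with fewer than `k` nonzero values end up entirely zero.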
Answer 1 (score: 0)
Revised answer
You can use the following code:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:m="http://jasperreports.sourceforge.net/jasperreports/print"
    exclude-result-prefixes="xs m"
    version="2.0">
    <xsl:output indent="yes"/>
    <xsl:template match="m:page">
        <secondChildLis>
            <xsl:for-each select="//m:secondChild">
                <secondChild>
                    <xsl:apply-templates/>
                </secondChild>
            </xsl:for-each>
        </secondChildLis>
    </xsl:template>
</xsl:stylesheet>
You may also try it.
It may not be the fastest approach, but it is much more pandas-friendly.
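For comparison, here is a rough sketch of one loop-free way to express the task with a boolean mask and `DataFrame.mask`. It is an illustration under stated assumptions, not the code this answer refers to; the function name `zero_random_nonzeros_vectorized` and the parameters `k` and `seed` are hypothetical.

```python
import numpy as np
import pandas as pd


def zero_random_nonzeros_vectorized(df, k=10, seed=None):
    """Return a copy of df with up to k random nonzero entries per row set to 0.

    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    nonzero = df.to_numpy() != 0
    # Random score per cell; zero cells get 2.0, which is above any value
    # drawn from [0, 1), so they always rank after the nonzero cells.
    scores = np.where(nonzero, rng.random(df.shape), 2.0)
    # Rank of every cell within its row (0 = smallest score).
    ranks = scores.argsort(axis=1).argsort(axis=1)
    # Cells to zero: nonzero and among the k lowest-scoring cells of the row.
    drop = nonzero & (ranks < k)
    return df.mask(drop, 0)
```

This trades memory for simplicity: it sorts a full score matrix of the same shape as the DataFrame, which is acceptable at about 1,000 x 10,000 but ignores the sparsity. Rows with fewer than `k` nonzero values become entirely zero.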