I have found similar questions, but they did not solve my problem. Part of my DataFrame looks like this:
Index Character Top 10 by edits Top 10 by added text
780 NaN Viradha David G Brault · 8 (40%) David G Brault · 1,915 (81.4%)
781 NaN Viradha Wiki-uk · 4 (20%) Risingstar12 · 213 (9.1%)
782 NaN Viradha Rich Farmbrough · 1 (5%) Woohookitty · 44 (1.9%)
783 NaN Viradha Woohookitty · 1 (5%) World8115 · 41 (1.7%)
784 NaN Viradha World8115 · 1 (5%) Rich Farmbrough · 33 (1.4%)
785 NaN Viradha 141.213.55.83 · 1 (5%) SmackBot · 31 (1.3%)
786 NaN Viradha Omnipaedista · 1 (5%) Citation bot 1 · 27 (1.1%)
787 NaN Viradha Jayarathina · 1 (5%) Omnipaedista · 20 (0.9%)
788 NaN Viradha Risingstar12 · 1 (5%) Wiki-uk · 17 (0.7%)
789 NaN Viradha 203.142.46.153 · 1 (5%) 203.142.46.153 · 11 (0.5%)
Now I want to split the two columns "Top 10 by edits" and "Top 10 by added text" on the dot surrounded by spaces ("space-dot-space"). To split the first column, I tried:
s = df["Top 10 by edits"].str.split(" . ", n = 1, expand = True)
df["Top 10 by edits"] = s[0]
df["Edits contribution"] = s[1]
However, this results in the following DataFrame:
Index Character Top 10 by edits Top 10 by added text Edits contribution
780 NaN Viradha David David G Brault · 1,915 (81.4%) Brault · 8 (40%)
781 NaN Viradha Wiki-uk Risingstar12 · 213 (9.1%) 4 (20%)
782 NaN Viradha Rich Farmbrough Woohookitty · 44 (1.9%) 1 (5%)
783 NaN Viradha Woohookitty World8115 · 41 (1.7%) 1 (5%)
784 NaN Viradha World8115 Rich Farmbrough · 33 (1.4%) 1 (5%)
785 NaN Viradha 141.213.55.83 SmackBot · 31 (1.3%) 1 (5%)
786 NaN Viradha Omnipaedista Citation bot 1 · 27 (1.1%) 1 (5%)
787 NaN Viradha Jayarathina Omnipaedista · 20 (0.9%) 1 (5%)
788 NaN Viradha Risingstar12 Wiki-uk · 17 (0.7%) 1 (5%)
789 NaN Viradha 203.142.46.153 203.142.46.153 · 11 (0.5%) 1 (5%)
As you can see, the first row was not split at the dot. I also tried "\." and r" . ", but neither worked. What exactly is the problem? Thanks in advance.
Answer 0 (score: 2)
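The dot in the "Top 10 by added text" column is not a period but a different dot character ("·"), whereas your code splits on a period. Try changing one or the other so that they match.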
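
A minimal sketch of that suggestion, assuming the separator in the data is the "·" character exactly as it appears in the sample rows (copy it from your own data rather than retyping it). It also shows why only the first row went wrong: a pattern longer than one character is treated by pandas as a regular expression, so the "." in " . " matches any character and " G " in "David G Brault" is the first match. The three-row frame and the names broken and fixed are illustrative only.

```python
import pandas as pd

# Three sample rows copied from the question's "Top 10 by edits" column.
df = pd.DataFrame({
    "Top 10 by edits": [
        "David G Brault · 8 (40%)",
        "Wiki-uk · 4 (20%)",
        "Rich Farmbrough · 1 (5%)",
    ]
})

# The original attempt: " . " is interpreted as a regular expression, and
# "." matches any character, so the first row is cut at " G ".
broken = df["Top 10 by edits"].str.split(" . ", n=1, expand=True)

# Splitting on the actual separator from the data (space, "·", space).
fixed = df["Top 10 by edits"].str.split(" · ", n=1, expand=True)

df["Top 10 by edits"] = fixed[0]
df["Edits contribution"] = fixed[1]
print(df)
```

The same call should work for "Top 10 by added text", since it appears to use the same separator.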