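Is there a faster way to solve this problem without using a for loop?
Given the input DataFrame shown below, I want the rows that share the same values in columns 0 and 1 merged so that the NaN entries are filled in.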
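   0  1    2    3    4    5    6
0  x  x    1  NaN  NaN  NaN  NaN
1  x  y    1  NaN  NaN  NaN  NaN
2  y  y    4    4    4    4    4
3  y  z    5    2    7    4    0
4  x  x  NaN    5    7    4    9
5  x  y  NaN    9    4    5   10
Columns 0 and 1 hold some information. If we treat these two columns together as one piece of information, they never contain NaN.
The DataFrame can be very large, and I do not know where the data is missing.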
Answer 0 (score: 3)
If you need the first non-NaN value in each group, use GroupBy.first:
df1 = df.groupby([0,1], as_index=False).first()
print (df1)
   0  1    2    3    4    5     6
0  x  x  1.0  5.0  7.0  4.0   9.0
1  x  y  1.0  9.0  4.0  5.0  10.0
2  y  y  4.0  4.0  4.0  4.0   4.0
3  y  z  5.0  2.0  7.0  4.0   0.0
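For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the question's input by hand and applies GroupBy.first. The DataFrame construction is my own reconstruction typed in from the printed table, not code from the question.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Assumption: the question's input, typed in from the printed table
df = pd.DataFrame([
    ["x", "x", 1, nan, nan, nan, nan],
    ["x", "y", 1, nan, nan, nan, nan],
    ["y", "y", 4, 4, 4, 4, 4],
    ["y", "z", 5, 2, 7, 4, 0],
    ["x", "x", nan, 5, 7, 4, 9],
    ["x", "y", nan, 9, 4, 5, 10],
])

# first() skips NaN, so each column contributes its first non-NaN value per group
df1 = df.groupby([0, 1], as_index=False).first()
print(df1)
```

Because of the NaN entries the numeric columns are stored as floats, which is why the result prints 1.0 rather than 1.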
If a group has more than one row with non-NaN values in a column, first() keeps only the first of them, so some data may be lost:
print (df)
   0  1     2     3     4    5     6
0  x  x  10.0   NaN   NaN  NaN   NaN
1  x  x  20.0   NaN   NaN  NaN   NaN
2  x  x   1.0   NaN   NaN  NaN   NaN
3  x  y   1.0   NaN   NaN  NaN   NaN
4  y  y   4.0   4.0   4.0  4.0   4.0
5  y  z   5.0   2.0   7.0  4.0   0.0
6  x  x   NaN   5.0   7.0  4.0   9.0
7  x  x   NaN  50.0  70.0  4.0   9.0
8  x  y   NaN   9.0   4.0  5.0  10.0
df1 = df.groupby([0,1], as_index=False).first()
print (df1)
   0  1     2    3    4    5     6
0  x  x  10.0  5.0  7.0  4.0   9.0
1  x  y   1.0  9.0  4.0  5.0  10.0
2  y  y   4.0  4.0  4.0  4.0   4.0
3  y  z   5.0  2.0  7.0  4.0   0.0
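A quick way to tell whether first() would discard anything is to count the non-NaN values per group and column: any count above 1 means a value beyond the first would be dropped. Here is a sketch on the nine-row sample; the names counts and lossy are mine, and the DataFrame is again typed in from the printed table.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Assumption: the nine-row sample, typed in from the printed table
df = pd.DataFrame([
    ["x", "x", 10, nan, nan, nan, nan],
    ["x", "x", 20, nan, nan, nan, nan],
    ["x", "x", 1, nan, nan, nan, nan],
    ["x", "y", 1, nan, nan, nan, nan],
    ["y", "y", 4, 4, 4, 4, 4],
    ["y", "z", 5, 2, 7, 4, 0],
    ["x", "x", nan, 5, 7, 4, 9],
    ["x", "x", nan, 50, 70, 4, 9],
    ["x", "y", nan, 9, 4, 5, 10],
])

# count() gives the number of non-NaN values per group and column
counts = df.groupby([0, 1]).count()
# Groups where some column holds more than one value; first() would keep only one of them
lossy = counts[counts.gt(1).any(axis=1)]
print(lossy)
```

If lossy comes back empty, GroupBy.first is enough; otherwise the listed groups need an approach that keeps every value.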
A possible solution with a custom function:
import pandas as pd
def f(x):
    # drop the NaNs in each column separately and stack the remaining values from the top
    df1 = pd.DataFrame({y: pd.Series(x[y].dropna().values) for y in x})
    return (df1)
df = df.set_index([0,1]).groupby([0,1]).apply(f).reset_index(level=2, drop=True).reset_index()
print (df)
   0  1     2     3     4    5     6
0  x  x  10.0   5.0   7.0  4.0   9.0
1  x  x  20.0  50.0  70.0  4.0   9.0
2  x  x   1.0   NaN   NaN  NaN   NaN
3  x  y   1.0   9.0   4.0  5.0  10.0
4  y  y   4.0   4.0   4.0  4.0   4.0
5  y  z   5.0   2.0   7.0  4.0   0.0
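Since the question asks about avoiding loops, here is an alternative sketch that gets the same layout without calling a Python function per group. It is not part of the original answer: it swaps the custom function for melt, GroupBy.cumcount and unstack. I have not benchmarked it, so whether it is actually faster on a large DataFrame is an assumption to check on your own data. The names long, pos and wide are mine.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Assumption: the nine-row sample, typed in from the printed table
df = pd.DataFrame([
    ["x", "x", 10, nan, nan, nan, nan],
    ["x", "x", 20, nan, nan, nan, nan],
    ["x", "x", 1, nan, nan, nan, nan],
    ["x", "y", 1, nan, nan, nan, nan],
    ["y", "y", 4, 4, 4, 4, 4],
    ["y", "z", 5, 2, 7, 4, 0],
    ["x", "x", nan, 5, 7, 4, 9],
    ["x", "x", nan, 50, 70, 4, 9],
    ["x", "y", nan, 9, 4, 5, 10],
])

# Long format: one row per (group, column, value), with the NaN cells dropped
long = df.melt(id_vars=[0, 1]).dropna(subset=["value"])
# Position of each value within its group and column (0 = first, 1 = second, ...)
long = long.assign(pos=long.groupby([0, 1, "variable"]).cumcount())
# Back to wide format: one row per (group, position)
wide = (long.set_index([0, 1, "pos", "variable"])["value"]
            .unstack("variable")
            .reset_index(level="pos", drop=True)
            .reset_index())
wide.columns.name = None
print(wide)
```

The idea is that cumcount numbers the surviving values in each column of each group, so values with the same number end up on the same output row and nothing is dropped.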