Pad a pandas DataFrame column with leading zeros when the data is a NumPy array

Time: 2018-01-07 08:32:13

Tags: python pandas

I have a pandas DataFrame df. One of its columns, column A, contains NumPy arrays, and its dtype is object. Column A looks like this:

           A
 [11234466, 77777777, 12345678, 23452345]
 [99999999, 66666666, 44332211, 56781234]

All values are integers, and most are 8-digit numbers. I want to turn them into 10-digit strings with leading zeros, like this:

      A
['0011234466', '0077777777', '0012345678', '0023452345']
['0099999999', '0066666666', '0044332211', '0056781234']

I tried the code below:

 df['A'] = df['A'].astype(str)
 df['A'] = df['A'].apply(lambda x: x.zfill(10))

However, this does not pad the values with zeros; it leaves the column unchanged. Can you suggest how to pad the values in column A with leading zeros?

Using Jazrel's suggestion, I got this output:

       A
0  [000000000[, 0000000001, 0000000001, 000000000...
0  [000000000[, 0000000009, 0000000009, 000000000...

1 answer:

Answer 0 (score: 3)

I think you can use a list comprehension:

df['A'] = df['A'].apply(lambda x: [str(y).zfill(10) for y in x])
print (df)
                                                  A
0  [0011234466, 0077777777, 0012345678, 0023452345]
0  [0099999999, 0066666666, 0044332211, 0056781234]

# Alternative: a nested list comprehension over the column, without apply
df['A'] = [[str(y).zfill(10) for y in x] for x in df['A']]
print (df)
                                                  A
0  [0011234466, 0077777777, 0012345678, 0023452345]
0  [0099999999, 0066666666, 0044332211, 0056781234]
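
For anyone who wants to run the snippets above end to end, here is a minimal sketch. The DataFrame construction is an assumption based on the question's description (an object column whose cells are NumPy integer arrays); the real data may be built differently.

```python
import numpy as np
import pandas as pd

# Assumed setup: each cell of column A holds a 1-D NumPy array of integers.
df = pd.DataFrame({
    "A": [
        np.array([11234466, 77777777, 12345678, 23452345]),
        np.array([99999999, 66666666, 44332211, 56781234]),
    ]
})

# Convert every element of every array to a string, then left-pad it
# with zeros to a width of 10 characters.
df["A"] = [[str(y).zfill(10) for y in x] for x in df["A"]]

print(df["A"].tolist())
```

After this runs, each cell holds a plain Python list of strings rather than a NumPy array.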

A similar solution with format:

df['A'] = [['{:010d}'.format(y) for y in x] for x in df['A']]
print (df)
                                                  A
0  [0011234466, 0077777777, 0012345678, 0023452345]
0  [0099999999, 0066666666, 0044332211, 0056781234]
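
As for why the attempt in the question leaves the column unchanged, a likely explanation is that converting a whole array to a string yields one long string, brackets included, and `zfill(10)` only pads strings shorter than 10 characters. The sketch below illustrates this with plain NumPy and Python string methods. Treating `str(row)` as equivalent to what `astype(str)` does to each cell is an assumption here, although the `000000000[` output reported in the question is consistent with it: iterating over such a string visits it one character at a time.

```python
import numpy as np

row = np.array([11234466, 77777777, 12345678, 23452345])

# Converting the whole array gives a single string such as
# "[11234466 77777777 12345678 23452345]".
as_text = str(row)

# The string is already longer than 10 characters, so zfill(10) is a no-op.
unchanged = as_text.zfill(10)

# Iterating over the string pads each character separately,
# starting with the opening bracket.
per_char = [ch.zfill(10) for ch in as_text][:3]

# Converting element by element pads each number as intended.
padded = [str(value).zfill(10) for value in row]

print(unchanged == as_text)
print(per_char)
print(padded)
```

If the column has already been converted with `astype(str)`, the original arrays need to be restored (or the DataFrame rebuilt) before applying the list comprehension.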

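Edit:

import numpy as np
import pandas as pd

# Each cell holds a list of one-element lists, so one more level of nesting is needed
data = np.array([[11234466], [77777777], [12345678], [23452345]])
data1 = np.array([[99999999], [66666666], [44332211], [56781234]])
df = pd.DataFrame({'A' : [data.tolist()]})
df1 = pd.DataFrame({'A' : [data1.tolist()]})
df = pd.concat([df, df1])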

df['A'] = [[['{:010d}'.format(z) for z in y] for y in x] for x in df['A']]
print (df)
                                                   A
0  [[0011234466], [0077777777], [0012345678], [00...
0  [[0099999999], [0066666666], [0044332211], [00...
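
If the nesting depth is not known in advance, one option that avoids writing a comprehension per level is to pad the array itself before it goes into the DataFrame. This is only a sketch using `np.char.zfill`, which works elementwise on string arrays of any shape; it is not part of the original answer, and it assumes every value has at most 10 digits, as in the sample data.

```python
import numpy as np

data = np.array([[11234466], [77777777], [12345678], [23452345]])

# Convert the integers to strings, then left-pad every element with
# zeros to width 10, whatever the shape of the array.
padded = np.char.zfill(data.astype(str), 10)

# tolist() turns the result back into nested Python lists, ready to be
# stored in a DataFrame cell.
nested = padded.tolist()
print(nested)
```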