Python package similar to R's bpa (Basic Pattern Analysis)

Time: 2019-08-28 11:45:48

Tags: python r

A really simple question: is there a Python package similar to the bpa package in R?

Link describing what bpa does: Basic Pattern Analysis

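I have a column containing mixed data, and I want to better understand the formats the data comes in. bpa produces the following output (copied from the link I attached):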

messy$Date %>%
  get_pattern %>%  # extract patterns
  table %>%        # tabulate frequencies
  as.data.frame    # display as a data frame

##                    . Freq
## 1         99/99/9999  262
## 2         9999-99-99  259
## 3          99Aaa9999  241
## 4  Aaaaaaaaaw99w9999   19
## 5   Aaaaaaaaw99w9999   56
## 6    Aaaaaaaw99w9999   45
## 7     Aaaaaaw99w9999   24
## 8      Aaaaaw99w9999   36
## 9       Aaaaw99w9999   42
## 10       Aaaw99w9999   16

1 Answer:

Answer 0: (score: 0)

I wrote a Python function similar to bpa's get_pattern function:

import re

import numpy as np
import pandas as pd

def get_pattern(x, show_ws=True, ws_char='<>'):
    # Missing values have no pattern; return NaN for them
    if pd.isnull(x):
        return np.nan

    # Coerce non-string values (numbers, dates, ...) to text
    x = str(x)

    # Collapse each character class to one representative symbol
    x = re.sub("[a-z]", "a", x)
    x = re.sub("[A-Z]", "A", x)
    x = re.sub("[0-9]", "9", x)

    # Optionally make whitespace visible by replacing it with ws_char
    if show_ws:
        x = re.sub(r"\s", ws_char, x)

    return x
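
To get a frequency table like the one in the question, the function can be mapped over a pandas Series and the resulting patterns counted. The following is only a minimal sketch and not part of the original answer: the values in `dates` are made up for illustration, `get_pattern` is restated compactly so the block runs on its own, and `ws_char="w"` is passed so whitespace shows up as `w`, as in the bpa output quoted above.

```python
import re

import numpy as np
import pandas as pd


def get_pattern(x, show_ws=True, ws_char="<>"):
    """Compact restatement of the answer's function, so this block is self-contained."""
    if pd.isnull(x):
        return np.nan
    x = str(x)
    x = re.sub("[a-z]", "a", x)
    x = re.sub("[A-Z]", "A", x)
    x = re.sub("[0-9]", "9", x)
    if show_ws:
        x = re.sub(r"\s", ws_char, x)
    return x


# Hypothetical sample data standing in for messy$Date
dates = pd.Series(
    ["01/02/2019", "2019-01-02", "02Jan2019", "January 02 2019", "12/31/2018", None]
)

# Extract one pattern per value; missing values stay NaN
patterns = dates.map(lambda v: get_pattern(v, ws_char="w"))

# Tabulate pattern frequencies (value_counts skips NaN by default)
freq = patterns.value_counts().rename_axis("pattern").reset_index(name="Freq")
print(freq)
```

Mapping a plain Python function over the Series is simple but runs element by element; for a very large column, chaining `Series.str.replace(..., regex=True)` calls may be faster, at the cost of handling non-string values separately.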