Question

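I am trying to define a function that takes two parameters: df (a dataframe) and an integer (employerID). The function should return the employee's full name.

If the given ID does not belong to any employee, I want to return the string "UNKNOWN". If no middle name is given, return only "LAST, FIRST". If only a middle initial is given, return the full name in the format "LAST, FIRST M.", with the middle initial followed by a '.'.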

def getFullName(df, int1):
    df = pd.read_excel('/home/data/AdventureWorks/Employees.xls')
    newdf = df[(df['EmployeeID'] == int1)]
    print("'" + newdf['LastName'].item() + "," + " " + newdf['FirstName'].item() + " " + newdf['MiddleName'].item() + "." + "'")

getFullName('df', 110)

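I wrote this code, but it raises two problems: 1) If I don't put quotation marks around df, it gives me an error message, but I just want to pass the dataframe as the argument, not a string.

2) This code cannot handle people who have a middle name.

Sorry, I use pd.read_excel to read an Excel file that you cannot access. I know it is hard to test the code without the Excel file; if someone lets me know how to create a random dataframe with column names, I will go ahead and change it. Thanks,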

Answer 1

I created some fake data for this:

   EmployeeID FirstName LastName MiddleName
0           0         a        a          a
1           1         b        b          b
2           2         c        c          c
3           3         d        d          d
4           4         e        e          e
5           5         f        f          f
6           6         g        g          g
7           7         h        h          h
8           8         i        i          i
9           9         j        j       None

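EmployeeID 9 has no middle name, but everyone else does. The way I would do this is to split the logic into two parts. The first handles the case where the EmployeeID cannot be found. The second handles printing the employee's name. That second part should itself have two branches: one for when the employee has a middle name, and one for when they do not. You could probably combine a lot of this into one-line statements, but you would likely sacrifice clarity.

I also removed the pd.read_excel call from the function. If you want to pass a dataframe into the function, the dataframe should be created outside of it.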

import pandas as pd

def getFullName(df, int1):
    newdf = df[(df['EmployeeID'] == int1)]

    # if the dataframe is empty, then we can't find the given ID
    # otherwise, go ahead and print out the employee's info
    if newdf.empty:
        print("UNKNOWN")
        return "UNKNOWN"
    else:
        # all strings will start with the LastName and FirstName
        # we will then add the MiddleName if it's present
        # and then we can end the string with the final '
        s = "'" + newdf['LastName'].item() + ", " + newdf['FirstName'].item()
        middle = newdf['MiddleName'].item()
        # pd.notna guards against NaN, which is how a blank Excel cell is read
        if pd.notna(middle) and middle:
            s = s + " " + middle + "."
        s = s + "'"
        print(s)
        return s

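I have the function return the value in case you want to manipulate the string further. But that's just me.

If you run getFullName(df, 1), you should get 'b, b b.'. For getFullName(df, 9), you should get 'j, j'.

So, in full, it would be: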

df = pd.read_excel('/home/data/AdventureWorks/Employees.xls')
getFullName(df, 1)  #outputs 'b, b b.'
getFullName(df, 9)  #outputs 'j, j'
getFullName(df, 10) #outputs UNKNOWN
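
As for why the quoted 'df' in the question seemed to work: the original function reassigned its own df parameter, so whatever the caller passed in was ignored, and the unquoted df most likely failed because no variable of that name existed at the call site. Here is a minimal sketch of that behaviour without pandas. The names load_table, get_name_shadowed and get_name are hypothetical, and load_table only stands in for pd.read_excel:

```python
def load_table():
    # Hypothetical stand-in for pd.read_excel(...): returns a tiny lookup table
    return {110: "Doe, Jane"}

def get_name_shadowed(table, employee_id):
    # Reassigning the parameter discards whatever the caller passed in
    table = load_table()
    return table.get(employee_id, "UNKNOWN")

def get_name(table, employee_id):
    # Uses the table supplied by the caller
    return table.get(employee_id, "UNKNOWN")

# The string 'df' is accepted only because it is never actually used
shadowed_result = get_name_shadowed('df', 110)

# Once the reassignment is gone, the caller creates the table and passes it in
table = load_table()
direct_result = get_name(table, 110)
```

Both calls return the same name, but only the second one really depends on its first argument.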

Fake data:

d = {'EmployeeID' : [0,1,2,3,4,5,6,7,8,9],
     'FirstName' : ['a','b','c','d','e','f','g','h','i','j'],
     'LastName' : ['a','b','c','d','e','f','g','h','i','j'],
     'MiddleName' : ['a','b','c','d','e','f','g','h','i',None]}
df = pd.DataFrame(d)
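
The question also asks for a trailing period only when the middle name is a single initial, which the function in this answer does not distinguish. Below is one possible variant, offered as a sketch rather than a definitive version. The name get_full_name and the sample frame people are hypothetical. It assumes a missing middle name shows up as None or NaN, that a full middle name should be printed without a period (the question does not say), and that the first match is used if an ID repeats:

```python
import pandas as pd

def get_full_name(df, employee_id):
    rows = df[df['EmployeeID'] == employee_id]
    if rows.empty:
        return "UNKNOWN"
    row = rows.iloc[0]  # assumption: take the first match if IDs repeat
    name = f"{row['LastName']}, {row['FirstName']}"
    middle = row['MiddleName']
    # Treat None, NaN and blank strings as "no middle name"
    if pd.isna(middle) or not str(middle).strip():
        return name
    middle = str(middle).strip()
    if len(middle) == 1:
        # A lone initial gets a trailing period
        return f"{name} {middle}."
    # Assumption: a full middle name is shown as-is, without a period
    return f"{name} {middle}"

# Hypothetical sample data covering the three cases
people = pd.DataFrame({
    'EmployeeID': [1, 2, 3],
    'FirstName': ['Ann', 'Bob', 'Cy'],
    'LastName': ['Lee', 'Ray', 'Orr'],
    'MiddleName': ['M', 'James', None],
})
```

This version returns the string without printing it or wrapping it in quotes, so the caller decides how to display it.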
