How to fix "AttributeError: 'RDD' object has no attribute 'rfind'"?

Time: 2019-05-02 16:25:32

Tags: python email pyspark

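I am working on code that attaches a file stored in HDFS and sends it by email. The code already works for files in a local folder (my Linux home directory), but when I change the attachment location to an HDFS path I get the error "AttributeError: 'RDD' object has no attribute 'rfind'". Can anyone help?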

I changed the encoding to

part = MIMEApplication("".join(f.collect()).encode('utf-8').strip(), Name=basename(f))

and also tried

part = MIMEApplication(u"".join(f.collect()), Name=basename(f))

but I still get the same error.

Here is my code:

import smtplib
import traceback
from os.path import basename
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import COMMASPACE, formatdate

def success_mail():
    sender = "no-reply@company.com"
    receivers = 'user@company.com'
    msg = MIMEMultipart()
    msg.attach(MIMEText("Scoring completed. Attached is the latest report"))
    f=sc.textFile("/user/userid/folder/report_20190501.csv")
    part = MIMEApplication("".join(f.collect()).encode('utf-8', 'ignore'), Name=basename(f))
    part['Content-Disposition'] = 'attachment; filename="%s"' % basename(f)
    msg.attach(part)
    try:
        smtp = smtplib.SMTP('smtp.company.com')
        smtp.sendmail(sender, receivers, msg.as_string())
        smtp.close()
        logMessage("INFO - Successfully sent email with Attachment")
    except Exception:
        emsg =  traceback.format_exc()
        logMessage("ERROR -  Unable to send email because of :"+emsg)

1 answer:

Answer 0: (score: 1)

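basename() expects a path string, but f here is an RDD, which is why the call fails. Keep the path in its own variable and pass that to basename() instead. Try replacing these 3 lines: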

f=sc.textFile("/user/userid/folder/report_20190501.csv")
part = MIMEApplication("".join(f.collect()).encode('utf-8', 'ignore'), Name=basename(f))
part['Content-Disposition'] = 'attachment; filename="%s"' % basename(f)

with:

file_path="/user/userid/folder/report_20190501.csv"
f=sc.textFile(file_path)
part = MIMEApplication("".join(f.collect()).encode('utf-8', 'ignore'), Name=basename(file_path))
part['Content-Disposition'] = 'attachment; filename="%s"' % basename(file_path)
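
As a rough sketch of the same fix, the attachment-building step can be pulled into a small helper so it can be checked without a Spark cluster. The function name build_attachment is hypothetical and not part of the original code. One further point that I believe applies: textFile() returns lines without their trailing newlines, so joining with "" would merge all CSV rows into a single line. The sketch joins with "\n" instead, which is an assumption about what the report should look like.

```python
from os.path import basename
from email.mime.application import MIMEApplication


def build_attachment(lines, file_path):
    """Hypothetical helper: build a MIME attachment from text lines.

    `lines` stands in for what sc.textFile(file_path).collect() returns:
    a list of strings, one per line, without trailing newlines.
    `file_path` is the plain path string, not the RDD.
    """
    # Rejoin with "\n" so the CSV rows stay on separate lines (assumption).
    payload = "\n".join(lines).encode("utf-8")
    # basename() is called on the path string, never on the RDD.
    name = basename(file_path)
    part = MIMEApplication(payload, Name=name)
    part["Content-Disposition"] = 'attachment; filename="%s"' % name
    return part
```

In the original function this would be used as build_attachment(sc.textFile(file_path).collect(), file_path). Note that collect() pulls the whole file onto the driver, which is fine for a small report but not for large files.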