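I wrote this Python script to search a mailbox for unseen messages, download their xlsx attachments, make some modifications to them, and then post them to another service. Everything works perfectly except for one issue: the original xlsx file has a column named "zona" that contains the two-letter Italian province code. If this value is "NA" (the code for the province of Naples), the corresponding cell in the saved xlsx file is empty instead of containing NA. Is NA a reserved word, and if so, is there a way to quote it?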
import os,email,imaplib,socket,requests
import http.client  # needed for the debuglevel setting below
import pandas as pd

mail_user = os.environ.get('MAIL_USER')
mail_password = os.environ.get('MAIL_PASS')
mail_server = os.environ.get('MAIL_SERVER')
detach_dir = '.'
url=<removed url>

if mail_user is None or mail_password is None or mail_server is None:
    print ('VARIABILI DI AMBIENTE NON DEFINITE')
    exit(1)

try:
    with imaplib.IMAP4_SSL(mail_server) as m:
        try:
            m.login(mail_user,mail_password)
            m.select("INBOX")
            resp, items = m.search(None, "UNSEEN")
            items = items[0].split()
            for emailid in items:
                resp, data = m.fetch(emailid, "(RFC822)")
                email_body = data[0][1] # getting the mail content
                mail = email.message_from_bytes(email_body) # parsing the mail content to get a mail object
                if mail.get_content_maintype() != 'multipart':
                    continue
                for part in mail.walk():
                    if part.get_content_maintype() == 'multipart':
                        continue
                    if part.get('Content-Disposition') is None:
                        continue
                    filename = part.get_filename()
                    # get_filename() can return None, so guard before endswith()
                    if filename and filename.endswith('.xlsx'):
                        att_path = os.path.join(detach_dir, filename)
                        fp = open(att_path, 'wb')
                        fp.write(part.get_payload(decode=True))
                        fp.close()
                        xl = pd.ExcelFile(att_path)
                        df1 = xl.parse(sheet_name=0)
                        df1 = df1.replace({'\'':''}, regex=True)
                        df1.loc[df1['Prodotto'] == 'SP_TABLETA_SAMSUNG','Cod. ID.'] = 'X'
                        df1.loc[df1['Prodotto'] == 'AP_TLC','Cod. ID.'] = 'X'
                        df1.loc[df1['Prodotto'] == 'APDCMB00003','Cod. ID.'] = 'X'
                        df1.loc[df1['Prodotto'] == 'APDCMB03252','Cod. ID.'] = 'X'
                        writer = pd.ExcelWriter(att_path, engine='xlsxwriter')
                        df1.to_excel(writer, sheet_name='Foglio1', index=False)
                        writer.close()  # writer.save() was removed in pandas 2.0; close() also saves
                        uf = {'files': open(att_path, 'rb')}
                        http.client.HTTPConnection.debuglevel = 0
                        r = requests.post(url, files=uf)
                        print (r.text)
        except imaplib.IMAP4_SSL.error as e:
            print (e)
            exit(1)
except imaplib.IMAP4.error:
    print ("Errore di connessione al server")
    exit(1)
Answer 0: (score: 0)
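NA is not a reserved word in Python, but pandas treats the string NA as a missing-value marker when it reads the file, so the cell is loaded as NaN. When you write to Excel, NaN is written as '' by default (see the docs). You can pass na_rep='NA' to the to_excel() function to write it out as a string instead: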
df1.to_excel(writer, sheet_name='Foglio1', index=False, na_rep='NA')
Be aware, however, that any other NaN values present in the DataFrame will also be written to the Excel file as 'NA'.
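To make that caveat concrete, here is a minimal sketch. It uses an in-memory CSV instead of an xlsx file so that it runs without an Excel engine, on the assumption that to_csv() handles na_rep the same way to_excel() does. The sample rows are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical sample data: the first row is Naples ("NA"),
# the second row has a genuinely empty zona cell.
csv_text = "zona,Prodotto\nNA,AP_TLC\n,APDCMB00003\nMI,AP_TLC\n"

# With the default settings, both "NA" and the empty cell are parsed as NaN.
df = pd.read_csv(io.StringIO(csv_text))
missing_count = int(df["zona"].isna().sum())

# na_rep is applied to every NaN on output, so the empty cell
# is written as "NA" too, not only the Naples row.
out = df.to_csv(index=False, na_rep="NA")
print(out)
```

If the column can legitimately contain empty cells, this approach makes them indistinguishable from Naples in the output.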
Answer 1: (score: 0)
After reading the docs that @Matt B linked to, I found this solution:
df1 = xl.parse(sheet_name=0, keep_default_na=False, na_values=['_'])
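If I understand correctly, this turns off the default list of missing-value markers, so only _ is interpreted as "not available".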
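A quick way to check that understanding is the sketch below. It again reads an in-memory CSV rather than an xlsx file, assuming read_csv() interprets keep_default_na and na_values the same way ExcelFile.parse() does. The sample rows are hypothetical.

```python
import io

import pandas as pd

# Hypothetical sample data: "NA" is Naples, "_" marks a missing zona.
csv_text = "zona,Prodotto\nNA,AP_TLC\n_,APDCMB00003\nMI,AP_TLC\n"

# Disable the built-in NA markers and declare "_" as the only one.
df = pd.read_csv(
    io.StringIO(csv_text),
    keep_default_na=False,
    na_values=["_"],
)

zona_values = df["zona"].tolist()
print(zona_values)  # "NA" stays a string; only "_" is read as missing
```

Because "NA" survives the read as an ordinary string, it is written back unchanged and na_rep is no longer needed.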