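How can I connect to Gmail and determine which messages have attachments? I then want to download each attachment, printing out the Subject: and From: for each message as I process it.
Answer 0 (score: 151)
Hard one :-)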
# Python 2 script (raw_input, print statement)
import email, getpass, imaplib, os
detach_dir = '.' # directory where to save attachments (default: current)
user = raw_input("Enter your GMail username:")
pwd = getpass.getpass("Enter your password: ")
# connecting to the gmail imap server
m = imaplib.IMAP4_SSL("imap.gmail.com")
m.login(user, pwd)
m.select("[Gmail]/All Mail") # here you can choose a mailbox like INBOX instead
# use m.list() to get all the mailboxes
resp, items = m.search(None, "ALL") # you could filter using the IMAP rules here (check http://www.example-code.com/csharp/imap-search-critera.asp)
items = items[0].split() # getting the mail ids
counter = 1 # numbers the attachments that have no filename
for emailid in items:
    resp, data = m.fetch(emailid, "(RFC822)") # fetching the mail, "(RFC822)" means "get the whole stuff", but you can ask for headers only, etc
    email_body = data[0][1] # getting the mail content
    mail = email.message_from_string(email_body) # parsing the mail content to get a mail object
    # check if there are any attachments at all
    if mail.get_content_maintype() != 'multipart':
        continue
    print "[%s] : %s" % (mail["From"], mail["Subject"])
    # we use walk to create a generator so we can iterate on the parts and forget about the recursive headache
    for part in mail.walk():
        # multipart parts are just containers, so we skip them
        if part.get_content_maintype() == 'multipart':
            continue
        # is this part an attachment?
        if part.get('Content-Disposition') is None:
            continue
        filename = part.get_filename()
        # if there is no filename, we create one with a counter to avoid duplicates
        if not filename:
            filename = 'part-%03d.bin' % counter
            counter += 1
        att_path = os.path.join(detach_dir, filename)
        # check if it is already there
        if not os.path.isfile(att_path):
            # finally write the stuff
            fp = open(att_path, 'wb')
            fp.write(part.get_payload(decode=True))
            fp.close()
Wowww! That was something. ;-) But try the same in Java, just for fun!
By the way, I tested this in a shell, so some errors likely remain.
Enjoy
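EDIT:
Because mailbox names can vary from one country to another, I recommend calling m.list() and picking an item from it before calling m.select("the mailbox name"), to avoid this error:
imaplib.error: command SEARCH illegal in state AUTH, only allowed in states SELECTED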
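The advice to call m.list() first can be sketched as follows. This is a minimal Python 3 sketch, not part of the original answer: the helper names (parse_list_line, find_mailbox_by_flag, quote_mailbox) are hypothetical, the parser handles only simple response lines, and the assumption that Gmail marks its "All Mail" folder with the \All flag should be checked against what your own account returns.

```python
import re

# One LIST response line looks like: (<flags>) "<delimiter>" <mailbox name>
LIST_LINE = re.compile(rb'\((?P<flags>[^)]*)\) (?P<delim>"[^"]*"|NIL) (?P<name>.+)')


def parse_list_line(line):
    """Split one LIST response line into (flags, mailbox_name)."""
    match = LIST_LINE.match(line)
    if match is None:
        raise ValueError("unrecognized LIST line: %r" % (line,))
    flags = match.group("flags").decode("ascii").split()
    name = match.group("name").decode("ascii")
    if len(name) >= 2 and name.startswith('"') and name.endswith('"'):
        name = name[1:-1]  # simplification: escaped quotes are not unescaped
    return flags, name


def find_mailbox_by_flag(list_lines, flag="\\All"):
    """Return the first mailbox name carrying the given flag, or None."""
    for line in list_lines:
        if not isinstance(line, bytes):
            continue  # skip None and literal tuples in this sketch
        flags, name = parse_list_line(line)
        if flag in flags:
            return name
    return None


def quote_mailbox(name):
    """Quote a mailbox name so that names with spaces survive m.select()."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'
```

With a logged-in connection m, this could be used as `typ, lines = m.list()` followed by `m.select(quote_mailbox(find_mailbox_by_flag(lines) or "INBOX"))`. Selecting a mailbox before searching is what avoids the SEARCH-in-state-AUTH error quoted above.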
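Answer 1 (score: 9)
I'm no Perl expert, but what I do know is that GMail supports IMAP and POP3, two completely standard protocols that allow you to do just that.
Maybe that helps you get started.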
Answer 2 (score: 8)
#!/usr/bin/env python
"""Save all attachments for given gmail account."""
import os, sys
from libgmail import GmailAccount
ga = GmailAccount("your.account@gmail.com", "pA$$w0Rd_")
ga.login()
# folders: inbox, starred, all, drafts, sent, spam
for thread in ga.getMessagesByFolder('all', allPages=True):
    for msg in thread:
        sys.stdout.write('.')
        if msg.attachments:
            print "\n", msg.id, msg.number, msg.subject, msg.sender
            for att in msg.attachments:
                if att.filename and att.content:
                    attdir = os.path.join(thread.id, msg.id)
                    if not os.path.isdir(attdir):
                        os.makedirs(attdir)
                    with open(os.path.join(attdir, att.filename), 'wb') as f:
                        f.write(att.content)
Untested.
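Answer 3 (score: 7)
Getting attachments
There are two ways to get an attachment:
1 -> By passing a reference to a specific attachment returned by get_indv_email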
# Creates an array of references to every attachment in your account
my $messages = $gmail->get_messages();
my @attachments;
foreach ( @{ $messages } ) {
    my $email = $gmail->get_indv_email( msg => $_ );
    if ( defined( $email->{ $_->{ 'id' } }->{ 'attachments' } ) ) {
        foreach ( @{ $email->{ $_->{ 'id' } }->{ 'attachments' } } ) {
            push( @attachments, $gmail->get_attachment( attachment => $_ ) );
            if ( $gmail->error() ) {
                print $gmail->error_msg();
            }
        }
    }
}
2 -> Or by passing the attachment ID and the message ID
# retrieve a specific attachment
my $msgid = 'F000000000';
my $attachid = '0.1';
my $attach_ref = $gmail->get_attachment( attid => $attachid, msgid => $msgid );
(Returns a reference to a scalar that holds the data from the attachment.)
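Answer 4 (score: 4)
Within Gmail, you can filter on "has:attachment"; use it to identify the messages you should be getting when testing. Note that this appears to return both messages with attached files (paperclip icon shown) and messages with inline attached images (no paperclip shown).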
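The same "has:attachment" filter can, as far as I know, also be used over IMAP through Gmail's X-GM-RAW search extension. The sketch below is an assumption-laden illustration in Python 3: the function name find_messages_with_attachments is hypothetical, conn is expected to be an already logged-in imaplib.IMAP4_SSL connection, and the X-GM-RAW key only works against Gmail, not generic IMAP servers.

```python
def find_messages_with_attachments(conn, mailbox="INBOX"):
    """Return the message ids in `mailbox` that Gmail reports for has:attachment.

    `conn` is assumed to be a logged-in imaplib.IMAP4_SSL connection to Gmail.
    """
    typ, _ = conn.select(mailbox, readonly=True)
    if typ != "OK":
        raise RuntimeError("could not select mailbox %r" % (mailbox,))
    # X-GM-RAW is a Gmail-specific search key that takes Gmail search syntax.
    typ, data = conn.search(None, "X-GM-RAW", '"has:attachment"')
    if typ != "OK":
        raise RuntimeError("search failed")
    return data[0].split()
```

Each returned id can then be passed to conn.fetch(msg_id, "(RFC822)") to download and parse that message, so only messages with attachments are transferred.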
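There is no Gmail API, so IMAP or POP are your only real options. The JavaMail API may be of some help, as may this very terse article on downloading attachments from IMAP using Perl. Some previous questions here on SO may also help.
This PHP example may help too. Unfortunately, from what I can see, imap_header contains no attachment information, so the body has to be downloaded in order to see the X-Attachment-Id field. (Someone please prove me wrong.)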
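One possible way around downloading the whole body is to ask the IMAP server for BODYSTRUCTURE, which describes the MIME parts of a message without transferring their content. The following Python 3 sketch is only a heuristic under stated assumptions: the function names are hypothetical, conn is an imaplib connection with a mailbox already selected, and instead of parsing the structure properly it simply looks for an "attachment" disposition in the raw response.

```python
def has_attachment(bodystructure):
    """Heuristic: True if a BODYSTRUCTURE response mentions an attachment disposition."""
    return b'"attachment"' in bodystructure.lower()


def messages_with_attachments(conn, msg_ids):
    """Yield the ids whose BODYSTRUCTURE suggests an attachment.

    `conn` is assumed to be an imaplib connection with a mailbox selected.
    """
    for msg_id in msg_ids:
        typ, data = conn.fetch(msg_id, "(BODYSTRUCTURE)")
        if typ != "OK":
            continue
        # imaplib may return bytes items or (header, literal) tuples; join both.
        raw = b" ".join(
            b" ".join(item) if isinstance(item, tuple) else item
            for item in data
            if item is not None
        )
        if has_attachment(raw):
            yield msg_id
```

Inline images are usually sent with an "inline" disposition, so this check would skip them; a real implementation would parse the nested structure rather than match on a substring.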
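Answer 5 (score: 3)
If any of you have updated to Python 3.3: I took the 2.7 script from HERE and updated it to 3.3. I also fixed some issues with the way Gmail returns the information.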
# Something along the lines of http://stackoverflow.com/questions/348630/how-can-i-download-all-emails-with-attachments-from-gmail
# Make sure you have IMAP enabled in your gmail settings.
# Right now it won't download the same file name twice even if the contents are different.
import email
import getpass, imaplib
import os
import time
detach_dir = '.'
if 'attachments' not in os.listdir(detach_dir):
    os.mkdir(os.path.join(detach_dir, 'attachments'))
userName = input('Enter your GMail username:\n')
passwd = getpass.getpass('Enter your password:\n')
try:
    imapSession = imaplib.IMAP4_SSL('imap.gmail.com', 993)
    typ, accountDetails = imapSession.login(userName, passwd)
    if typ != 'OK':
        raise RuntimeError('Not able to sign in!')
    imapSession.select('Inbox')
    typ, data = imapSession.search(None, 'ALL')
    if typ != 'OK':
        raise RuntimeError('Error searching Inbox.')
    # Iterating over all emails
    for msgId in data[0].split():
        typ, messageParts = imapSession.fetch(msgId, '(RFC822)')
        if typ != 'OK':
            raise RuntimeError('Error fetching mail.')
        emailBody = messageParts[0][1]
        # Gmail as of now returns bytes; if it ever goes back to str,
        # use email.message_from_string(emailBody) instead.
        mail = email.message_from_bytes(emailBody)
        for part in mail.walk():
            # multipart parts are only containers
            if part.get_content_maintype() == 'multipart':
                continue
            # parts without a Content-Disposition are not attachments
            if part.get('Content-Disposition') is None:
                continue
            fileName = part.get_filename()
            if bool(fileName):
                filePath = os.path.join(detach_dir, 'attachments', fileName)
                if not os.path.isfile(filePath):
                    print(fileName)
                    fp = open(filePath, 'wb')
                    fp.write(part.get_payload(decode=True))
                    fp.close()
    imapSession.close()
    imapSession.logout()
except Exception as exc:
    print('Not able to download all attachments:', exc)
    time.sleep(3)
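The script above skips a file when its name already exists, even if the contents differ. A possible alternative is sketched below in Python 3 with the newer email API (policy.default), which also decodes encoded Subject and From headers for printing. The helper names unique_path and save_attachments are hypothetical, and only parts with an explicit "attachment" disposition are saved, so inline images are skipped.

```python
import email
import os
from email import policy


def unique_path(directory, filename):
    """Return a path in `directory` that does not exist yet, adding -1, -2, ... if needed."""
    # basename() drops any directory components a sender may have put in the name
    name = os.path.basename(filename or "") or "attachment.bin"
    base, ext = os.path.splitext(name)
    candidate = os.path.join(directory, name)
    counter = 1
    while os.path.exists(candidate):
        candidate = os.path.join(directory, "%s-%d%s" % (base, counter, ext))
        counter += 1
    return candidate


def save_attachments(raw_bytes, directory):
    """Parse one raw RFC822 message, print its headers, save its attachments."""
    msg = email.message_from_bytes(raw_bytes, policy=policy.default)
    print("From:", msg["From"])
    print("Subject:", msg["Subject"])
    saved = []
    for part in msg.walk():
        if not part.is_attachment():
            continue
        path = unique_path(directory, part.get_filename())
        with open(path, "wb") as fp:
            fp.write(part.get_payload(decode=True))
        saved.append(path)
    return saved
```

In the loop above, the call would be something like save_attachments(emailBody, os.path.join(detach_dir, 'attachments')) in place of the inner walk() loop.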
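Answer 6 (score: 3)
This question is quite old; at the time it was asked, the Gmail API was not available. Google now provides the Gmail API, which can be used instead of IMAP. See Google's Gmail API here. Also see google-api-python-client on PyPI.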
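As a rough illustration of the Gmail API route, the sketch below lists messages matching "has:attachment" and downloads their attachments. It is a sketch under assumptions, not a tested integration: it expects an already authorized service object (for example one built with google-api-python-client's googleapiclient.discovery.build("gmail", "v1", ...)), the function names are hypothetical, OAuth setup is left out, only the first page of results is read, and parts without an attachmentId are skipped.

```python
import base64


def iter_parts(payload):
    """Yield a message payload and all of its nested parts."""
    yield payload
    for child in payload.get("parts", []):
        yield from iter_parts(child)


def iter_gmail_attachments(service, query="has:attachment"):
    """Yield (message_id, filename, data) for the first page of matches.

    `service` is assumed to be an authorized Gmail API client object.
    """
    messages = service.users().messages()
    listing = messages.list(userId="me", q=query).execute()
    for ref in listing.get("messages", []):
        message = messages.get(userId="me", id=ref["id"]).execute()
        for part in iter_parts(message.get("payload", {})):
            attachment_id = part.get("body", {}).get("attachmentId")
            if not part.get("filename") or not attachment_id:
                continue
            blob = messages.attachments().get(
                userId="me", messageId=ref["id"], id=attachment_id
            ).execute()
            # The API returns attachment bytes as URL-safe base64 text.
            yield ref["id"], part["filename"], base64.urlsafe_b64decode(blob["data"])
```

Each yielded tuple can be written to disk with an ordinary open(path, "wb") call; the Subject and From values are available in the "headers" list of the same message payload.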
Answer 7 (score: 2)
/*based on http://www.codejava.net/java-ee/javamail/using-javamail-for-searching-e-mail-messages*/
package getMailsWithAtt;
import java.io.File;
import java.io.IOException;
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Properties;
import javax.mail.Address;
import javax.mail.Folder;
import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.Multipart;
import javax.mail.NoSuchProviderException;
import javax.mail.Part;
import javax.mail.Session;
import javax.mail.Store;
import javax.mail.internet.MimeBodyPart;
import javax.mail.search.AndTerm;
import javax.mail.search.SearchTerm;
import javax.mail.search.ReceivedDateTerm;
import javax.mail.search.ComparisonTerm;
public class EmailReader {
    private String saveDirectory;
    /**
     * Sets the directory where attached files will be stored.
     *
     * @param dir
     *            absolute path of the directory
     */
    public void setSaveDirectory(String dir) {
        this.saveDirectory = dir;
    }
    /**
     * Downloads the messages received between startDate and endDate and
     * saves their attachments to disk, if any.
     *
     * @param host
     * @param port
     * @param userName
     * @param password
     * @param startDate only messages received after this date are fetched
     * @param endDate only messages received before this date are fetched
     */
    public void downloadEmailAttachments(String host, String port,
            String userName, String password, Date startDate, Date endDate) {
        Properties props = System.getProperties();
        props.setProperty("mail.store.protocol", "imaps");
        try {
            Session session = Session.getDefaultInstance(props, null);
            Store store = session.getStore("imaps");
            store.connect(host, Integer.parseInt(port), userName, password);
            // ...
            Folder inbox = store.getFolder("INBOX");
            inbox.open(Folder.READ_ONLY);
            SearchTerm newerThan = new ReceivedDateTerm(ComparisonTerm.GT, startDate);
            SearchTerm olderThan = new ReceivedDateTerm(ComparisonTerm.LT, endDate);
            SearchTerm andTerm = new AndTerm(newerThan, olderThan);
            //Message[] arrayMessages = inbox.getMessages(); <--get all messages
            Message[] arrayMessages = inbox.search(andTerm);
            for (int i = arrayMessages.length; i > 0; i--) { //from newer to older
                Message msg = arrayMessages[i - 1];
                Address[] fromAddress = msg.getFrom();
                String from = fromAddress[0].toString();
                String subject = msg.getSubject();
                String sentDate = msg.getSentDate().toString();
                String receivedDate = msg.getReceivedDate().toString();
                String contentType = msg.getContentType();
                String messageContent = "";
                // store attachment file names, separated by commas
                String attachFiles = "";
                if (contentType.contains("multipart")) {
                    // content may contain attachments
                    Multipart multiPart = (Multipart) msg.getContent();
                    int numberOfParts = multiPart.getCount();
                    for (int partCount = 0; partCount < numberOfParts; partCount++) {
                        MimeBodyPart part = (MimeBodyPart) multiPart
                                .getBodyPart(partCount);
                        if (Part.ATTACHMENT.equalsIgnoreCase(part
                                .getDisposition())) {
                            // this part is an attachment
                            String fileName = part.getFileName();
                            attachFiles += fileName + ", ";
                            part.saveFile(saveDirectory + File.separator + fileName);
                        } else {
                            // this part may be the message content
                            messageContent = part.getContent().toString();
                        }
                    }
                    if (attachFiles.length() > 1) {
                        attachFiles = attachFiles.substring(0,
                                attachFiles.length() - 2);
                    }
                } else if (contentType.contains("text/plain")
                        || contentType.contains("text/html")) {
                    Object content = msg.getContent();
                    if (content != null) {
                        messageContent = content.toString();
                    }
                }
                // print out details of each message
                System.out.println("Message #" + i + ":");
                System.out.println("\t From: " + from);
                System.out.println("\t Subject: " + subject);
                System.out.println("\t Sent: " + sentDate);
                System.out.println("\t Received: " + receivedDate);
                System.out.println("\t Message: " + messageContent);
                System.out.println("\t Attachments: " + attachFiles);
            }
            // disconnect
            inbox.close(false);
            store.close();
        } catch (NoSuchProviderException e) {
            e.printStackTrace();
            System.exit(1);
        } catch (MessagingException e) {
            e.printStackTrace();
            System.exit(2);
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
    /**
     * Runs this program against the Gmail IMAP server.
     * @throws ParseException
     */
    public static void main(String[] args) throws ParseException {
        String host = "imap.gmail.com";
        String port = "993";
        String userName = "user@gmail.com";
        String password = "pass";
        Date startDate = new SimpleDateFormat("yyyy-MM-dd").parse("2014-06-01");
        Date endDate = new SimpleDateFormat("yyyy-MM-dd").parse("2014-06-30");
        String saveDirectory = "C:\\Temp";
        EmailReader receiver = new EmailReader();
        receiver.setSaveDirectory(saveDirectory);
        receiver.downloadEmailAttachments(host, port, userName, password, startDate, endDate);
    }
}
Maven dependency:
<dependency>
    <groupId>com.sun.mail</groupId>
    <artifactId>javax.mail</artifactId>
    <version>1.5.1</version>
</dependency>
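Answer 8 (score: 1)
Since Gmail supports the standard protocols POP and IMAP, any platform, tool, application, component, or API that provides the client side of either protocol should work.
I suggest doing a Google search for your favorite language/platform (e.g., "python"), plus "pop", plus "imap", plus perhaps "open source", plus perhaps "download" or "review", and seeing what options you get.
There are numerous free applications and components; pick a few that seem worthy, check for reviews, then download and enjoy.
Answer 9 (score: 1)
You should be aware that you need SSL to connect to GMail, for both POP3 and IMAP. (This is of course also true for their SMTP servers, apart from port 25, but that's another story.)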
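In Python, the SSL requirement maps onto the SSL variants of the standard-library clients. The sketch below is illustrative only: the function names open_imap and open_pop3 are hypothetical, the host names and ports are the ones that appear elsewhere in this thread, and nothing here was run against a live account.

```python
import imaplib
import poplib
import ssl

# Host names and SSL ports as used elsewhere in this thread.
GMAIL_IMAP = ("imap.gmail.com", 993)
GMAIL_POP3 = ("pop.gmail.com", 995)


def open_imap(user, password):
    """Open an IMAP-over-SSL connection and log in."""
    context = ssl.create_default_context()  # verifies the server certificate
    conn = imaplib.IMAP4_SSL(GMAIL_IMAP[0], GMAIL_IMAP[1], ssl_context=context)
    conn.login(user, password)
    return conn


def open_pop3(user, password):
    """Open a POP3-over-SSL connection and log in."""
    context = ssl.create_default_context()
    conn = poplib.POP3_SSL(GMAIL_POP3[0], GMAIL_POP3[1], context=context)
    conn.user(user)
    conn.pass_(password)
    return conn
```

The plain imaplib.IMAP4 and poplib.POP3 classes connect without encryption, which is why the SSL classes are the ones to use here.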
Answer 10 (score: 1)
Here's something I wrote in Groovy (a dynamic language for the Java platform) to download my bank statements.
import javax.mail.*
import java.util.Properties
String gmailServer
int gmailPort
def user, password, LIMIT
def inboxFolder, root, StartDate, EndDate
// Downloads all attachments from a gmail mail box as per some criteria
// to a specific folder
// Based on code from
// http://agileice.blogspot.com/2008/10/using-groovy-to-connect-to-gmail.html
// http://stackoverflow.com/questions/155504/download-mail-attachment-with-java
//
// Requires:
// java mail jars in the class path (mail.jar and activation.jar)
// openssl, with gmail certificate added to java keystore (see agileice blog)
//
// further improvement: maybe findAll could be used to filter messages
// subject could be added as another criteria
////////////////////// <CONFIGURATION> //////////////////////
// Maximum number of emails to access in case the parameter range is too high
LIMIT = 10000
// gmail credentials
gmailServer = "imap.gmail.com"
gmailPort = 993
user = "gmailuser@gmail.com"
password = "gmailpassword"
// gmail label, or "INBOX" for inbox
inboxFolder = "finance"
// local file system where the attachment files need to be stored
root = "D:\\AttachmentStore"
// date range dd-MM-yyyy
StartDate= "31-12-2009"
EndDate = "1-6-2010"
////////////////////// </CONFIGURATION> //////////////////////
StartDate = Date.parse("dd-MM-yyyy", StartDate)
EndDate = Date.parse("dd-MM-yyyy", EndDate)
Properties props = new Properties();
props.setProperty("mail.store.protocol", "imaps");
props.setProperty("mail.imaps.host", gmailServer);
props.setProperty("mail.imaps.port", gmailPort.toString());
props.setProperty("mail.imaps.partialfetch", "false");
def session = javax.mail.Session.getDefaultInstance(props,null)
def store = session.getStore("imaps")
store.connect(gmailServer, user, password)
int i = 0;
def folder = store.getFolder(inboxFolder)
folder.open(Folder.READ_ONLY)
for(def msg : folder.messages) {
    //if (msg.subject?.contains("bank Statement"))
    println "[$i] From: ${msg.from} Subject: ${msg.subject} -- Received: ${msg.receivedDate}"
    if (msg.receivedDate < StartDate || msg.receivedDate > EndDate) {
        println "Ignoring due to date range"
        continue
    }
    if (msg.content instanceof Multipart) {
        Multipart mp = (Multipart)msg.content;
        for (int j=0; j < mp.count; j++) {
            Part part = mp.getBodyPart(j);
            println " ---- ${part.fileName} ---- ${part.disposition}"
            if (part.disposition?.equalsIgnoreCase(Part.ATTACHMENT)) {
                if (part.content) {
                    def name = msg.receivedDate.format("yyyy_MM_dd") + " " + part.fileName
                    println "Saving file to $name"
                    def f = new File(root, name)
                    //f << part.content
                    try {
                        if (!f.exists())
                            f << part.content
                    }
                    catch (Exception e) {
                        println "*** Error *** $e"
                    }
                }
                else {
                    println "NO Content Found!!"
                }
            }
        }
    }
    if (i++ > LIMIT)
        break;
}
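Answer 11 (score: 0)
Have you taken a look at the GMail 3rd party add-ons listed on Wikipedia?
In particular, PhpGmailDrive is an open source add-on that you may be able to use as-is, or perhaps study for inspiration.
Answer 12 (score: 0)
For Java, you may find G4J of use. It's a set of APIs for communicating with Google Mail from Java (the screenshot on its homepage is a demonstration email client built around it).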