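How to transfer a large text file in Python?

Asked: 2013-11-15 17:42:52

Tags: python sockets

I am learning socket programming and Python. I need to create a server that accepts commands from a client and sends a large text file back to the client. The client then saves the complete text file to its directory. In my code, the server sends the whole text file line by line, but the client only receives 1024 bytes. I am not sure what I need to add so that the client can receive the entire text file from the server and save it to its directory. Could someone look at my code and point me in the right direction? Any help is much appreciated.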

server.py

import socket
import sys
import os
from thread import *

HOST = ''
PORT = 8888

server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print 'Socket created'

try:
    server_socket.bind((HOST, PORT))    #bind to a address(and port)
except socket.error, msg:
    print 'Bind failed. Error Code : ' + str(msg[0]) + ' Message ' + msg[1]
    sys.exit()

print 'Socket bind complete'

#put the socket in listening mode
server_socket.listen(10)     #maximum 10 connections
print 'TCP Server Waiting for client on port ' + str(PORT)

#wait to accept a connection - blocking call
client, addr = server_socket.accept()
#display client information
print 'Connected with ' + addr[0] + ':' + str(addr[1])

try:
    #keep talking with the client
    while 1:
        #Receiving from client
        data = client.recv(1024)
        if not data:
            #empty data means the client closed the connection
            break

        #command: list
        commandlist = data.split()
        if commandlist[0] == 'list':
            reply = 'Directory: ' + os.getcwd()     #current directory
            client.send(reply)
        elif (commandlist[0] == 'get' and len(commandlist) >= 2):
            filename = commandlist[1]
            reply = filename
            #validate filename
            if os.path.exists(filename):
                length = os.path.getsize(filename)
                with open(filename, 'r') as infile:
                    for line in infile:
                        reply = line
                        client.sendall(reply)
                #f = open(filename, 'r')
                #reply = f.read()
                #client.send(piece)
                #f.close()
            else:
                reply = 'File not found'
                client.send(reply)
        elif (commandlist[0] == 'get' and len(commandlist) < 2):
            #the get command needs a filename argument
            reply = 'Provide a filename please'
            client.send(reply)
        else:
            reply = 'Error: Wrong command'
            client.send(reply)


except KeyboardInterrupt:
    print "Exiting gracefully."
finally:
    server_socket.close()

client.py

#Socket client in python

import socket   #for sockets
import sys      #for exit

command = ' '
socksize = 1024

#return a socket descriptor which can be used in other socket related functions
#properties: address family: AF_INET (IP v4)
#properties: type: SOCK_STREAM (connection oriented TCP protocol)

try:
    #create an AF_INET, STREAM socket (TCP)
    client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
except socket.error, msg:               #error handling
    print 'Failed to create socket. Error code: ' + str(msg[0]) + ', Error message: ' + msg[1]
    sys.exit()

print 'Socket Created'

#Get the IP address of the remote host/url
#connect to IP on a certain 'port' using the connect
host = ''       #symbolic name meaning the local host
port = 8888     #arbitrary non-privileged port

try:
    remote_ip = socket.gethostbyname(host)
except socket.gaierror:
    #could not resolve
    print 'Hostname could not be resolved. Exiting'
    sys.exit()
print 'IP address of ' + host + ' is ' + remote_ip

#Connect to remote server
client_socket.connect((remote_ip, port))
print 'Socket Connected to ' + host + ' on ip ' + remote_ip

buf = ''
#Send some data to remote server
while True:
    print 'Enter a command: list or get <filename>'
    command = raw_input()
    if command.strip() == 'quit':
        break
    client_socket.send(command)

    data = client_socket.recv(socksize)
    print data

#Close the socket
client_socket.close()

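4 Answers:

Answer 0 (score: 2)

When you transfer data over TCP, a call to recv is not guaranteed to give you exactly the number of bytes you asked for, but it will never give you more than you asked for. If you ask for 1024, it may return 2 bytes, 200 bytes, or 500 bytes. It may also fill your buffer with the full amount you asked for, but not always. Because file sizes vary, you can send an initial packet (say, 4 bytes) that states the size of the file, so the receiver knows how much to expect. You can then allocate a buffer of that size (or use a 1024-byte buffer) and keep collecting and writing whatever you receive until you reach the size stated at the start.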
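The idea in this paragraph can be sketched roughly as follows (Python 3). The names recv_exact, send_message, and recv_message are illustrative and do not come from the question's code, and the 4-byte big-endian header is only one possible choice of size field:

```python
import socket


def recv_exact(sock, n, bufsize=1024):
    """Keep calling recv until exactly n bytes have arrived."""
    chunks = []
    remaining = n
    while remaining > 0:
        # recv may return fewer bytes than requested, never more
        chunk = sock.recv(min(bufsize, remaining))
        if not chunk:
            raise ConnectionError("connection closed with %d bytes missing" % remaining)
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def send_message(sock, payload):
    """Prefix payload with its length as 4 big-endian bytes, then send both."""
    sock.sendall(len(payload).to_bytes(4, "big") + payload)


def recv_message(sock):
    """Read the 4-byte length first, then exactly that many payload bytes."""
    size = int.from_bytes(recv_exact(sock, 4), "big")
    return recv_exact(sock, size)
```

On the client side of the question, a call such as recv_message(client_socket) would take the place of the single recv(socksize), provided the server sends its reply with send_message.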

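To make things easier, I suggest using some kind of mechanism that helps you transfer the data, such as Google Protocol Buffers.

I would particularly look at Beej's guide to socket programming (especially the recv section). Although it is written for C, it definitely translates to Python.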

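Answer 1 (score: 2)

The problem is that the client has to know the size of the file from the very beginning; otherwise it cannot know how much data it has to read. There are already protocols (such as HTTP) that handle this for you, but you can implement it yourself. Something like this:

server.py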

if os.path.exists(filename):
    length = os.path.getsize(filename)
    client.sendall(convert_to_bytes(length)) # has to be 4 bytes
    with open(filename, 'rb') as infile: # binary mode, so the byte count matches getsize
        d = infile.read(1024)
        while d:
            client.sendall(d) # sendall keeps sending until every byte has gone out
            d = infile.read(1024)

client.py

client_socket.send(command)

if command.strip().startswith("get"):
    size = client_socket.recv(4) # a 4-byte size field: the file must be smaller than 4 GiB
    size = bytes_to_number(size)
    current_size = 0
    buffer = b""
    while current_size < size:
        # ask only for the bytes still missing, so nothing beyond this file is read
        data = client_socket.recv(min(socksize, size - current_size))
        if not data:
            break # the server closed the connection early
        buffer += data
        # you can stream here to disk
        current_size += len(data)
    # you have entire file in memory

Note that this is only an idea, and there are many issues involved.

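Of course, you have to implement the convert_to_bytes and bytes_to_number functions. convert_to_bytes must always return exactly 4 bytes (or some other fixed number), padding with zeros when necessary; otherwise the protocol breaks.

An example of such an implementation is the pair of convert_to_bytes and bytes_to_number functions shown in answer 3 below.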
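As a possible alternative to hand-written helpers, Python's standard struct module can produce the same kind of fixed-width header. This is a minimal sketch (Python 3); pack_size and unpack_size are illustrative names, and the little-endian format "<I" is an assumption chosen to match the low-byte-first order used by convert_to_bytes in answer 3:

```python
import struct

# 4-byte unsigned integer, little-endian (lowest byte first)
HEADER = struct.Struct("<I")


def pack_size(n):
    """Encode n as exactly 4 bytes; struct.error is raised if n does not fit."""
    return HEADER.pack(n)


def unpack_size(b):
    """Decode a 4-byte header produced by pack_size back into an integer."""
    return HEADER.unpack(b)[0]
```

Because the format is fixed at 4 bytes, small values are zero-padded automatically, and a value of 2**32 or more is rejected instead of silently corrupting the protocol. Using "!I" instead would give network (big-endian) byte order.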

Answer 2 (score: 0)

Your problem is here:

while True:
    print 'Enter a command: list or get <filename>'
    command = raw_input()
    if command.strip() == 'quit':
        break
    client_socket.send(command)

    data = client_socket.recv(socksize)
    print data

#Close the socket
client_socket.close()

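Your recv call has a maximum size, so a single call cannot return more than that many bytes. Packets are also limited in size, so a file will usually be transmitted across several packets.

What you need is to call recv() in a loop, reading repeatedly and writing what you get to your file.

The simple solution is then to have the server close the connection when the file is finished. The client notices this event, and you can open a new connection and repeat the process.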
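The close-when-done approach can be sketched as follows (Python 3). The function name recv_until_closed and its parameters are illustrative, not part of the original code, and the sketch assumes the server sends exactly one file and then closes its end of the connection:

```python
import socket


def recv_until_closed(sock, path, bufsize=4096):
    """Write everything the peer sends to path until it closes the connection."""
    total = 0
    with open(path, "wb") as out:
        while True:
            data = sock.recv(bufsize)
            if not data:
                # recv returns empty bytes once the peer has closed its end
                break
            out.write(data)
            total += len(data)
    return total
```

The return value is the number of bytes written. With this scheme the client cannot tell a complete file from a transfer that was cut off, which is one reason to prefer sending the size up front.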

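If you really want to reuse the same connection, the best approach is to transmit the size of the data up front and then start reading while counting bytes. You still need the read loop, though.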
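A rough sketch of reusing one connection for several files, under the assumption that each file is preceded by a 4-byte big-endian size (Python 3). send_file and recv_file are illustrative names, and the received bytes are streamed to disk instead of being collected in memory:

```python
import os
import socket
import struct


def send_file(sock, path, bufsize=4096):
    """Send a 4-byte size header, then the file contents in chunks."""
    sock.sendall(struct.pack("!I", os.path.getsize(path)))
    with open(path, "rb") as f:
        while True:
            chunk = f.read(bufsize)
            if not chunk:
                break
            sock.sendall(chunk)


def recv_file(sock, path, bufsize=4096):
    """Read one size-prefixed file from sock and stream it to path."""
    header = b""
    while len(header) < 4:
        # the header itself may arrive in pieces
        part = sock.recv(4 - len(header))
        if not part:
            raise ConnectionError("connection closed while reading the size header")
        header += part
    (size,) = struct.unpack("!I", header)
    received = 0
    with open(path, "wb") as out:
        while received < size:
            # never request more than what is left of this file
            data = sock.recv(min(bufsize, size - received))
            if not data:
                raise ConnectionError("connection closed before the whole file arrived")
            out.write(data)
            received += len(data)
    return received
```

Because recv_file never reads past the announced size, the next call on the same socket starts cleanly at the next header.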

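If you do not know the size of the data in advance, things get messy: you have to send a special value to mark the end of the file, but if that value also occurs inside the file you are in trouble, so you also need to apply some encoding to the file to ensure the special value never appears in it.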
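One way such an encoding could look is simple byte escaping, sketched below (Python 3). The marker values END and ESC and the function names encode and decode are hypothetical choices made for illustration; the answer does not prescribe any particular scheme:

```python
END = b"\x00"  # hypothetical end-of-file marker
ESC = b"\x01"  # hypothetical escape byte


def encode(payload):
    """Escape ESC and END inside payload, then append a bare END marker."""
    # ESC must be escaped first, otherwise the escapes added for END would be escaped again
    escaped = payload.replace(ESC, ESC + b"\x02").replace(END, ESC + b"\x03")
    return escaped + END


def decode(frame):
    """Reverse encode(); frame must end with the END marker."""
    body = frame[:-1]
    # undo the two substitutions in the opposite order
    return body.replace(ESC + b"\x03", END).replace(ESC + b"\x02", ESC)
```

After encoding, the END byte appears only as the final byte, so the receiver can read until it sees END without knowing the size in advance. The cost is that the data grows whenever it contains either special byte.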

Answer 3 (score: 0)

I converted @freakish's code into functions:

def convert_to_bytes(no):
    result = bytearray()
    result.append(no & 255)
    for i in range(3):
        no = no >> 8
        result.append(no & 255)
    return result

def bytes_to_number(b):
    # if Python2.x
    # b = map(ord, b)
    res = 0
    for i in range(4):
        res += b[i] << (i*8)
    return res

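def myReceive(sock):
    socksize = 1024

    # recv may return fewer than 4 bytes, so loop until the whole header arrives
    header = b""
    while len(header) < 4:
        chunk = sock.recv(4 - len(header))
        if not chunk:
            raise ConnectionError("connection closed before the size header arrived")
        header += chunk
    size = bytes_to_number(header) # 4-byte header: sizes must be below 4 GiB
    current_size = 0
    myBuffer = b""
    while current_size < size:
        # ask only for the bytes still missing, so no trimming is needed
        data = sock.recv(min(socksize, size - current_size))
        if not data:
            break
        myBuffer += data
        # you can stream here to disk
        current_size += len(data)
    return myBuffer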

def mySend(sock, data):
    length = len(data)
    sock.sendall(convert_to_bytes(length)) # has to be 4 bytes
    byte = 0
    while byte < length:
        # sendall keeps sending until the whole slice has gone out; send may stop early
        sock.sendall(data[byte:byte+1024])
        byte += 1024

Thanks to him. I used these functions in my project. He answered well.

Example tcpClient.py

import socket
import os
import sys  # needed for sys.exit() below

def convert_to_bytes(no):
    result = bytearray()
    result.append(no & 255)
    for i in range(3):
        no = no >> 8
        result.append(no & 255)
    return result

def bytes_to_number(b):
    # if Python2.x
    # b = map(ord, b)
    res = 0
    for i in range(4):
        res += b[i] << (i*8)
    return res

def mySend(sock, data):

    length = len(data)
    print("length: ", length)
    sock.sendall(convert_to_bytes(length)) # has to be 4 bytes
    byte = 0
    while byte < length:
        # sendall keeps sending until the whole slice has gone out; send may stop early
        sock.sendall(data[byte:byte+1024])
        byte += 1024

soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
host = "127.0.0.1"
port = 8888

try:
    soc.connect((host, port))
except OSError:  # connection refused, unreachable host, and similar socket errors
    print("Connection error")
    sys.exit()


filename = "/media/yilmaz/kuran.zip"

# stop early if the file is missing; otherwise data would be undefined below
if not os.path.exists(filename):
    raise SystemExit("File not found: " + filename)

with open(filename, 'rb') as infile:
    data = infile.read()


mySend(soc, data)


print("All data sent", flush=True)

Example tcpServer.py

import socket
import sys  # needed for sys.exc_info() and sys.exit() below

host = ''
port = 8888         # arbitrary non-privileged port

soc = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
soc.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

def convert_to_bytes(no):
    result = bytearray()
    result.append(no & 255)
    for i in range(3):
        no = no >> 8
        result.append(no & 255)
    return result

def bytes_to_number(b):
    # if Python2.x
    # b = map(ord, b)
    res = 0
    for i in range(4):
        res += b[i] << (i*8)
    return res

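def myReceive(sock):
    socksize = 1024

    # recv may return fewer than 4 bytes, so loop until the whole header arrives
    header = b""
    while len(header) < 4:
        chunk = sock.recv(4 - len(header))
        if not chunk:
            raise ConnectionError("connection closed before the size header arrived")
        header += chunk
    size = bytes_to_number(header) # 4-byte header: sizes must be below 4 GiB
    current_size = 0
    myBuffer = b""
    while current_size < size:
        # ask only for the bytes still missing, so no trimming is needed
        data = sock.recv(min(socksize, size - current_size))
        if not data:
            break
        myBuffer += data
        # you can stream here to disk
        current_size += len(data)

    return myBuffer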

print("Socket created",flush=True)

try:
    soc.bind((host, port))
except OSError:  # for example, the port is already in use
    print("Bind failed. Error : " + str(sys.exc_info()), flush=True)
    sys.exit()

soc.listen(5)       # queue up to 5 requests
print("Socket now listening", flush=True)



client_socket, address = soc.accept()

print("Connected", flush=True)

myBuffer = myReceive(client_socket)

print("All data received",flush=True)

with open("aaa", "wb") as f:
    f.write(myBuffer)

print("All data written",flush=True)

client_socket.close()  # close the accepted connection as well as the listening socket
soc.close()