Python中多线程和多处理的初学者指南

举

2020-04-29

关注关注

即将开播：4月29日，民生银行郭庆谈商业银行金融科技赋能的探索与实践

使用Python分析数据，如果使用了正确的数据结构和算法，有时可以大量提高程序的速度。实现此目的的一种方法是使用Muiltithreading(多线程)或Multiprocessing(多重处理)。

在这篇文章中，我们不会详细讨论多线程或多处理的内部原理。相反，我们举一个例子，编写一个小的Python脚本从Unsplash下载图像。我们将从一次下载一个图像的版本开始。接下来，我们使用线程来提高执行速度。

Python中多线程和多处理的初学者指南

多线程

简单地说，线程允许您并行地运行程序。花费大量时间等待外部事件的任务通常适合线程化。它们也称为I/O Bound任务例如从文件中读写，网络操作或使用API在线下载。让我们来看一个示例，它展示了使用线程的好处。

1. 没有线程

在本例中，我们希望通过顺序运行程序来查看从Unsplash API下载15张图像需要多长时间：

import requests 
import time 
img_urls = [ 
    'https://images.unsplash.com/photo-1516117172878-fd2c41f4a759', 
    'https://images.unsplash.com/photo-1532009324734-20a7a5813719', 
    'https://images.unsplash.com/photo-1524429656589-6633a470097c', 
    'https://images.unsplash.com/photo-1530224264768-7ff8c1789d79', 
    'https://images.unsplash.com/photo-1564135624576-c5c88640f235', 
    'https://images.unsplash.com/photo-1541698444083-023c97d3f4b6', 
    'https://images.unsplash.com/photo-1522364723953-452d3431c267', 
    'https://images.unsplash.com/photo-1513938709626-033611b8cc03', 
    'https://images.unsplash.com/photo-1507143550189-fed454f93097', 
    'https://images.unsplash.com/photo-1493976040374-85c8e12f0c0e', 
    'https://images.unsplash.com/photo-1504198453319-5ce911bafcde', 
    'https://images.unsplash.com/photo-1530122037265-a5f1f91d3b99', 
    'https://images.unsplash.com/photo-1516972810927-80185027ca84', 
    'https://images.unsplash.com/photo-1550439062-609e1531270e', 
    'https://images.unsplash.com/photo-1549692520-acc6669e2f0c' 
] 
 
start = time.perf_counter() #start timer 
for img_url in img_urls: 
    img_name = img_url.split('/')[3] #get image name from url 
    img_bytes = requests.get(img_url).content 
with open(img_name, 'wb') as img_file: 
     img_file.write(img_bytes) #save image to disk  
 
finish = time.perf_counter() #end timer 
print(f"Finished in {round(finish-start,2)} seconds")  
 
#results 
Finished in 23.101926751 seconds

一共用时23秒。

2. 多线程

让我们看看Pyhton中的线程模块如何显著地改进我们的程序执行：

import time 
from concurrent.futures import ThreadPoolExecutor 
 
def download_images(url): 
    img_name = img_url.split('/')[3] 
    img_bytes = requests.get(img_url).content 
    with open(img_name, 'wb') as img_file: 
         img_file.write(img_bytes) 
         print(f"{img_name} was downloaded") 
 
start = time.perf_counter() #start timer 
with ThreadPoolExecutor() as executor: 
    results = executor.map(download_images,img_urls) #this is Similar to map(func, *iterables) 
finish = time.perf_counter() #end timer 
print(f"Finished in {round(finish-start,2)} seconds") 
 
#results  
Finished in 5.544147536 seconds

我们可以看到，与不使用线程代码相比，使用线程代码可以显著提高速度。从23秒到5秒。

多线程 python python多线程线程编程语言

安科网

Python中多线程和多处理的初学者指南

举

即将开播：4月29日，民生银行郭庆谈商业银行金融科技赋能的探索与实践

举

相关推荐

多线程真的比单线程快？

多线程中如何使用gdb精确定位死锁问题

Python多线程

python 多线程 QTimer实现多线程

Python-多线程

第54天：Python 多线程 Event

第53天： Python 线程池

第49天：Python 多线程之 threading 模块

Python多线程之死锁

Python中的多线程如何正确运用？案例详解

Python 多线程

Python中的多处理与多线程：新手简介

多线程默认情况,守护线程及join对子线程运行的影响

python多线程实现方式，最基础的实现方式模块是什么

代码详解Python多线程、多进程、协程

Java+Linux，深入内核源码讲解多线程之进程

深究 Linux 多线程中的信号量 Semaphore

Linux平台服务器多线程开发（一）

Linux平台服务器多线程开发（一）

C语言多线程

C++并发编程实战：如何为多线程性能设计数据结构？

C# 多线程的阻塞和继续

举