Almost every Python lover knows how to send a request:
```python
import requests

response = requests.get(url, headers=headers, proxies=proxies, timeout=timeout)
```
and most of them use multithreading to send requests concurrently, either with plain threading or with ThreadPoolExecutor:
```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=100)
for response in executor.map(send_request, urls):
    results.append(response)
```
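As a concrete, runnable illustration of that pattern (send_request, the URLs, and the 0.1 s delay are made up for the example, with time.sleep standing in for a real network round trip):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def send_request(url):
    # Stand-in for requests.get(url, ...): blocks the worker thread
    # for 0.1 s the way a real network round trip would.
    time.sleep(0.1)
    return f"response from {url}"

urls = [f"https://example.com/{i}" for i in range(50)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=50) as executor:
    results = list(executor.map(send_request, urls))
elapsed = time.perf_counter() - start

# With 50 workers, all 50 "requests" block at the same time, so the
# whole batch takes roughly 0.1 s instead of ~5 s sequentially.
print(len(results), round(elapsed, 2))
```

executor.map keeps the results in the same order as the input URLs, which is why collecting them into a list just works.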
But have you ever tried asyncio?
The difference
Let’s talk about the difference.
You can benefit from concurrency with both ThreadPoolExecutor and asyncio, but the main difference is how they implement it.
A computer has limits. ThreadPoolExecutor uses system threads, which are allocated and scheduled by the operating system, so even if you pass an enormous number for max_workers, the thread count is still capped by those limits: a system can typically run at most a few thousand threads.
On the other hand, a single asyncio event loop runs in one thread and can handle thousands of coroutines at the same time.
```python
async with aiohttp.ClientSession() as session:
    tasks = []
    for url in urls:
        tasks.append(asyncio.ensure_future(send_request(url, session)))
    responses = await asyncio.gather(*tasks)
```
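To see why scheduling everything up front pays off, here is a minimal, runnable sketch of the same pattern, with asyncio.sleep standing in for the aiohttp network call (send_request, the URLs, and the 0.1 s delay are assumptions for illustration):

```python
import asyncio
import time

async def send_request(url):
    # Stand-in for an aiohttp call: yields control to the event loop
    # for 0.1 s instead of waiting on a real socket.
    await asyncio.sleep(0.1)
    return f"response from {url}"

async def main(urls):
    # Schedule every coroutine up front, then await them all at once.
    tasks = [asyncio.ensure_future(send_request(url)) for url in urls]
    return await asyncio.gather(*tasks)

urls = [f"https://example.com/{i}" for i in range(100)]
start = time.perf_counter()
responses = asyncio.run(main(urls))
elapsed = time.perf_counter() - start

# All 100 waits overlap on one thread, so the batch finishes in
# roughly 0.1 s rather than the ~10 s a sequential loop would need.
print(len(responses), round(elapsed, 2))
```

Because asyncio.gather preserves input order, responses line up with urls the same way executor.map results do.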
But how can you benefit from this?
Since asyncio lets you keep many more requests in flight at the same time, if:
- the target system is fast enough to generate and return responses,
- the system you send requests to doesn't throttle you,
- or it is dedicated to you or your system,

then you can send thousands of requests and process the responses at the same time. Use this for the request-heavy duties in your code and you can achieve 5x or more request-response throughput.
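And if the target system does throttle you, you can still use asyncio and simply cap how many requests are in flight at once. A small sketch using asyncio.Semaphore (the limit of 10, the coroutine, and asyncio.sleep as a stand-in for the real request are all illustrative assumptions):

```python
import asyncio

async def send_request(url, semaphore):
    # At most `limit` coroutines get past this line at the same time;
    # the rest wait for a slot before "sending" their request.
    async with semaphore:
        await asyncio.sleep(0.01)  # stand-in for the real aiohttp call
        return url

async def main(urls, limit=10):
    semaphore = asyncio.Semaphore(limit)
    return await asyncio.gather(*(send_request(u, semaphore) for u in urls))

urls = [f"https://example.com/{i}" for i in range(40)]
results = asyncio.run(main(urls))
print(len(results))
```

This way you keep the single-threaded event loop but stay under whatever rate limit the remote system enforces.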