libcurl ffi bindings for Python, with impersonation support
Python binding for curl-impersonate via cffi. Unlike other pure Python HTTP clients like httpx or requests, curl_cffi can impersonate browsers' TLS signatures or JA3 fingerprints. If you are blocked by some website for no obvious reason, you can give this package a try.
curl_cffi also supports asyncio, with proxy rotation on each request.

| library | requests | aiohttp | httpx | pycurl | curl_cffi |
| --- | --- | --- | --- | --- | --- |
| http2 | ❌ | ❌ | ✅ | ✅ | ✅ |
| sync | ✅ | ❌ | ✅ | ✅ | ✅ |
| async | ❌ | ✅ | ✅ | ❌ | ✅ |
| fingerprints | ❌ | ❌ | ❌ | ❌ | ✅ |
| speed | 🐇 | 🐇🐇 | 🐇 | 🐇🐇 | 🐇🐇 |
```
pip install curl_cffi --upgrade
```
This should work on Linux (x86_64/aarch64), macOS (Intel/Apple Silicon), and Windows (amd64). If it does not work on your platform, you may need to compile and install curl-impersonate first and set some environment variables like `LD_LIBRARY_PATH`.
To install beta releases:
```
pip install curl_cffi --pre
```
```python
from curl_cffi import requests

# Notice the impersonate parameter
r = requests.get("https://tls.browserleaks.com/json", impersonate="chrome110")

print(r.json())
# output: {..., "ja3n_hash": "aa56c057ad164ec4fdcb7a5a283be9fc", ...}
# the ja3n fingerprint should be the same as the target browser's

# http/socks proxies are supported
proxies = {"https": "http://localhost:3128"}
r = requests.get("https://tls.browserleaks.com/json", impersonate="chrome110", proxies=proxies)

proxies = {"https": "socks://localhost:3128"}
r = requests.get("https://tls.browserleaks.com/json", impersonate="chrome110", proxies=proxies)
```
```python
# sessions are supported
s = requests.Session()

# httpbin is an HTTP test website
s.get("https://httpbin.org/cookies/set/foo/bar")
print(s.cookies)
# <Cookies[<Cookie foo=bar for httpbin.org />]>

r = s.get("https://httpbin.org/cookies")
print(r.json())
# {'cookies': {'foo': 'bar'}}
```
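Impersonation also works per request inside a session; a minimal sketch (the `impersonate` kwarg mirrors the `requests.get` calls above, and httpbin's `/headers` endpoint echoes back the request headers it received):

```python
from curl_cffi import requests

s = requests.Session()
# the User-Agent echoed back should match the impersonated browser
r = s.get("https://httpbin.org/headers", impersonate="chrome110")
print(r.json()["headers"]["User-Agent"])
```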
The supported impersonate versions are the ones supported by your installed curl-impersonate; see its documentation for the full list.
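For instance, you can try several targets in a loop; a minimal sketch ("chrome110" is used throughout this README, while "safari15_5" is assumed here to be one of curl-impersonate's targets):

```python
from curl_cffi import requests

# "safari15_5" is an assumed target name; check curl-impersonate's target list
for target in ["chrome110", "safari15_5"]:
    r = requests.get("https://tls.browserleaks.com/json", impersonate=target)
    print(target, r.json()["ja3n_hash"])
```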
asyncio is also supported:

```python
from curl_cffi.requests import AsyncSession

# inside a coroutine (or any running asyncio event loop):
async with AsyncSession() as s:
    r = await s.get("https://example.com")
```
More concurrency:

```python
import asyncio
from curl_cffi.requests import AsyncSession

urls = [
    "https://google.com/",
    "https://facebook.com/",
    "https://twitter.com/",
]

async def main():
    async with AsyncSession() as s:
        tasks = []
        for url in urls:
            task = s.get(url)  # not awaited yet, so gather can run them concurrently
            tasks.append(task)
        results = await asyncio.gather(*tasks)

asyncio.run(main())
```
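The proxy rotation mentioned in the feature list combines naturally with this pattern; a minimal sketch, assuming the per-request `proxies` kwarg from the sync examples above and two hypothetical local proxies:

```python
import asyncio
import itertools

from curl_cffi.requests import AsyncSession

# hypothetical proxy pool; replace with your real proxy URLs
proxy_pool = itertools.cycle([
    "http://localhost:3128",
    "http://localhost:3129",
])

async def main():
    async with AsyncSession() as s:
        tasks = [
            s.get(url, proxies={"https": next(proxy_pool)})  # new proxy per request
            for url in ["https://example.com", "https://example.org"]
        ]
        return await asyncio.gather(*tasks)

results = asyncio.run(main())
```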
Alternatively, you can use the low-level curl-like API:
```python
from io import BytesIO
from curl_cffi import Curl, CurlOpt

buffer = BytesIO()
c = Curl()
c.setopt(CurlOpt.URL, b"https://tls.browserleaks.com/json")
c.setopt(CurlOpt.WRITEDATA, buffer)  # write the response body into the buffer
c.impersonate("chrome110")
c.perform()
c.close()

body = buffer.getvalue()
print(body.decode())
```
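The low-level API can send request bodies as well; a minimal sketch, assuming `CurlOpt` mirrors libcurl's `CURLOPT_POSTFIELDS` option:

```python
from io import BytesIO
from curl_cffi import Curl, CurlOpt

buffer = BytesIO()
c = Curl()
c.setopt(CurlOpt.URL, b"https://httpbin.org/post")
c.setopt(CurlOpt.POSTFIELDS, b"foo=bar")  # assumes CURLOPT_POSTFIELDS is exposed
c.setopt(CurlOpt.WRITEDATA, buffer)
c.impersonate("chrome110")
c.perform()
c.close()
print(buffer.getvalue().decode())
```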
See the docs for more details.
If you are using Scrapy, check out this middleware: tieyongjie/scrapy-fingerprint