Linux Performance Tuning: Fixing the "Too Many Open Files" error
If you are scaling a web scraper, a high-traffic API, or a database cluster, you will eventually hit the dreaded OSError: [Errno 24] Too many open files.
In Linux, "everything is a file." A TCP connection to a proxy, a log file, a database socket—they all consume a file descriptor (FD). By default, most Linux distributions cap a process at 1,024 open descriptors. For high-throughput data engineering, this limit is laughably low. Here is how to raise it permanently.
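To see the failure mode in isolation, you can lower the soft limit inside a single Python process and watch it run out of descriptors. This is a self-contained sketch using the standard resource module; it lowers the limit for this process only and does not touch system configuration.

```python
import errno
import resource
import tempfile

# Save the real limits, then lower the soft limit for this process only.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (32, hard))

handles = []
caught_errno = None
try:
    while True:
        # Every open file (or socket) consumes one descriptor.
        handles.append(tempfile.TemporaryFile())
except OSError as exc:
    caught_errno = exc.errno  # EMFILE == 24: "Too many open files"
    print(f"Crashed after {len(handles)} open files: [Errno {exc.errno}]")
finally:
    for h in handles:
        h.close()
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))
```

The crash arrives well before 32 files because the interpreter's own descriptors (stdin, stdout, stderr, and friends) count against the same limit.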
1. Diagnosing the Limit
Before applying fixes, check your current limits. Run this command as the user running your application (not just as root).
# Check Soft Limit (The active limit)
ulimit -n
# Check Hard Limit (The maximum cap allowed)
ulimit -Hn
If the output is 1024, your data pipeline will crash well before its 1,024th concurrent connection: stdin, stdout, stderr, and every open log file already count against the same limit.
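The same numbers are visible from inside a process. In Python, for example, the standard resource module's RLIMIT_NOFILE maps directly to the soft and hard values that ulimit reports:

```python
import resource

# RLIMIT_NOFILE corresponds to `ulimit -n` (soft) and `ulimit -Hn` (hard).
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft={soft} hard={hard}")
```

This is handy for logging the effective limit at application startup, before the first "Too many open files" surprises you in production.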
2. The Permanent User Fix
To make the change persist across reboots for a specific user (or all users), you need to edit the security limits configuration.
Open the file:
sudo nano /etc/security/limits.conf
Add the following lines at the end of the file. The wildcard * applies to all users. If you are running your app as a specific user (e.g., ubuntu or deploy), replace the * with that username.
# /etc/security/limits.conf
* soft nofile 65535
* hard nofile 65535
root soft nofile 65535
root hard nofile 65535
Soft vs. Hard Limits
The Soft Limit is the actual current limit. The Hard Limit is the ceiling that a user can raise their own soft limit to. We usually set both to 65,535 for servers.
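One practical consequence of the soft/hard split: a process can raise its own soft limit up to the hard limit without any special privileges. A minimal sketch in Python, again using the standard resource module:

```python
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
# Raising the soft limit up to the hard limit needs no root privileges;
# raising the *hard* limit beyond its current value does.
resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
new_soft, new_hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"soft limit: {soft} -> {new_soft} (hard cap: {new_hard})")
```

Some servers (Nginx, for instance, via worker_rlimit_nofile) do exactly this at startup, which is why setting a generous hard limit matters even when the default soft limit looks low.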
3. The System-Wide Cap
Raising the user limit doesn't help if the operating system kernel has a global cap. For massive scraping clusters, verify the system-wide maximum.
# Check current system max
cat /proc/sys/fs/file-max
If this is low, edit /etc/sysctl.conf and add:
fs.file-max = 2097152
Apply the changes immediately:
sudo sysctl -p
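You can also watch system-wide descriptor usage against that cap through /proc. A short sketch (Linux-only paths):

```python
# System-wide file descriptor accounting lives under /proc/sys/fs/.
with open("/proc/sys/fs/file-max") as f:
    file_max = int(f.read().strip())

# file-nr reports three numbers: allocated, unused, and the maximum.
with open("/proc/sys/fs/file-nr") as f:
    allocated, unused, maximum = (int(x) for x in f.read().split())

print(f"in use: {allocated} of {file_max} system-wide")
```

If `allocated` is creeping toward `file_max` across the whole machine, no amount of per-user tuning will save you; the kernel cap is the bottleneck.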
4. The "Systemd" Trap (Crucial)
This is where most deployments get stuck. /etc/security/limits.conf is applied by PAM during login sessions, and systemd services (like Gunicorn, Nginx, or Docker) never go through PAM, so they ignore it entirely. They have their own limit directives.
If you are running your app as a service, you must edit the service unit file.
sudo systemctl edit your-service-name
Add this override block:
[Service]
LimitNOFILE=65535
Then reload the daemon and restart the service:
sudo systemctl daemon-reload
sudo systemctl restart your-service-name
5. Verification
Don't just assume it worked. Check the limits of the running process. Find the Process ID (PID) of your application and inspect its specific limits.
# Find PID
ps aux | grep python
# Check limits (Replace 1234 with actual PID)
cat /proc/1234/limits | grep "Max open files"
You should see:
Max open files            65535                65535                files
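If the application is written in Python, you can cross-check from inside the process itself: /proc/self/limits and resource.getrlimit describe the same process and should agree.

```python
import resource

# /proc/<pid>/limits shows the limits a running process actually has;
# /proc/self/ refers to the current process.
with open("/proc/self/limits") as f:
    for line in f:
        if line.startswith("Max open files"):
            parts = line.split()
            proc_soft, proc_hard = int(parts[3]), int(parts[4])

# Cross-check against the resource module's view of the same process.
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"/proc says {proc_soft}/{proc_hard}, getrlimit says {soft}/{hard}")
```

A mismatch between what you configured and what this prints almost always means the systemd override from section 4 was skipped.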
Scaling beyond a single server?
When 65,000 connections aren't enough, it's time to distribute. Contact Comquest engineering to architect horizontally scalable data systems.