Okay, so your boss calls you at 8 p.m. to tell you that the application has stopped. You log in to your machine and you don't know what happened. Then, while checking the logs, you find the culprit:
xxxx xxx Too many open files.
#1 Understanding the problem
If you are not familiar with Linux, you may be confused. Everything in Linux is treated as a file, even your keyboard. To access those files, file descriptors are used.
A file descriptor (FD) is a number that uniquely identifies an open file within a process. In a Unix-like system, a file descriptor can refer to any file-like resource, for example:
- Regular files
- Unix domain sockets
- Named pipes
- Anonymous pipes
- Network sockets
#2 Possible causes
- Other processes running on your OS are leaking resources, i.e. they open too many streams that are never closed
- Your process is leaking resources
- Too many unclosed connections were opened on your server
- The limit on the number of open file descriptors is too low
To get the process ID of your application:
ps aux | grep yourAppName
To check the limits set for your process:
cat /proc/{processID}/limits
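The output is a table of per-process limits; the row to look at here is Max open files. An illustrative excerpt (the values below are placeholders and vary by system):
Limit                     Soft Limit           Hard Limit           Units
Max open files            1024                 4096                 files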
When you encounter this problem, you should immediately look at the output of the lsof command:
man lsof
In the absence of any options, lsof lists all open files belonging to all active processes. If any list request option is specified, other list requests must be specifically requested - e.g., if -U is specified for the listing of UNIX socket files, NFS files won't be listed unless -N is also specified; or if a user list is specified with the -u option, UNIX domain socket files, belonging to users not in the list, won't be listed unless the -U option is also specified.
From the lsof output you can usually tell which application is causing your problems. Note the ID of that process.
To check how many file descriptors your process has opened, you can use the following command:
lsof -a -p {processID} | wc -l
Then, if you need to identify every file descriptor opened by your application:
lsof -a -p {processID}
To check the user limit on open file descriptors:
ulimit -n
This returns the maximum number of open file descriptors allowed for the current user (the soft limit).
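Bash's ulimit built-in can also report the soft and hard limits separately:
ulimit -Sn   # soft limit on open file descriptors
ulimit -Hn   # hard limit on open file descriptors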
More information on limits is stored in the following file: /etc/security/limits.conf
As you can read in the first two commented lines:
# This file sets the resource limits for the users logged in via PAM.
# It does not affect resource limits of the system services.
More about limits.conf can be found in the limits.conf(5) man page.
#3 Fixing
Basically, there are only two causes of this problem, but the second one is hard to track down in big projects.
#3.1 FD limit is too low
If your application is running on a web server, there may be more connections opened to your server than your Linux distribution allows. You can increase the hard and soft limits in the following way:
User limit:
To change user limits, edit the configuration file:
vi /etc/security/limits.conf
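For example, to raise the limits for a single user you could add lines like the following (youruser and the numbers are placeholders, adjust them to your needs):
youruser    soft    nofile    4096
youruser    hard    nofile    65535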
System-wide limit:
sysctl -w fs.file-max=100000
echo "fs.file-max = 100000" >> /etc/sysctl.conf
Then log the user out and back in so that the new user limits take effect (the sysctl change itself applies immediately).
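To verify that the system-wide limit was applied, you can read the value back:
cat /proc/sys/fs/file-max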
#3.2 App is leaking resources (not closing streams)
If this is the problem, you need to close the streams opened by your app. If you encountered this problem in an application written in Java, your resources may be leaking because streams are never closed. You can use the try-with-resources statement (not exactly new at the time of writing), which automatically closes the resources declared in it, as shown in the sketch below.
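Here is a minimal sketch of try-with-resources (example.txt is just a placeholder file name). The reader is declared in the try header, so it is closed automatically when the block exits, even if an exception is thrown:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadFile {
    public static void main(String[] args) throws IOException {
        // The reader is closed automatically at the end of the try block,
        // so its file descriptor is always released.
        try (BufferedReader reader = new BufferedReader(new FileReader("example.txt"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}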
PS:
- Check out more information about hard limit and soft limit settings in Linux.
If you have any questions or any tips on what to look for when this problem happens, feel free to leave a comment down below!
Sources:
https://ss64.com/bash/ulimit.html
https://linux.die.net/man/8/lsof
https://www.cyberciti.biz/faq/linux-find-all-file-descriptors-used-by-a-process/
https://www.cyberciti.biz/faq/linux-increase-the-maximum-number-of-open-files/