MySQL reports "Too many connections": stop misusing ulimit and see how to correctly change a process's maximum number of open files

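Background

While studying MySQL today, I came across a case in which the client reported "Too many connections". The client's connection pool was capped at 200, though, and there were two client Java processes, so 400 connections at most, while MySQL was configured to allow 800.

MySQL was configured in my.cnf like this: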

[ CAD_OneKeyDeploy]# vim /etc/my.cnf

[mysqld]
datadir = /var/lib/mysql
socket = /var/lib/mysql/mysql.sock
max_connections=800
symbolic-links = 0

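That shouldn't happen: I open at most 400 connections, the database allows up to 800, and yet it reports "Too many connections".

Then this was run on MySQL:

SHOW VARIABLES LIKE 'max_connections';

The result was around 200, which means the setting had not taken effect.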

It turned out the MySQL startup log contained:

Could not increase number of max_open_files to more than mysqld (request: 65535)
Changed limits: max_connections: 214 (requested 2000)
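
The figure 214 in that log is not arbitrary. As far as I know, when MySQL 5.7 cannot obtain the file descriptors it asked for, it recomputes max_connections from the limit it actually got, reserving 10 descriptors plus 2 × 400 for a minimal table cache. The sketch below reproduces that arithmetic. The limit of 1024 is an assumption (a common default soft limit), since the log above does not show the real number, and the function and constant names are my own.

```python
# Sketch of how MySQL 5.7 appears to shrink max_connections when the
# open-files limit is too low. The constants reflect my reading of the
# server's behavior; treat them as assumptions, not as an official API.

RESERVED_FDS = 10           # descriptors kept for the server's own use
TABLE_OPEN_CACHE_MIN = 400  # minimum table cache, 2 descriptors per table


def effective_max_connections(requested: int, open_files_limit: int) -> int:
    """Return the max_connections value that fits into open_files_limit."""
    ceiling = open_files_limit - RESERVED_FDS - 2 * TABLE_OPEN_CACHE_MIN
    return min(requested, ceiling)


if __name__ == "__main__":
    # Assumed limit of 1024 open files; 2000 connections requested, as in the log.
    print(effective_max_connections(2000, 1024))
```

With the assumed limit of 1024 this gives 1024 - 10 - 800 = 214, which matches the log line, so the open-files limit is what actually needs fixing.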

The author of that case study then made these changes:

ulimit -HSn 65535
vim /etc/security/limits.conf

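I followed the same steps and found that it does not work that way.

In other words, I did exactly what the post said and nothing changed. So what is really going on?

I have collected some background material below. Read through it first; at the end I will show how to correctly set the maximum number of open files for a process.

A process's maximum number of open files is affected by several things:

Processes started manually from a login shell

The factors are:

  • the system-wide total of files that all processes together may open (controlled by /proc/sys/fs/file-max);
  • the maximum number of files the shell's user may open (controlled by /etc/security/limits.conf);
  • the process's own limit on open files (which can be set through APIs the OS provides, such as setrlimit).

Processes started automatically at boot

The factors are:

  • the system-wide total of files that all processes together may open (controlled by /proc/sys/fs/file-max);
  • if the process is started by systemd, the limits set in its systemd service file.

That is all I know so far. As far as I know, shells are also divided into login and non-login shells; I don't understand that part well, so I'll skip it for now.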

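Viewing the resource limits of a running process

After making a change, you can check whether it took effect like this:

Start the process, then read the limits that are actually in effect for it: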

[ ~]# cat /proc/6660/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             3795                 3795                 processes 
----------------------------- see the next line
Max open files            5000                 5000                 files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes     
Max file locks            unlimited            unlimited            locks     
Max pending signals       3795                 3795                 signals   
Max msgqueue size         819200               819200               bytes     
Max nice priority         0                    0                    
Max realtime priority     0                    0                    
Max realtime timeout      unlimited            unlimited            us
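
If you want to check this from a script rather than by eye, the output is easy to parse. A minimal sketch, assuming the column layout shown above (columns separated by runs of two or more spaces); the function name parse_limits is my own:

```python
import re
from pathlib import Path


def parse_limits(text: str) -> dict:
    """Parse the contents of /proc/<pid>/limits into {name: (soft, hard, units)}."""
    limits = {}
    for line in text.splitlines()[1:]:  # skip the header row
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 3:
            continue  # blank line or anything that is not a limit row
        name, soft, hard = fields[0], fields[1], fields[2]
        units = fields[3] if len(fields) > 3 else ""
        limits[name] = (soft, hard, units)
    return limits


if __name__ == "__main__":
    path = Path("/proc/self/limits")  # Linux only; use /proc/<pid>/limits for another process
    if path.exists():
        print(parse_limits(path.read_text())["Max open files"])
```

Values are kept as strings because a limit can also be the word "unlimited".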

OS level: how to view and set the limit

You can view it with this command:

[ ~]# cat  /proc/sys/fs/file-max
95086

This just reads a file under /proc. So what does this file mean? Check the manual:

man proc
...
/proc/sys/fs/file-max

This file defines a system-wide limit on the number of open files for all processes.  (See also setrlimit(2), which can be used by a process to set the per-process limit, RLIMIT_NOFILE, on the number of files  it  may open.)  If you get lots of error messages in the kernel log about running out of file handles (look for "VFS: file-max limit <number> reached"), try increasing this value:

echo 100000 > /proc/sys/fs/file-max

The kernel constant NR_OPEN imposes an upper limit on the value that may be placed in file-max.

If you increase /proc/sys/fs/file-max, be sure to increase /proc/sys/fs/inode-max to 3-4 times the new value of /proc/sys/fs/file-max, or you will run out of inodes.

Privileged processes (CAP_SYS_ADMIN) can override the file-max limit.

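In short, this file defines a system-wide limit on the number of open files, summed over all processes.

For the per-process limit on the number of open files, see RLIMIT_NOFILE in setrlimit.

To change it:

echo 100000 > /proc/sys/fs/file-max

The new value takes effect immediately but is lost on reboot. To make it persistent, add fs.file-max = 100000 to /etc/sysctl.conf and run sysctl -p.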
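
Before raising file-max, it helps to know whether the system-wide limit is anywhere near being hit. The same man page describes /proc/sys/fs/file-nr (not used elsewhere in this post), which holds three numbers: allocated file handles, free allocated handles, and the maximum. A small sketch that parses it; the function name and the returned keys are my own:

```python
from pathlib import Path


def parse_file_nr(text: str) -> dict:
    """Parse /proc/sys/fs/file-nr: '<allocated> <unused> <max>'."""
    allocated, unused, maximum = (int(x) for x in text.split())
    return {
        "allocated": allocated,
        "unused": unused,
        "max": maximum,
        "usage": allocated / maximum,  # fraction of the system-wide limit in use
    }


if __name__ == "__main__":
    path = Path("/proc/sys/fs/file-nr")  # Linux only
    if path.exists():
        print(parse_file_nr(path.read_text()))
```

If usage is far below 1, the system-wide limit is probably not what is limiting your process, and the per-user or per-service limits discussed in this post are the place to look.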

Setting resource limits through the API when coding

See the man page:

man setrlimit

GETRLIMIT(2)                Linux Programmer's Manual                GETRLIMIT(2)

NAME
       getrlimit, setrlimit, prlimit - get/set resource limits

SYNOPSIS
       #include <sys/time.h>
       #include <sys/resource.h>

       int getrlimit(int resource, struct rlimit *rlim);
       int setrlimit(int resource, const struct rlimit *rlim);

       The getrlimit() and setrlimit() system calls get and set resource limits respectively.  Each resource has an associated soft and hard limit, as defined by the rlimit structure:

           struct rlimit {
               rlim_t rlim_cur;  /* Soft limit */
               rlim_t rlim_max;  /* Hard limit (ceiling for rlim_cur) */
           };

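Note that these calls all apply to the calling process. If you program in C, this is essentially the interface you deal with directly. Each resource has its own constant; the one that matters for this article is RLIMIT_NOFILE, and the man page describes the others in the same style, for example: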

RLIMIT_NPROC

The maximum number of processes (or, more precisely on Linux, threads) that can be created for the real user ID of the calling process.  Upon encountering this limit, fork(2) fails with the error EAGAIN.

I have seen calls to this API in the Redis source code.
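
To make the API concrete without writing C, here is a minimal sketch using Python's standard resource module, which wraps getrlimit() and setrlimit(). The helper name raise_soft_nofile is my own; it only ever raises the soft limit and never goes above the hard limit, which is what an unprivileged process is allowed to do.

```python
import resource


def raise_soft_nofile(target: int) -> int:
    """Raise this process's soft RLIMIT_NOFILE toward target, capped by the hard limit.

    Returns the soft limit in effect afterwards. Never lowers the limit.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to do
    ceiling = target if hard == resource.RLIM_INFINITY else min(target, hard)
    if ceiling > soft:
        # An unprivileged process may move its soft limit up to the hard limit.
        resource.setrlimit(resource.RLIMIT_NOFILE, (ceiling, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    # Show the (soft, hard) pair currently in effect for this process.
    print(resource.getrlimit(resource.RLIMIT_NOFILE))
```

A server would typically call something like raise_soft_nofile(65535) once at startup. The change affects only the calling process and the children it starts afterwards.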

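Resource limits per user/group (after a change, they apply to processes started after logging in to a new shell)

To view:

vim /etc/security/limits.conf

To change them, add the following two lines to that file. The leading wildcard matches any user and group: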

* soft nofile 65535
* hard nofile 65535

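This sets the maximum number of open files for every user to 65535.

I also looked at the Elasticsearch documentation, because I remembered it being fairly fiddly and needing these changes: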

https://www.elastic.co/guide/en/elasticsearch/reference/master/setting-system-settings.html#ulimit

It contains this passage:

On Linux systems, persistent limits can be set for a particular user by editing the /etc/security/limits.conf file. To set the maximum number of open files for the elasticsearch user to 65,535, add the following line to the limits.conf file:

elasticsearch  -  nofile  65535

This change will only take effect the next time the elasticsearch user opens a new session.

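In other words, on Linux, persistent limits for a particular user can be set in /etc/security/limits.conf.

For example, to set the maximum number of open files for the elasticsearch user to 65535, add this line:

elasticsearch  -  nofile  65535

Note that it also says the change only takes effect the next time the elasticsearch user opens a new session.

That Elasticsearch document is really good and worth a read.

So this is a shell-level setting: after changing the file you must log in to a new shell for it to take effect, and it only applies to processes started from that shell.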
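
The two notations seen above (separate soft and hard lines versus a single "-" line) mean the same thing: in limits.conf, a type of "-" sets both the soft and the hard limit. The sketch below parses such lines and expands "-" so the equivalence is visible. It is a simplified reader for illustration only; the function name is my own, and it ignores everything pam_limits does beyond the four-column format.

```python
def parse_limits_conf(text: str) -> list:
    """Parse limits.conf-style lines into (domain, type, item, value) tuples.

    A type of "-" sets both the soft and the hard limit, so it is expanded
    into two entries.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) != 4:
            continue  # not a well-formed limit line
        domain, kind, item, value = fields
        kinds = ("soft", "hard") if kind == "-" else (kind,)
        entries.extend((domain, k, item, value) for k in kinds)
    return entries


if __name__ == "__main__":
    sample = "* soft nofile 65535\n* hard nofile 65535\nelasticsearch  -  nofile  65535\n"
    for entry in parse_limits_conf(sample):
        print(entry)
```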

A simple test

Change the file to:

* soft nofile 6666
* hard nofile 6666

Then start a process listening on port 1235:

[ ~]# nc -l 1235

Then, in another shell, look at that process's resource limits:

[ ~]#       netstat -nltp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:1235            0.0.0.0:*               LISTEN      7015/nc             
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      6632/sshd           
tcp6       0      0 :::3306                 :::*                    LISTEN      6670/mysqld         
tcp6       0      0 :::1235                 :::*                    LISTEN      7015/nc             
tcp6       0      0 :::22                   :::*                    LISTEN      6632/sshd           
[ ~]# cat /proc/7015/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             3795                 3795                 processes 
Max open files            65535                65535                files

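The last line shows that the open-file limit is still 65535, the old value, because this shell was logged in before the file was changed.

Now close the shell and log in again. This time the new /etc/security/limits.conf is applied, setting the limit to 6666. Then start a process: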

[ ~]# nc -l 1236

Check that process from another shell:

[ ~]#       netstat -nltp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:1236            0.0.0.0:*               LISTEN      7089/nc             
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      6632/sshd           
tcp6       0      0 :::3306                 :::*                    LISTEN      6670/mysqld         
tcp6       0      0 :::1236                 :::*                    LISTEN      7089/nc             
tcp6       0      0 :::22                   :::*                    LISTEN      6632/sshd           
[ ~]# cat /proc/7089/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             3795                 3795                 processes 
Max open files            6666                 6666                 files

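As you can see, it is now 6666.

To summarize: with this approach, after changing the file you need to log in to a new shell for it to take effect, and only processes started from that shell get the new limit.

The ulimit approach (not recommended)

Note that this approach only affects the current shell, and only processes started after the change.

For example, in shell 1 we change the limit: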

[ ~]# ulimit -HSn 9999
[ ~]# nc -l 1234

That started a process listening on port 1234.

Now look at that process's resource limits:

[ ~]#       netstat -nltp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:1234            0.0.0.0:*               LISTEN      6982/nc             
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      6632/sshd           
tcp6       0      0 :::3306                 :::*                    LISTEN      6670/mysqld         
tcp6       0      0 :::1234                 :::*                    LISTEN      6982/nc             
tcp6       0      0 :::22                   :::*                    LISTEN      6632/sshd           
[ ~]# cat /proc/6982/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             3795                 3795                 processes 
Max open files            9999                 9999                 files

Then, after closing that shell and logging in again, I ran:

[ ~]# nc -l 1234

Checking again:

[ ~]#       netstat -nltp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:1234            0.0.0.0:*               LISTEN      7007/nc             
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      6632/sshd           
tcp6       0      0 :::3306                 :::*                    LISTEN      6670/mysqld         
tcp6       0      0 :::1234                 :::*                    LISTEN      7007/nc             
tcp6       0      0 :::22                   :::*                    LISTEN      6632/sshd           

[ ~]# cat /proc/7007/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             3795                 3795                 processes 
--------------------------- 1
Max open files            65535                65535                files     
Max locked memory         65536                65536                bytes     
Max address space         unlimited            unlimited            bytes

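Note that the line marked 1 shows the limit is back to 65535.

Keep in mind that this setting only applies to processes started after it is set, and it should have no effect on processes started automatically at boot.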
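
The reason is that a resource limit belongs to a process and is inherited by the children it starts; ulimit simply changes the limit of the current shell process. The sketch below shows the same mechanism without a shell: it starts a child Python process, optionally lowering the soft limit just before the child's program is executed, and reports what the child sees. The helper name child_soft_nofile is my own, and unlike ulimit -HSn it touches only the soft limit.

```python
import resource
import subprocess
import sys

# Program run by the child: print the soft RLIMIT_NOFILE it ended up with.
CHILD_CODE = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"


def child_soft_nofile(limit=None) -> int:
    """Start a child Python process and return the soft RLIMIT_NOFILE it sees.

    If limit is given, it is applied in the child just before exec, which is
    roughly what running `ulimit -Sn <limit>` before a command does in a shell.
    """
    def set_limit():
        _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        resource.setrlimit(resource.RLIMIT_NOFILE, (limit, hard))

    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        preexec_fn=set_limit if limit is not None else None,
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout)


if __name__ == "__main__":
    print("inherited from parent:", child_soft_nofile())
    print("changed before exec:  ", child_soft_nofile(64))
```

The parent's own limit is unchanged after both calls, just as a ulimit in one shell does nothing for processes started from another shell or by the init system.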

I don't recommend this approach.

ulimit [-HSTabcdefilmnpqrstuvx [limit]]

Provides  control over the resources available to the shell and to processes started by it, on systems that allow such control.

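To make it last, you can do this:

echo ulimit -SHn 65535 >> /etc/profile

But that only covers processes started from such a shell. Processes started at boot, such as MySQL, presumably cannot be controlled this way.

How to change it correctly when MySQL is started at boot by systemd

Our MySQL was installed from an RPM, and the installation ran the following commands: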

sed -i '/\# End of file/i * soft nofile 65535' /etc/security/limits.conf
sed -i '/\# End of file/i * hard nofile 65535' /etc/security/limits.conf
echo ulimit -SHn 65535 >> /etc/profile
source /etc/profile

The configuration file was changed as well:

/etc/my.cnf

[ CAD_OneKeyDeploy]# vim /etc/my.cnf

[mysqld]
datadir = /var/lib/mysql
socket = /var/lib/mysql/mysql.sock
max_connections=10000
symbolic-links = 0

Note max_connections=10000 here.

But after a reboot, checking with:

[ CAD_OneKeyDeploy]# cat /proc/6677/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             3795                 3795                 processes 
// 1------------------
Max open files            5000                 5000                 files     
Max locked memory         65536                65536                bytes

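At the line marked 1, the maximum number of open files is 5000. Why? Clearly the changes had no effect.

After a lot of searching, I found that our MySQL is started by systemd.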

[ CAD_OneKeyDeploy]#  systemctl status mysqld
● mysqld.service - MySQL Server
   // 1
   Loaded: loaded (/usr/lib/systemd/system/mysqld.service; enabled; vendor preset: disabled)
   Active: active (running) since Sat 2020-07-18 10:58:07 CST; 44min ago
     Docs: man:mysqld(8)
           http://dev.mysql.com/doc/refman/en/using-systemd.html
  Process: 6674 ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS (code=exited, status=0/SUCCESS)
  Process: 6636 ExecStartPre=/usr/bin/mysqld_pre_systemd (code=exited, status=0/SUCCESS)
 Main PID: 6677 (mysqld)
   CGroup: /system.slice/mysqld.service
           └─6677 /usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid

Jul 18 10:58:05 localhost.localdomain systemd[1]: Starting MySQL Server...
Jul 18 10:58:07 localhost.localdomain systemd[1]: Started MySQL Server.

The line marked 1 gives the location of the service file:

/usr/lib/systemd/system/mysqld.service.

Let's open that file and take a look:

...
[Service]
User=mysql
Group=mysql

Type=forking

PIDFile=/var/run/mysqld/mysqld.pid

# Disable service start and stop timeout logic of systemd for mysqld service.
TimeoutSec=0

# Execute pre and post scripts as root
PermissionsStartOnly=true

# Needed to create system tables
ExecStartPre=/usr/bin/mysqld_pre_systemd

# Start main service
ExecStart=/usr/sbin/mysqld --daemonize --pid-file=/var/run/mysqld/mysqld.pid $MYSQLD_OPTS
          
# Use this to switch malloc implementation
EnvironmentFile=-/etc/sysconfig/mysql

# 1 Sets open_files_limit 
LimitNOFILE = 5000

Note the last lines here:

# 1 Sets open_files_limit 
LimitNOFILE = 5000

That must be the problem.
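
When several services need checking, reading unit files by hand gets tedious. Here is a small sketch that pulls LimitNOFILE out of the text of one unit file. The function name is my own, and it deliberately looks at a single file only: it does not merge drop-in overrides the way systemd does, so treat the result as a hint, not as the effective value.

```python
def find_limit_nofile(unit_text: str):
    """Return the last LimitNOFILE value set in the [Service] section, or None."""
    section = None
    value = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "Service" and "=" in line:
            key, _, val = line.partition("=")
            if key.strip() == "LimitNOFILE":
                value = val.strip()  # a later assignment overrides an earlier one
    return value


if __name__ == "__main__":
    unit = "[Service]\nUser=mysql\n# 1 Sets open_files_limit \nLimitNOFILE = 5000\n"
    print(find_limit_nofile(unit))
```

For the value systemd actually applies, systemctl show mysqld -p LimitNOFILE is the more reliable check. Also note that a file under /usr/lib/systemd/system can be overwritten by a package update; a drop-in created with systemctl edit mysqld keeps the override separate.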

I changed it to 10000 and then ran:

systemctl daemon-reload

Then restart MySQL:

systemctl restart mysqld

Check the resource limits again:

[ CAD_OneKeyDeploy]# cat /proc/19617/limits 
Limit                     Soft Limit           Hard Limit           Units     
Max cpu time              unlimited            unlimited            seconds   
Max file size             unlimited            unlimited            bytes     
Max data size             unlimited            unlimited            bytes     
Max stack size            8388608              unlimited            bytes     
Max core file size        0                    unlimited            bytes     
Max resident set          unlimited            unlimited            bytes     
Max processes             3795                 3795                 processes 
Max open files            10000                10000                files

The last line shows it has changed to 10000.

After a reboot it is still 10000, so the change worked.

/var/log/mysql.log contains a line like this:

2020-07-18T03:48:11.235058Z 0 [Warning] Changed limits: max_open_files: 10000 (requested 50000)

Because we limited it to 10000 in the systemd service file, MySQL reports 10000 here.
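
The "requested 50000" in that warning can be explained too. As I understand the MySQL 5.7 documentation, the server asks for the largest of three numbers: 10 + max_connections + 2 × table_open_cache, 5 × max_connections, and the open_files_limit option (5000 if unset). The sketch below is only that arithmetic; the defaults of 2000 for table_open_cache and 5000 for open_files_limit are assumptions based on MySQL 5.7, and the function name is my own.

```python
def requested_open_files(max_connections: int,
                         table_open_cache: int = 2000,
                         open_files_limit: int = 5000) -> int:
    """Number of file descriptors MySQL 5.7 appears to request at startup."""
    return max(
        10 + max_connections + 2 * table_open_cache,  # connections plus table cache
        5 * max_connections,                          # five descriptors per connection
        open_files_limit,                             # explicit or default option value
    )


if __name__ == "__main__":
    # max_connections=10000 as set in /etc/my.cnf above.
    print(requested_open_files(10000))
```

For max_connections=10000 this yields 50000, matching the log. Since only 10000 descriptors were granted, it is worth running SHOW VARIABLES LIKE 'max_connections' afterwards to see what value the server settled on, and raising LimitNOFILE further if needed.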

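Summary

OS level (/proc/sys/fs/file-max): a write takes effect immediately but is lost on reboot unless it is also persisted in /etc/sysctl.conf;

ulimit: strongly discouraged, as it is only a temporary change;

/etc/security/limits.conf: per user; after a change the user must log in to a new shell for the file to take effect, and only processes started after that get the new limit.

For processes started at boot by systemd, only changing the corresponding service file works.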
