Changing the maximum TCP listen queue length

Changing the somaxconn parameter

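This kernel parameter usually defaults to 128 (4096 since Linux 5.4). It caps the backlog an application can request with listen(), that is, the maximum length of the listen queue of each listening port. For a heavily loaded service 128 is far too small, so it is commonly raised to 2048 or more.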

echo 2048 > /proc/sys/net/core/somaxconn    # temporary change: takes effect immediately, lost on reboot

To make the change permanent, add the following line to /etc/sysctl.conf:

net.core.somaxconn = 2048

Then run this in a terminal:

sysctl -p
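
To confirm that the new value is really active, the live setting can be compared with the target. The following is only a sketch: the function name backlog_ok and its optional file argument are made up for illustration and do not belong to any existing tool.

```shell
#!/bin/sh
# backlog_ok WANTED [FILE]
# Succeeds when the value stored in FILE is at least WANTED.
# FILE defaults to the live kernel setting; passing another path is
# only meant for trying the function out without touching the system.
backlog_ok() {
    wanted=$1
    file=${2:-/proc/sys/net/core/somaxconn}
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
    current=$(cat "$file")
    if [ "$current" -ge "$wanted" ]; then
        echo "ok: somaxconn=$current"
    else
        echo "too low: somaxconn=$current, wanted $wanted"
        return 1
    fi
}

# Hypothetical usage: check the running system against 2048.
# backlog_ok 2048
```

The cap is applied when a program calls listen(), so a service that was already listening will most likely need a restart before it benefits from the larger value.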

Redis (overcommit_memory) WARNING

Redis sometimes fails to complete a background save of the DB. The log shows the warning below, which is very likely the cause:

17427:M 17 Sep 2019 10:54:12.730 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

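Seeing this, I looked into it along the way and found the following.

The overcommit_memory kernel parameter

It sets the memory allocation policy. Possible values: 0, 1, 2.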

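0 - heuristic overcommit (the default). The kernel checks whether enough memory appears to be available for the application; if so the request is granted, otherwise it fails and the error is returned to the application.
1 - always overcommit. The kernel grants every allocation request regardless of the current memory state.
2 - never overcommit. The total committed memory may not exceed swap plus a percentage of physical RAM (vm.overcommit_ratio, 50 by default); requests beyond that limit fail.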
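
As a small illustration of the three values, the sketch below maps a value to a short description and, where /proc is available, prints the live setting. The function name overcommit_mode is my own and purely illustrative.

```shell
#!/bin/sh
# overcommit_mode VALUE
# Prints a short description of an overcommit_memory value.
overcommit_mode() {
    case $1 in
        0) echo "heuristic overcommit" ;;
        1) echo "always overcommit" ;;
        2) echo "never overcommit (strict accounting)" ;;
        *) echo "unknown value: $1"; return 1 ;;
    esac
}

# Describe the live setting when the file is readable.
f=/proc/sys/vm/overcommit_memory
if [ -r "$f" ]; then
    overcommit_mode "$(cat "$f")"
fi

# CommitLimit is the ceiling enforced in mode 2, Committed_AS is the
# amount of memory currently committed.
grep -E '^(CommitLimit|Committed_AS)' /proc/meminfo 2>/dev/null || true
```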

Overcommit and OOM

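Linux answers "yes" to most memory allocation requests so that more and larger programs can run, because memory that has been requested is not used right away. This technique is called overcommit. When Linux finds that it has run out of memory, the OOM killer (OOM = out of memory) steps in and kills some processes (user-space processes, not kernel threads) to free memory.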

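Which processes does Linux kill when the OOM killer runs? The selection is made by the oom_badness function (in mm/oom_kill.c), which computes a score from 0 to 1000 for each process. The higher the score, the more likely the process is to be killed. The score is based on the memory the process uses and is shifted by oom_score_adj, which can be set per process (-1000 is the lowest, 1000 the highest).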
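
Both numbers are exposed under /proc, so they are easy to inspect. A minimal sketch, assuming a helper named oom_info (my own name) whose optional second argument is only there so it can be pointed at a test directory:

```shell
#!/bin/sh
# oom_info PID [PROC_ROOT]
# Prints the OOM score and the adjustment value of one process.
oom_info() {
    pid=$1
    root=${2:-/proc}
    score=$(cat "$root/$pid/oom_score") || return 1
    adj=$(cat "$root/$pid/oom_score_adj") || return 1
    echo "pid=$pid oom_score=$score oom_score_adj=$adj"
}

# Hypothetical usage: inspect the current shell.
# oom_info $$
#
# Lowering the adjustment makes a process a less likely victim
# (normally requires root), for a PID stored in $pid:
# echo -500 > /proc/$pid/oom_score_adj
```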

Fix

Again, change a kernel parameter:

echo 1 > /proc/sys/vm/overcommit_memory # temporary change: takes effect immediately, lost on reboot

Edit /etc/sysctl.conf, add the line below, then run sysctl -p to load the file:

vm.overcommit_memory=1
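
Editing the file by hand works, but running the same setup twice tends to leave duplicate lines behind. The sketch below replaces an existing entry or appends a new one. set_sysctl is a hypothetical helper, not a system command, and its third argument exists mainly so it can be tried on a copy of the file first.

```shell
#!/bin/sh
# set_sysctl KEY VALUE [FILE]
# Replaces an existing "KEY = ..." line in FILE or appends a new one.
set_sysctl() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    scratch=$(mktemp) || return 1
    # Escape the dots so that they match literally in the regex.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    # Keep every line except earlier definitions of KEY.
    grep -v -E "^[[:space:]]*${pattern}[[:space:]]*=" "$file" > "$scratch" || true
    echo "$key = $value" >> "$scratch"
    # Write back in place so the file keeps its owner and mode.
    cat "$scratch" > "$file" && rm -f "$scratch"
}

# Hypothetical usage (as root):
# set_sysctl vm.overcommit_memory 1 && sysctl -p
```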

Redis (Transparent Huge Pages) WARNING

Redis prints a warning after starting on CentOS 7:

17427:M 17 Sep 2019 10:54:12.730 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.

Some official background on Transparent Huge Pages:

The kernel will always attempt to satisfy a memory allocation using huge pages. If no huge pages are available (due to non availability of physically continuous memory for example) the kernel will fall back to the regular 4KB pages. THP are also swappable (unlike hugetlbfs). This is achieved by breaking the huge page to smaller 4KB pages, which are then swapped out normally.

Disabling Transparent Huge Pages

[root@localhost 6379]# cat /sys/kernel/mm/transparent_hugepage/enabled
always madvise [never]

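The value in square brackets is the active one; the output above is what the file shows after running the echo never command from the warning. THP is enabled by default, so before the change the brackets are around always. If the file cannot be found, do not assume THP is off: some older Red Hat kernels expose the setting under /sys/kernel/mm/redhat_transparent_hugepage/enabled instead.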
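
For scripts it helps to extract just the active value from that line. A minimal sketch; the function name thp_active is an assumption of mine:

```shell
#!/bin/sh
# thp_active TEXT
# Prints the bracketed (active) choice from a line such as
# "always madvise [never]".
thp_active() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)].*/\1/p'
}

# Report the live mode when the file is readable.
f=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$f" ]; then
    echo "THP mode: $(thp_active "$(cat "$f")")"
fi

# To disable THP until the next reboot (as root), as the Redis warning suggests:
# echo never > /sys/kernel/mm/transparent_hugepage/enabled
```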

Open file limits

The per-user limit on open files can be checked with ulimit -n:

[root@localhost opt]# ulimit -n    # maximum number of files the current user can open
1024

The system-wide limit can be checked with sysctl -a:

[root@localhost opt]# sysctl -a | grep file-max
fs.file-max = 284775

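The system-wide maximum is normally calculated from the hardware resources. If you really need to raise the number of files that can be opened, ulimit -n 10240 does it, but only for the current shell and the processes started from it. To make the change permanent, edit /etc/security/limits.conf; the new limits apply to new login sessions (a reboot also works).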

#<type> can have the two values:
# - "soft" for enforcing the soft limits
# - "hard" for enforcing hard limits
#
#<item> can be one of the following:
# - core - limits the core file size (KB)
# - data - max data size (KB)
# - fsize - maximum filesize (KB)
# - memlock - max locked-in-memory address space (KB)
# - nofile - max number of open file descriptors
# - rss - max resident set size (KB)
# - stack - max stack size (KB)
# - cpu - max CPU time (MIN)
# - nproc - max number of processes
# - as - address space limit (KB)
# - maxlogins - max number of logins for this user
# - maxsyslogins - max number of logins on the system
# - priority - the priority to run user process with
# - locks - max number of file locks the user can hold
# - sigpending - max number of pending signals
# - msgqueue - max memory used by POSIX message queues (bytes)
# - nice - max nice priority allowed to raise to values: [-20, 19]
# - rtprio - max realtime priority
#
#<domain> <type> <item> <value>
#

#* soft core 0
#* hard rss 10000
#@student hard nproc 20
#@faculty soft nproc 20
#@faculty hard nproc 50
#ftp hard nproc 0
#@student - maxlogins 4

# End of file

root soft nofile 65535 # added
root hard nofile 65535

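Meaning of each column:

1: user name, or "*" for all users (note that the wildcard is not applied to root, so root needs its own lines)

2: soft for the soft limit, hard for the hard limit

3: the item being limited; nofile is the maximum number of open files

4: the value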
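
To see the two limit types side by side, the shell can print the soft and the hard value separately. The sketch below also generates the pair of limits.conf lines for a given user; nofile_rule is a hypothetical helper, and its output is meant to be reviewed and then added to the file by hand.

```shell
#!/bin/sh
# Soft limit: the value currently enforced for this shell.
# Hard limit: the ceiling up to which the soft limit may be raised.
echo "soft nofile: $(ulimit -Sn)"
echo "hard nofile: $(ulimit -Hn)"

# nofile_rule USER VALUE
# Prints the soft and hard nofile lines for USER in limits.conf format.
nofile_rule() {
    printf '%s soft nofile %s\n' "$1" "$2"
    printf '%s hard nofile %s\n' "$1" "$2"
}

# Hypothetical usage:
# nofile_rule root 65535
```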

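Count the open files of all processes:
lsof | wc -l
Count the files opened by one process:
lsof -p [pid] | wc -l
See how many handles each process has open (count first, then PID):
lsof -n | awk '{print $2}' | sort | uniq -c | sort -nr | more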
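
Keep in mind that lsof output includes a header line and entries that are not file descriptors (cwd, txt, mem and so on), so its line count is larger than what counts against nofile. When only descriptors matter, counting the entries under /proc/PID/fd is an alternative. A sketch, with fd_count being a name I made up and the second argument only there for testing:

```shell
#!/bin/sh
# fd_count PID [PROC_ROOT]
# Counts the entries under PROC_ROOT/PID/fd, i.e. the file
# descriptors the process currently holds.
fd_count() {
    dir=${2:-/proc}/$1/fd
    [ -d "$dir" ] || return 1
    ls "$dir" | wc -l | tr -d ' '
}

# Hypothetical usage: descriptors held by the current shell.
# fd_count $$
```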

SSH key login

  1. Generate a public/private key pair. On Linux, run ssh-keygen and press Enter at every prompt.
[root@localhost ~]# ssh-keygen 
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): # where the key is stored
Enter passphrase (empty for no passphrase): # passphrase
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:N9Q7EJosUhKrKw+Utf5jcSgpg7fzJS5EtVfa5iinWOk root@localhost.localdomain
The key's randomart image is:
+---[RSA 2048]----+
| o.. . |
| .+ ..o o |
| oo..++ o . |
| +.o.o.o. . . |
|.+...o.+S o o |
|oo+++oo... . . |
|oo+*o++ |
| +=.E= |
| .++.. |
+----[SHA256]-----+
  2. After generating the key pair, upload the public key to ~/.ssh/authorized_keys of the user you will log in as on the server
[root@localhost ~]# cat ~/.ssh/id_rsa.pub 
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCyUOANXFleDWrpJop7zrM2eZxrna8+uIiSbz72IRsAO3intbNkyHJpULv9yYUmT4iPf4Vn4QYmyogFhdpTktKSADBbyH5VXIGyFjbWeO7ix1iVr7YVQQ/4P/nELVytCUiIojFdZ+DvyYSariLzLuliFpYTMJ4jpmgpL/pAUobEazpGwjlRUOWik3+8kLGpsxHYJNUmrZKSnNaOYqDJVGfO3KBfozO+I5B/wcwSW5hje7Y5xyfdDzvuVh7uVmKQbjw3WoMiy64pTcKB1S3tQtPZfXnmOd3tUZU8SXSfcvhHdgrbG6kFBrJwqjpj/sE4zL9nWbSZlbpJD7gXtlkdrAH1 root@localhost.localdomain
  3. On the server, check ~/.ssh/authorized_keys
[root@wpspic6 opt]# cat ~/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCyUOANXFleDWrpJop7zrM2eZxrna8+uIiSbz72IRsAO3intbNkyHJpULv9yYUmT4iPf4Vn4QYmyogFhdpTktKSADBbyH5VXIGyFjbWeO7ix1iVr7YVQQ/4P/nELVytCUiIojFdZ+DvyYSariLzLuliFpYTMJ4jpmgpL/pAUobEazpGwjlRUOWik3+8kLGpsxHYJNUmrZKSnNaOYqDJVGfO3KBfozO+I5B/wcwSW5hje7Y5xyfdDzvuVh7uVmKQbjw3WoMiy64pTcKB1S3tQtPZfXnmOd3tUZU8SXSfcvhHdgrbG6kFBrJwqjpj/sE4zL9nWbSZlbpJD7gXtlkdrAH1 root@localhost.localdomain
  4. Test the login; here I log in as root

    [root@localhost opt]# ssh root@10.226.50.30 
    Last login: Tue Sep 17 16:23:57 2019 from 10.226.28.68 # login succeeded

    Add the -v option to print the login process:

    [root@localhost opt]# ssh root@10.226.50.30 -v
    OpenSSH_7.4p1, OpenSSL 1.0.2k-fips 26 Jan 2017
    debug1: Reading configuration data /etc/ssh/ssh_config
    debug1: /etc/ssh/ssh_config line 58: Applying options for *
    debug1: Connecting to 10.226.50.30 [10.226.50.30] port 22.
    debug1: Connection established.
    debug1: permanently_set_uid: 0/0
    debug1: identity file /root/.ssh/id_rsa type 1
    debug1: key_load_public: No such file or directory
    debug1: identity file /root/.ssh/id_rsa-cert type -1
    debug1: key_load_public: No such file or directory
    debug1: identity file /root/.ssh/id_dsa type -1
    debug1: key_load_public: No such file or directory
    debug1: identity file /root/.ssh/id_dsa-cert type -1
    debug1: key_load_public: No such file or directory
    debug1: identity file /root/.ssh/id_ecdsa type -1
    debug1: key_load_public: No such file or directory
    debug1: identity file /root/.ssh/id_ecdsa-cert type -1
    debug1: key_load_public: No such file or directory
    debug1: identity file /root/.ssh/id_ed25519 type -1
    debug1: key_load_public: No such file or directory
    debug1: identity file /root/.ssh/id_ed25519-cert type -1
    debug1: Enabling compatibility mode for protocol 2.0
    debug1: Local version string SSH-2.0-OpenSSH_7.4
    debug1: Remote protocol version 2.0, remote software version OpenSSH_7.4
    debug1: match: OpenSSH_7.4 pat OpenSSH* compat 0x04000000
    debug1: Authenticating to 10.226.50.30:22 as 'root'
    debug1: SSH2_MSG_KEXINIT sent
    debug1: SSH2_MSG_KEXINIT received
    debug1: kex: algorithm: curve25519-sha256
    debug1: kex: host key algorithm: rsa-sha2-512
    debug1: kex: server->client cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
    debug1: kex: client->server cipher: chacha20-poly1305@openssh.com MAC: <implicit> compression: none
    debug1: kex: curve25519-sha256 need=64 dh_need=64
    debug1: kex: curve25519-sha256 need=64 dh_need=64
    debug1: expecting SSH2_MSG_KEX_ECDH_REPLY
    debug1: Server host key: ssh-rsa SHA256:yKoRmi5QgIlXrrhYQcP5W0Mx2PhSoTsm5Z+DhdeYFpU
    debug1: Host '10.226.50.30' is known and matches the RSA host key.
    debug1: Found key in /root/.ssh/known_hosts:1
    debug1: rekey after 134217728 blocks
    debug1: SSH2_MSG_NEWKEYS sent
    debug1: expecting SSH2_MSG_NEWKEYS
    debug1: SSH2_MSG_NEWKEYS received
    debug1: rekey after 134217728 blocks
    debug1: SSH2_MSG_EXT_INFO received
    debug1: kex_input_ext_info: server-sig-algs=<rsa-sha2-256,rsa-sha2-512>
    debug1: SSH2_MSG_SERVICE_ACCEPT received
    debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,password
    debug1: Next authentication method: gssapi-keyex
    debug1: No valid Key exchange context
    debug1: Next authentication method: gssapi-with-mic
    debug1: Unspecified GSS failure. Minor code may provide more information
    No Kerberos credentials available (default cache: KEYRING:persistent:0)

    debug1: Unspecified GSS failure. Minor code may provide more information
    No Kerberos credentials available (default cache: KEYRING:persistent:0)

    debug1: Next authentication method: publickey
    debug1: Offering RSA public key: /root/.ssh/id_rsa
    debug1: Server accepts key: pkalg rsa-sha2-512 blen 279
    debug1: Authentication succeeded (publickey).
    Authenticated to 10.226.50.30 ([10.226.50.30]:22).
    debug1: channel 0: new [client-session]
    debug1: Requesting no-more-sessions@openssh.com
    debug1: Entering interactive session.
    debug1: pledge: network
    debug1: client_input_global_request: rtype hostkeys-00@openssh.com want_reply 0
    debug1: Sending environment.
    debug1: Sending env LANG = en_US.UTF-8
    Last login: Wed Sep 18 12:00:12 2019 from 10.226.50.31
    [root@wpspic6 ~]#
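  1. If the login fails, add the -v option to the ssh command to inspect the login process.
  2. If the private key was copied from somewhere else, copy the matching id_rsa.pub as well or delete the old one; a mismatched pair can interfere with login.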
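
Uploading the public key by hand is easy to get slightly wrong (a truncated line, loose permissions on ~/.ssh), and either can make key login fail. The sketch below appends a key only if it is not already present and sets restrictive permissions. install_pubkey is a hypothetical helper written for this post, and the second argument is only there so it can be tried in a scratch directory.

```shell
#!/bin/sh
# install_pubkey PUBKEY_FILE [HOME_DIR]
# Appends the key in PUBKEY_FILE to HOME_DIR/.ssh/authorized_keys
# unless it is already there, and tightens the permissions.
install_pubkey() {
    pub=$1
    home=${2:-$HOME}
    [ -r "$pub" ] || return 1
    mkdir -p "$home/.ssh" && chmod 700 "$home/.ssh" || return 1
    auth=$home/.ssh/authorized_keys
    touch "$auth" && chmod 600 "$auth" || return 1
    # -F: fixed string, -x: the whole line must match, so running the
    # function twice does not add a duplicate entry.
    grep -qxF -e "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"
}

# Hypothetical usage on the server, after copying id_rsa.pub over:
# install_pubkey /tmp/id_rsa.pub
#
# From the client, ssh-copy-id does the whole upload in one step:
# ssh-copy-id -i ~/.ssh/id_rsa.pub root@10.226.50.30
```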