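Causes of 502 Bad Gateway errors
1. Not enough php-fpm processes
2. Linux open-file limit too low
3. Script execution timing out
4. Buffer settings too small
The site was returning 502 intermittently. My first reaction was that this was not an application problem but an nginx problem, because the errors came from the proxy server. The proxy server does not have PHP installed, which rules out the first cause.
That left a timeout as a likely candidate, so I adjusted some of the timeout settings.
Below is the server's original configuration (key settings only). http section: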
server_names_hash_bucket_size 64;
client_header_buffer_size 128k;
large_client_header_buffers 4 32k;
client_max_body_size 50m;
keepalive_timeout 60;
fastcgi_connect_timeout 60;
fastcgi_send_timeout 60;
fastcgi_read_timeout 600;
fastcgi_buffer_size 64k;
fastcgi_buffers 4 128k;
fastcgi_busy_buffers_size 128k;
fastcgi_temp_file_write_size 256k;
gzip_buffers 4 128k;
server section:
upstream myweb {
    server 10.10.10.1:80 max_fails=3 fail_timeout=30s;
    server 10.10.10.2:80 max_fails=3 fail_timeout=30s;
    ip_hash;
}
location / {
    proxy_pass http://myweb;
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto https;
    proxy_redirect off;
}
Step 1: change the timeouts. I only changed the http section, and increased most of the buffers several times over.
server_names_hash_bucket_size 512;
client_header_buffer_size 512k;
large_client_header_buffers 16 128k;
client_max_body_size 256m;
keepalive_timeout 600;
fastcgi_connect_timeout 600;
fastcgi_send_timeout 600;
fastcgi_read_timeout 600;
fastcgi_buffer_size 256k;
fastcgi_buffers 16 512k;
fastcgi_busy_buffers_size 512k;
fastcgi_temp_file_write_size 1024k;
gzip_buffers 16 512k;
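Configuration edits like these only take effect once nginx re-reads its configuration. A minimal sketch of a validate-then-reload helper, assuming the nginx binary is on PATH. The function name reload_nginx_if_valid and the NGINX_BIN override are made up for this example; -t and -s reload are standard nginx command-line options.

```shell
#!/bin/sh
# Validate the configuration, and reload nginx only if validation passes.
# NGINX_BIN is a hypothetical override for this sketch; it defaults to "nginx" on PATH.
reload_nginx_if_valid() {
    nginx_bin="${NGINX_BIN:-nginx}"
    if "$nginx_bin" -t; then
        # Ask the running master process to reload its configuration.
        "$nginx_bin" -s reload
    else
        echo "configuration test failed; not reloading" >&2
        return 1
    fi
}

# Example (usually needs root):
# reload_nginx_if_valid
```

Testing first means a typo in the new settings leaves the running configuration untouched instead of breaking the reload.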
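After this change the frequency of 502s did not drop at all; it was the same as before. That is consistent with how these settings work: the fastcgi_* directives only affect requests handled by fastcgi_pass, and this server uses proxy_pass.
Step 2: change the proxy timeouts in the server section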
location / {
    proxy_pass http://myweb;
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto https;
    proxy_redirect off;
    proxy_connect_timeout 300s;
    proxy_send_timeout 300s;
    proxy_read_timeout 300s;
}
The frequency of 502s dropped slightly, but not as much as I had hoped, so next I changed the proxy buffers.
location / {
    proxy_pass http://myweb;
    proxy_next_upstream error timeout invalid_header http_500 http_502 http_503;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto http;
    proxy_redirect off;
    proxy_connect_timeout 300s;
    proxy_send_timeout 300s;
    proxy_read_timeout 300s;
    proxy_buffer_size 512k;
    proxy_buffers 32 512k;
    proxy_busy_buffers_size 512k;
    proxy_temp_file_write_size 512k;
    proxy_ignore_client_abort on;
}
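The frequency of 502s was the same as after the previous step, with no noticeable improvement. I then opened the nginx error log to see what errors were being recorded. It showed: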
[error] 20435#0: *3890606 no live upstreams while connecting to upstream, client:
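This means nginx could not find any live backend. There are two backend servers, so how is that possible?
My guess: nginx evaluates each backend while waiting for its response, and if a backend responds too slowly nginx may mark it as unavailable. That way both backend servers could end up marked unavailable at the same time.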
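One way to check this guess is to count how often the "no live upstreams" message appears per minute and compare that with the times the 502s were seen. A rough sketch, assuming the default error log timestamp format (YYYY/MM/DD HH:MM:SS at the start of each line); the function name and the log path in the example are illustrative, not taken from this server.

```shell
#!/bin/sh
# Count "no live upstreams" errors per minute in an nginx error log.
# Assumes each line starts with "YYYY/MM/DD HH:MM:SS", the default error_log format.
count_no_live_upstreams() {
    log_file="$1"
    awk '/no live upstreams/ {
             # Key on the date plus HH:MM, dropping the seconds.
             key = $1 " " substr($2, 1, 5)
             n[key]++
         }
         END { for (k in n) print k, n[k] }' "$log_file" | sort
}

# Example (the path is an assumption; check the error_log directive in nginx.conf):
# count_no_live_upstreams /var/log/nginx/error.log
```

Each output line is a date, a minute, and the number of matching errors in that minute, so bursts stand out at a glance.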
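So the problem lies in the upstream configuration. The original configuration used max_fails=3 fail_timeout=30s (nginx's built-in defaults are max_fails=1 and fail_timeout=10s).
I tried changing it to max_fails=10 fail_timeout=60s. The frequency of 502s dropped a lot, but when a 502 did occur it lasted longer, which fits the fact that fail_timeout also sets how long a server stays marked unavailable. Below is the final upstream configuration: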
upstream myweb {
    server 10.10.10.1:80 max_fails=60 fail_timeout=10s;
    server 10.10.10.2:80 max_fails=60 fail_timeout=10s;
    ip_hash;
}
Possible next optimization: raise the Linux open-file limit. Noting the commands here for reference:
echo 'ulimit -HSn 65536' >> /etc/profile
echo 'ulimit -HSn 65536' >> /etc/rc.local
source /etc/profile
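To confirm that the new limit is actually in effect, the values can be read back. A small sketch with made-up function names; the second function assumes a Linux /proc filesystem. Note that a limit set in /etc/profile applies to login shells, so an nginx started by the init system may not inherit it; nginx also has its own worker_rlimit_nofile directive for worker processes, which may be the more direct setting.

```shell
#!/bin/sh
# Print the soft and hard open-file limits of the current shell.
show_nofile_limits() {
    echo "soft: $(ulimit -Sn)"
    echo "hard: $(ulimit -Hn)"
}

# Print the "Max open files" row for a running process (Linux /proc only).
# $1 is a PID, for example that of an nginx worker process.
show_process_nofile() {
    grep 'Max open files' "/proc/$1/limits"
}

# Examples:
# show_nofile_limits
# show_process_nofile "$(pgrep -f 'nginx: worker' | head -n 1)"
```

Checking the worker process itself matters because its limit, not the limit of the interactive shell, decides how many connections nginx can hold open.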