[tag@cwtdb2 ~]$ free -g

-bash: fork: Cannot allocate memory


This error can appear when the kernel has run out of free PIDs, even if plenty of memory is available. The file /proc/sys/kernel/pid_max specifies the value at which PIDs wrap around (i.e., the value in this file is one greater than the maximum PID).

The default value for this file, 32768, results in the same range of PIDs as on earlier kernels (<=2.4). On 64-bit systems the value can be set to anything up to 2^22 (PID_MAX_LIMIT, approximately 4 million).
Raising the value helps on large Linux systems or clusters that run a very large number of processes and threads, and it prevents fork() failures caused by PID exhaustion.
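
Before raising the limit, it can help to check how close the system actually is to it. The following is a minimal sketch, not part of the original article: the function names pid_usage_percent and report_pid_usage are made up for illustration, and the thread count assumes the procps version of ps.

```shell
# Hypothetical helper: integer percentage of pid_max already in use.
# Usage: pid_usage_percent <tasks_in_use> <pid_max>
pid_usage_percent() {
    echo $(( $1 * 100 / $2 ))
}

# Hypothetical wrapper that feeds live values into the helper.
# Every thread consumes a PID, so threads are counted, not just processes.
report_pid_usage() {
    tasks=$(ps -eL -o lwp= | wc -l)          # one line per thread, no header
    limit=$(cat /proc/sys/kernel/pid_max)    # current PID wrap-around value
    echo "tasks=$tasks pid_max=$limit used=$(pid_usage_percent "$tasks" "$limit")%"
}
```

Calling report_pid_usage on the affected server prints one summary line. A percentage close to 100 suggests that PID exhaustion, rather than RAM, is behind the fork failure.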

Display the Current PID Limit on a Linux-Based System

Type the following command at shell prompt:
$ sysctl kernel.pid_max
OR
$ cat /proc/sys/kernel/pid_max
Sample output:

kernel.pid_max = 32768

Allow for More PIDs on a Linux-Based System

up to 2^22 = 4,194,304

Type the following command:
# sysctl -w kernel.pid_max=4194303
OR
# echo 4194303 > /proc/sys/kernel/pid_max

The commands above only last until the next reboot. To make the setting permanent, append the following directive to your /etc/sysctl.conf file:
kernel.pid_max = 4194303
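
Appending blindly can leave several conflicting kernel.pid_max lines in the file. As a rough sketch (the helper name set_sysctl_conf is hypothetical and not from the original article), the directive can be written so that any earlier assignment of the same key is replaced:

```shell
# Hypothetical helper: set KEY = VALUE in a sysctl.conf-style file,
# replacing any existing assignment of the same key.
# Usage: set_sysctl_conf FILE KEY VALUE
set_sysctl_conf() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    # Copy every line except existing assignments of the key.
    # (The dots in the key act as regex wildcards, which is harmless here.)
    grep -v "^[[:space:]]*$key[[:space:]]*=" "$file" > "$tmp" 2>/dev/null
    # Append the new assignment.
    echo "$key = $value" >> "$tmp"
    # Write back through cat so the target keeps its owner and mode.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

Run as root, set_sysctl_conf /etc/sysctl.conf kernel.pid_max 4194303 followed by sysctl -p writes the value and reloads the file immediately.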

Please note that this hack is only useful for a large and busy server; don’t try this on an old kernel or on desktop systems.

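Two otherwise identical servers (same OS release, same kernel version, same CPU model, same number of CPU cores), differing only in vendor, report different kernel.pid_max values in sysctl after boot: 128k on one and 32k on the other. Neither machine overrides the value in /etc/sysctl.conf, so it must be a default the kernel picks by itself. How does the kernel choose this default? Why does it differ between the two vendors' servers? And how can the two be made consistent?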

A look at the kernel source shows that the value of kernel.pid_max is decided in pidmap_init():

int pid_max = PID_MAX_DEFAULT;
......
void __init pidmap_init(void)
{       
        /* bump default and minimum pid_max based on number of cpus */
        pid_max = min(pid_max_max, max_t(int, pid_max, 
                                PIDS_PER_CPU_DEFAULT * num_possible_cpus()));
......

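Here PIDS_PER_CPU_DEFAULT is 1024; in other words, the kernel roughly assumes that one CPU core runs at most about 1024 tasks. num_possible_cpus() determines the number of possible CPUs by counting the bits set to 1 in cpu_possible_mask

const struct cpumask *const cpu_possible_mask

that is, the maximum number of CPU cores that could ever become available. The notion of a possible CPU exists for CPU hotplug. For example, if a machine has room for 24 CPU cores but only 12 are installed, the possible CPU count should be 24, and pid_max should reserve room for the cores that are not yet plugged in, giving 24k. However, the default PID_MAX_DEFAULT is 32k, which is larger than 24k, so according to the code 32k is chosen.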
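
The min/max expression in pidmap_init() can be reproduced with shell arithmetic. This is only a sketch of the calculation: the function name default_pid_max is made up, and the upper bound of 4194304 assumes a 64-bit kernel built without CONFIG_BASE_SMALL.

```shell
# Constants mirrored from the kernel headers.
PID_MAX_DEFAULT=32768        # 32k
PIDS_PER_CPU_DEFAULT=1024
PID_MAX_LIMIT=4194304        # assumption: 64-bit kernel, no CONFIG_BASE_SMALL

# Hypothetical helper: default pid_max for a given number of possible CPUs,
# i.e. min(PID_MAX_LIMIT, max(PID_MAX_DEFAULT, 1024 * possible_cpus)).
default_pid_max() {
    possible_cpus=$1
    scaled=$(( PIDS_PER_CPU_DEFAULT * possible_cpus ))
    result=$PID_MAX_DEFAULT
    if [ "$scaled" -gt "$result" ]; then
        result=$scaled
    fi
    if [ "$result" -gt "$PID_MAX_LIMIT" ]; then
        result=$PID_MAX_LIMIT
    fi
    echo "$result"
}
```

With 24 possible CPUs the function prints 32768, matching the example above. With 128 possible CPUs it prints 131072, which is the 128k observed on one of the two servers.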

So who sets those bits in cpu_possible_mask? Look at the kernel's startup function:

asmlinkage void __init start_kernel(void)                                     
{                                     
        char * command_line;
        extern struct kernel_param __start___param[], __stop___param[];
......
        setup_arch(&command_line);
......
        pidmap_init();
......
}

The bits are set from within setup_arch():

setup_arch() --> prefill_possible_map()

__init void prefill_possible_map(void)                       
{      
        int i, possible;
                                      
        /* no processor from mptable or madt */
        if (!num_processors)
                num_processors = 1;
                                                
        printk(KERN_INFO "num_processors: %u, disabled_cpus: %u",
                          num_processors, disabled_cpus);
        if (setup_possible_cpus == -1)
                possible = num_processors + disabled_cpus;       
        else                                             
                possible = setup_possible_cpus;
                                                          
        total_cpus = max_t(int, possible, num_processors + disabled_cpus);
                                               
        /* nr_cpu_ids could be reduced via nr_cpus= */
        if (possible > nr_cpu_ids) {                                      
                printk(KERN_WARNING
                        "%d Processors exceeds NR_CPUS limit of %d\n",
                        possible, nr_cpu_ids);
                possible = nr_cpu_ids;
        }                                                             
                                              
        printk(KERN_INFO "SMP: Allowing %d CPUs, %d hotplug CPUs\n",
                possible, max_t(int, possible - num_processors, 0));

        for (i = 0; i < possible; i++)                              
                set_cpu_possible(i, true);                          

        nr_cpu_ids = possible;        
}                                         

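setup_possible_cpus defaults to -1, so the number of possible CPUs is the sum of num_processors and disabled_cpus as returned by the ACPI driver. Different vendors' ACPI implementations report different disabled_cpus values, so the possible CPU count differs, and kernel.pid_max naturally differs with it. The difference in disabled CPUs is visible in the boot log: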

bash# dmesg|grep Allowing
[    0.000000] SMP: Allowing 24 CPUs, 8 hotplug CPUs

This shows that 16 CPU cores are currently present (24 minus 8) and 8 CPU cores are disabled.
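
Besides dmesg, the possible CPU set is also exported through sysfs in /sys/devices/system/cpu/possible as a list such as 0-23. The sketch below counts the CPUs in such a list; the function name count_cpu_list is hypothetical, and reading the sysfs file is an addition here, not something the original post does.

```shell
# Hypothetical helper: count the CPUs in a kernel CPU list string
# such as "0-23" or "0-3,8-11".
count_cpu_list() {
    total=0
    old_ifs=$IFS
    IFS=,                       # split the list on commas
    for range in $1; do
        case $range in
            *-*)                # a range "first-last"
                first=${range%-*}
                last=${range#*-}
                total=$(( total + last - first + 1 ))
                ;;
            *)                  # a single CPU number
                total=$(( total + 1 ))
                ;;
        esac
    done
    IFS=$old_ifs
    echo "$total"
}

# Example on a live system:
#   count_cpu_list "$(cat /sys/devices/system/cpu/possible)"
```

Comparing this count on the two servers should show the same gap as the "Allowing N CPUs" lines in their boot logs.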
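So how can this be fixed, and how can the default kernel.pid_max be made the same on every machine model? The answer again lies in setup_possible_cpus, which can be changed from grub: append possible_cpus=128 to the end of the kernel line in the grub configuration, and the possible CPU count becomes 128 on every machine. That said, the most convenient fix is still to set the value in /etc/sysctl.conf.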
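
After rebooting with the new grub entry, it is worth confirming that the parameter actually reached the kernel. A small sketch, assuming the boot command line is read from /proc/cmdline; the function name cmdline_possible_cpus is made up for illustration:

```shell
# Hypothetical helper: print the possible_cpus= value found in a kernel
# command line string; return non-zero if the parameter is absent.
cmdline_possible_cpus() {
    value=
    for arg in $1; do           # split the command line on whitespace
        case $arg in
            possible_cpus=*) value=${arg#possible_cpus=} ;;
        esac
    done
    if [ -n "$value" ]; then
        echo "$value"
    else
        return 1
    fi
}

# Example on a live system:
#   cmdline_possible_cpus "$(cat /proc/cmdline)"
```

If the function prints 128 and the "SMP: Allowing" line in dmesg agrees, kernel.pid_max should come up as 131072 on both servers.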




When reposting, please credit the source: "RHEL6.5/Centos6.5 error: fork: Cannot allocate memory", www.micoder.cc.
