sys_setsid() still deals with pid_t's from the global namespace. This means
that the "session > 1" check can't help a sub-namespace init: setsid() can't
succeed there because copy_process(CLONE_NEWPID) populates PIDTYPE_PGID/SID links.
Remove the usage of task_struct->pid and convert the code to use "struct pid".
This also simplifies and speeds up the code, and saves one find_pid().
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
--- PT/kernel/sys.c~1_setsid 2007-11-26 15:52:15.000000000 +0300
+++ PT/kernel/sys.c 2007-11-26 16:10:43.000000000 +0300
@@ -1045,35 +1045,33 @@ asmlinkage long sys_getsid(pid_t pid)
asmlinkage long sys_setsid(void)
{
struct task_struct *group_leader = current->group_leader;
- pid_t session;
+ struct pid *sid = task_pid(group_leader);
+ pid_t session = pid_vnr(sid);
int err = -EPERM;
write_lock_irq(&tasklist_lock);
-
/* Fail if I am already a session leader */
if (group_leader->signal->leader)
goto out;
- session = group_leader->pid;
- /* Fail if a process group id already exists that equals the
- * proposed session id.
+ /* Fail if a process group id already exists that equals the proposed
+ * session id.
*
- * Don't check if session id == 1 because kernel threads use this
- * session id and so the check will always fail and make it so
- * init cannot successfully call setsid.
+ * Don't check if session == 1 because kernel threads and CLONE_NEWPID
+ * tasks use this session id and so the check will always fail and make
+ * it so init cannot successfully call setsid.
*/
- if (session > 1 && find_task_by_pid_type_ns(PIDTYPE_PGID,
- session, &init_pid_ns))
+ if (session != 1 && pid_task(sid, PIDTYPE_PGID))
goto out;
group_leader->signal->leader = 1;
- __set_special_pids(session, session);
+ __set_special_pids(pid_nr(sid), pid_nr(sid));
spin_lock(&group_leader->sighand->siglock);
group_leader->signal->tty = NULL;
spin_unlock(&group_leader->sighand->siglock);
- err = ...
Pasha, please take a look. I don't think this is needed for 2.6.24; the bug (if I'm not mistaken yet again and it really exists) is very minor, but still... A question: we have task_struct *p = find_task_by_vpid(pid), so why don't we have a helper to get its pid_t? task_pid_vnr() doesn't return what is needed (which is already not great from a naming point of view ;), and we have to do task_pid_nr_ns(p, current->nsproxy->pid_ns); HORROR!!! :-( Oleg. -
We can do even better. We can remove the misguided code from copy_process(CLONE_NEWPID) that populates the PIDTYPE_PGID/SID links and generally does setsid by hand, and the code from kernel_init() that calls set_special_pid(), allowing us to remove the special case entirely. The set_special_pid() in kernel_init() and the special case check are actually a workaround for the fact that earlier we could not use 0 in the pid hash table. Now that we can use init_struct_pid directly we don't need the special case at all. Eric -
Yes, you are right. IIRC there was a patch from you, but I didn't follow the discussion, sorry, so I don't know what the verdict was. If we remove that "almost setsid" from copy_process(), we can remove the fat
This is different, perhaps we can keep this call. kernel_thread(kernel_init) attaches /sbin/init to init_struct_pid. Nothing bad, and a "good" init should do setsid() anyway. But who knows? Some special environment may expect that getpgrp() != 0. Not that I really disagree on this issue though. Oleg. -
Since session == pgrp == 0 is the historical start condition for /sbin/init, there is no problem from the session perspective; it is in fact better. The only case that might have cared was setting si_pid when sending signals, and it turns out it is both simple and necessary to handle that case across namespaces anyway. init starting with session == pgrp == 0 is historical Linux behavior. I consider the current 2.6 behavior a temporary aberration from the historical Linux behavior. sysvinit does call setsid. And nothing really bad will happen if someone forgets to call setsid in some obscure version of init. Plus, once we do this the code will be easier to maintain because we have removed one obscure special case. Eric -
Yes indeed. So we can remove this special case code as soon as copy_process() is changed. Oleg. -
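For readers following the thread from userspace: the rule that the PIDTYPE_PGID check in sys_setsid() enforces (setsid() fails with EPERM when a process group whose id equals the caller's pid already exists) can be observed with a small test program. The following is only a minimal sketch, assuming a POSIX system. The helper name setsid_result_in_child() is hypothetical and is not part of the patch or of any kernel interface.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Forks a child, optionally makes the child a process group leader with
 * setpgid(0, 0), then calls setsid() in the child.
 *
 * Returns:
 *    0  setsid() succeeded and the child became session and group leader
 *    1  setsid() failed with EPERM
 *    2  setsid() failed with some other errno
 *    3  setsid() succeeded but the resulting sid/pgrp were unexpected
 *  100  setpgid() failed in the child
 *   -1  fork()/waitpid() failed in the parent
 */
int setsid_result_in_child(int become_pgrp_leader)
{
    pid_t child = fork();

    if (child < 0)
        return -1;

    if (child == 0) {
        /* Create a process group whose id equals our own pid. */
        if (become_pgrp_leader && setpgid(0, 0) != 0)
            _exit(100);

        if (setsid() == (pid_t)-1)
            _exit(errno == EPERM ? 1 : 2);

        /* After a successful setsid(): sid == pgrp == pid. */
        _exit(getsid(0) == getpid() && getpgrp() == getpid() ? 0 : 3);
    }

    int status;
    if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
        return -1;

    return WEXITSTATUS(status);
}
```

A freshly forked child is not a group leader, so setsid_result_in_child(0) is expected to return 0. After setpgid(0, 0) a group with the child's pid exists, so setsid_result_in_child(1) is expected to return 1 (EPERM). This mirrors the situation described above for a sub-namespace init whose PGID/SID links are pre-populated by copy_process(CLONE_NEWPID).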
