[PATCH 1/3] fix setsid() for sub-namespace /sbin/init

Previous thread: [PATCH 2/3] teach set_special_pids() to use struct pid by Oleg Nesterov on Monday, November 26, 2007 - 7:25 am. (5 messages)

Next thread: [PATCH 1/3] [NET] phy/fixed.c: rework to not duplicate PHY layer functionality by Vitaly Bordug on Monday, November 26, 2007 - 7:29 am. (16 messages)
From: Oleg Nesterov
Date: Monday, November 26, 2007 - 7:25 am

sys_setsid() still deals with pid_t's from the global namespace. This means
that the "session > 1" check can't help a sub-namespace init: setsid() can't
succeed because copy_process(CLONE_NEWPID) populates the PIDTYPE_PGID/SID links.

Remove the usage of task_struct->pid and convert the code to use "struct pid".
This also simplifies and speeds up the code, and saves one find_pid() call.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
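
To make the failure described above concrete, here is a toy userspace model
(not kernel code). Everything prefixed with toy_ is hypothetical and exists
only for illustration. It assumes a pid carries one number per nested
namespace level, that pid_nr() reads the number at level 0, and that
pid_vnr() reads the number in the namespace the pid was created in. The
has_pgid_user flag stands in for pid_task(sid, PIDTYPE_PGID) finding a task.

```c
#define TOY_MAX_LEVEL 3

/* Toy stand-in for struct pid: one id per nested namespace level. */
struct toy_pid {
	int level;                       /* depth of the namespace that owns the pid */
	int numbers[TOY_MAX_LEVEL + 1];  /* numbers[0] is the global (init ns) id */
	int has_pgid_user;               /* nonzero if some task uses it as its pgrp */
};

/* Id as seen from the initial namespace (models pid_nr()). */
int toy_pid_nr(const struct toy_pid *pid)
{
	return pid->numbers[0];
}

/* Id as seen from the pid's own namespace (models pid_vnr()). */
int toy_pid_vnr(const struct toy_pid *pid)
{
	return pid->numbers[pid->level];
}

/* Old check: global id, "session > 1". Returns 0, or -1 standing in for -EPERM. */
int toy_setsid_old(const struct toy_pid *sid)
{
	int session = toy_pid_nr(sid);

	if (session > 1 && sid->has_pgid_user)
		return -1;
	return 0;
}

/* New check: namespace-local id, "session != 1". Same return convention. */
int toy_setsid_new(const struct toy_pid *sid)
{
	int session = toy_pid_vnr(sid);

	if (session != 1 && sid->has_pgid_user)
		return -1;
	return 0;
}
```

For a container init with numbers = { 4321, 1 }, level = 1 and
has_pgid_user = 1, toy_setsid_old() fails because the global id 4321 passes
the "> 1" test, while toy_setsid_new() sees the local id 1 and skips the
lookup. An ordinary process group leader is rejected by both variants.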

--- PT/kernel/sys.c~1_setsid	2007-11-26 15:52:15.000000000 +0300
+++ PT/kernel/sys.c	2007-11-26 16:10:43.000000000 +0300
@@ -1045,35 +1045,33 @@ asmlinkage long sys_getsid(pid_t pid)
 asmlinkage long sys_setsid(void)
 {
 	struct task_struct *group_leader = current->group_leader;
-	pid_t session;
+	struct pid *sid = task_pid(group_leader);
+	pid_t session = pid_vnr(sid);
 	int err = -EPERM;
 
 	write_lock_irq(&tasklist_lock);
-
 	/* Fail if I am already a session leader */
 	if (group_leader->signal->leader)
 		goto out;
 
-	session = group_leader->pid;
-	/* Fail if a process group id already exists that equals the
-	 * proposed session id.
+	/* Fail if a process group id already exists that equals the proposed
+	 * session id.
 	 *
-	 * Don't check if session id == 1 because kernel threads use this
-	 * session id and so the check will always fail and make it so
-	 * init cannot successfully call setsid.
+	 * Don't check if session == 1 because kernel threads and CLONE_NEWPID
+	 * tasks use this session id and so the check will always fail and make
+	 * it so init cannot successfully call setsid.
 	 */
-	if (session > 1 && find_task_by_pid_type_ns(PIDTYPE_PGID,
-				session, &init_pid_ns))
+	if (session != 1 && pid_task(sid, PIDTYPE_PGID))
 		goto out;
 
 	group_leader->signal->leader = 1;
-	__set_special_pids(session, session);
+	__set_special_pids(pid_nr(sid), pid_nr(sid));
 
 	spin_lock(&group_leader->sighand->siglock);
 	group_leader->signal->tty = NULL;
 	spin_unlock(&group_leader->sighand->siglock);
 
-	err = ...
From: Oleg Nesterov
Date: Monday, November 26, 2007 - 7:43 am

Pasha, please take a look. I don't think this is needed for 2.6.24; the bug (if
I haven't made a mistake again and it really exists) is very minor, but still...

A question. Say we have task_struct *p = find_task_by_vpid(pid); why don't we
have a helper to get its pid_t? task_pid_vnr() returns the wrong thing (which is
already not great from a naming standpoint ;), and we have to do

	task_pid_nr_ns(p, current->nsproxy->pid_ns);

HORROR!!! :-(

Oleg.
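
The helper question in the message above can be illustrated with a small
userspace sketch. This is a toy model, not kernel code: every toy_ name is
hypothetical, namespaces are reduced to a single chain of nesting levels, and
toy_task_pid_seen_by() is only a guess at the shape of the missing helper,
not an existing kernel function.

```c
#define TOY_MAX_LEVEL 3

/* Toy stand-in for struct pid: one id per nested namespace level. */
struct toy_pid {
	int level;                       /* depth of the owning namespace */
	int numbers[TOY_MAX_LEVEL + 1];  /* numbers[0] is the global id */
};

/* Toy stand-in for task_struct; the task lives at level pid.level. */
struct toy_task {
	struct toy_pid pid;
};

/* Models task_pid_vnr() as described above: the id in the task's OWN namespace. */
int toy_task_pid_vnr(const struct toy_task *p)
{
	return p->pid.numbers[p->pid.level];
}

/* Models task_pid_nr_ns(p, ns): the id seen from ns_level, 0 if not visible. */
int toy_task_pid_nr_ns(const struct toy_task *p, int ns_level)
{
	if (ns_level < 0 || ns_level > p->pid.level)
		return 0;
	return p->pid.numbers[ns_level];
}

/* Hypothetical helper: the id of p as seen by the observing task. */
int toy_task_pid_seen_by(const struct toy_task *p, const struct toy_task *observer)
{
	return toy_task_pid_nr_ns(p, observer->pid.level);
}
```

With an observer in the initial namespace and a task p whose numbers are
{ 4321, 1 } at level 1, toy_task_pid_vnr(p) yields 1, whereas the observer
needs 4321, which is what toy_task_pid_seen_by(p, observer) returns.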

-

From: Pavel Emelyanov
Date: Monday, November 26, 2007 - 8:00 am

From: Eric W. Biederman
Date: Monday, November 26, 2007 - 12:16 pm

We can do even better.  We can remove the misguided code from
copy_process(CLONE_NEWPID) that populates the PIDTYPE_PGID/SID links
and generally does setsid by hand, and the code from kernel_init
that calls set_special_pids(), allowing us to remove the special case
entirely.

The set_special_pids() call in kernel_init() and the special case check
are actually a workaround for the fact that earlier we could not
use 0 in the pid hash table.  Now that we can use init_struct_pid
directly we don't need the special case at all.

Eric
-

From: Oleg Nesterov
Date: Monday, November 26, 2007 - 1:11 pm

Yes, you are right. IIRC there was a patch from you, but I didn't follow the
discussion, sorry, so I don't know what the verdict was.

If we remove that "almost setsid" from copy_process(), we can remove the fat

This is different, perhaps we can keep this call. kernel_thread(kernel_init)
attaches /sbin/init to init_struct_pid. Nothing bad, and a "good" init should
do setsid() anyway. But who knows? Some special environment may expect that
getpgrp() != 0. Not that I really disagree on this issue though.

Oleg.

-

From: Eric W. Biederman
Date: Monday, November 26, 2007 - 2:40 pm

Since session == pgrp == 0 is the historical start condition for /sbin/init there
is no problem from the session perspective; in fact it is better.

The only case that might have cared was setting si_pid when sending signals,
and it turns out it is both simple and necessary to handle that case across
namespaces anyway.


init starting with session == pgrp == 0 is historical Linux behavior.  I consider
the current 2.6 behavior a temporary aberration from the historical Linux behavior.

sysvinit does call setsid.  And nothing really bad will happen if someone forgets
to call setsid, in some obscure version of init.

Plus once we do this the code will be easier to maintain because we have
removed one obscure special case.

Eric
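
For reference, the userspace side of what an init is expected to do can be
sketched as below. This is a minimal illustration that assumes a POSIX system
and uses only fork(), setsid(), getsid(), getpgrp() and waitpid(). The function
name demo_setsid_in_child() is made up for this example. The second setsid()
call exercises the "already a session leader" failure path.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child, have it become a session leader, and verify the result.
 * Returns 0 if everything behaved as expected, a small positive code
 * naming the failed step, or -1 if fork()/waitpid() itself failed.
 */
int demo_setsid_in_child(void)
{
	pid_t child = fork();
	int status = 0;

	if (child < 0)
		return -1;

	if (child == 0) {
		/* A fresh child is not a process group leader, so this works. */
		if (setsid() == (pid_t)-1)
			_exit(1);
		/* Session id and process group id now both equal our pid. */
		if (getsid(0) != getpid())
			_exit(2);
		if (getpgrp() != getpid())
			_exit(3);
		/* A second call must fail with EPERM. */
		errno = 0;
		if (setsid() != (pid_t)-1 || errno != EPERM)
			_exit(4);
		_exit(0);
	}

	if (waitpid(child, &status, 0) != child)
		return -1;
	if (!WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

A return value of 0 means the child became leader of a new session and
process group and that the repeated setsid() was refused.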
-

From: Oleg Nesterov
Date: Monday, November 26, 2007 - 3:46 pm

Yes indeed. So we can remove this special case code as soon as copy_process()
is changed.

Oleg.

-
