On Monday 16 April 2007 18:21, Christoph Hellwig wrote:Yes, the vfs uses or passes through a NULL nameidata or vfsmount pointer and no intent information in several places: (1) vfs_create() is called with a NULL nameidata in two places, (2) lookup_one_len() and lookup_one_len_kern() pass a NULL nameidata to permission(), d_op->d_revalidate(), and i_op->lookup(), (3) file_permission() passes a NULL nameidata to permission(). (4) permission() is directly called with NULL nameidata in some places, and In general it is incorrect to use NULL nameidata or vfsmounts because then the vfsmount->mnt_flags cannot be checked, and no intent information is available. Intents are an optional optimization for remote filesystems; we can simply ignore them for local filesystems. There are some cases where we are passing NULL which are real bugs, but there are also other cases in which the operations are not mount related: for example, filesystems like pipefs have the MS_NOUSER flag set which indicates that they cannot be mounted at all. On filesystems like sysfs, files are created and removed whether or not that filesystem is mounted. The places where we should pass a proper nameidata / vfsmount but don't should be fixed. Those are long-standing issues in the vfs though, and at least some are not easy to correct. AppArmor is affected by some of the NULL vfsmount passing, and we found a few more problems when looking into this more closely, but most of that NULL passing doesn't affect AppArmor: * In some places, vfs functions are called without nameidata / vfsmount, followed by calling lsm hooks with a vfsmount. In those cases, the vfs functions cannot check the vfsmount flags, but the lsm hooks will still do the appropriate checks. * In the AppArmor security model, directory searching is always allowed, and the access check are done once the leaf dentry + vfsmount has been looked up. For example, granting write access to /var/tmp/foo in a profile is sufficient to allow that access; search access to /var and /var/tmp does not need to be specified. (The regular inode permission checks are of course still performed.) * For filesystems like nfsd, the concept of pathname based access control doesn't make sense because of properties of the protocol: there is no reliable mapping from an nfs file handle to a particular file; aliases to the same file cannot be distinguished. (The AppArmor techdoc [1] describes this in more detail.) Not allowing to confine nfsd by AppArmor doesn't hurt much: nfsd provides its own export access control mechanisms. Also, nfsd cannot really be confined in a secure way because of how it is implemented in the kernel. Gfs2 may have similar properties; I didn't actually look into the protocol. So I propose to separate the vfs NULL nameidata / vfsmount passing discussion from the AppArmor discussion. I have no problem working on the vfs fixes -- we could some nice cleanups there -- but that really is independent of AppArmor. We almost have the revised AppArmor patches ready. We'll include all the fixes that are truly necessary, and everything beyond that will be posted separately. Splitting up nameidata helps somewhat here. In cases where we don't have intent information available we can zero out the intent flags, and so in the worst case, we won't get the intent optimizations. Thanks, Andreas -
| Greg KH | Og dreams of kernels |
| Jens Axboe | [PATCH 31/33] Fusion: sg chaining support |
| Arnd Bergmann | Re: finding your own dead "CONFIG_" variables |
| Mark Brown | [PATCH 2/2] Subject: natsemi: Allow users to disable workaround for DspCfg reset |
| Tony Breeds | [LGUEST] Look in object dir for .config |
git: | |
| Brian Downing |
