Ulrich Drepper noted a difference between the Linux connect(2) man page and the POSIX specification. The former states, "connectionless sockets may dissolve the association by connecting to an address with the sa_family member of sockaddr set to AF_UNSPEC." The latter reads, "if address is a null address for the protocol, the socket's peer address shall be reset." Ulrich explained that he preferred the description in the Linux man page, but that the Linux kernel actually seems to follow the POSIX specification, and asked, "is this functionality which got lost over time? Or is the man page wrong and this never was the case? Is this a worthwhile change?"
David Miller suggested, "the whole AF_UNSPEC thing I'm almost certain comes from BSD, which has behaved that way for centuries." Alan Cox replied, "we got it from the 1003.4g draft socket specification if I remember rightly," adding, "it's entirely plausible that [the 1003.4g draft socket specification] got it from 4BSD." Ulrich concluded, "I guess I'll just go ahead and file a problem report with the spec. Maybe the Unix vendors will test their implementations and provide feedback."

From: Ulrich Drepper <drepper@...>
Subject: drop association of connection-less socket
Date: Sep 19, 3:28 am 2007

The Linux man page for connect(2) currently says:

  Connectionless sockets may dissolve the association by connecting to an address with the sa_family member of sockaddr set to AF_UNSPEC.

No such wording is in the POSIX definition which only says

  If address is a null address for the protocol, the socket’s peer address shall be reset.

This is not the same but seems to be what Linux implements.

The problem is that I tried to reuse a socket which has been associated with an IPv6 address to later connect to an IPv4 address. This is part of the getaddrinfo implementation and an effort to make it more efficient. strace's output looks like this:

  connect(3, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "2001:11b8:1:0:207:e94f:ee7c:4b72", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable)
  connect(3, {sa_family=AF_UNSPEC, sa_data="\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 28) = 0
  connect(3, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("192.168.1.72")}, 16) = 0

I.e., despite what the man page says, the second connect only reset the address, as required by the POSIX spec. It did not reset the address family of the socket.

What I ideally would like to see is what the Linux man page says. I.e., if the .sa_family field is AF_UNSPEC all, the address and address family, is reset. Otherwise only the address association itself is reset.

Is this functionality which got lost over time? Or is the man page wrong and this never was the case? Is this a worthwhile change?

--
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖

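The AF_UNSPEC form quoted from the man page can be sketched in C roughly as follows. This is an illustrative sketch only, not code from the thread: the helper names `has_peer`, `dgram_disconnect` and `connect_loopback_udp` are hypothetical, and the expectation that `getpeername()` reports ENOTCONN after the disconnect is based on Linux behaviour for UDP sockets.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd currently has a peer address,
 * 0 if getpeername() reports ENOTCONN, and -1 on any other error. */
static int has_peer(int fd)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);

    if (getpeername(fd, (struct sockaddr *)&ss, &len) == 0)
        return 1;
    return errno == ENOTCONN ? 0 : -1;
}

/* Hypothetical helper: dissolve the association of a datagram socket
 * by connecting to an address whose sa_family is AF_UNSPEC.
 * Returns whatever connect() returns. */
static int dgram_disconnect(int fd)
{
    struct sockaddr sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_family = AF_UNSPEC;
    return connect(fd, &sa, sizeof(sa));
}

/* Hypothetical helper: associate a UDP socket with 127.0.0.1:port.
 * No packet is sent; connect() on a datagram socket only records the peer. */
static int connect_loopback_udp(int fd, unsigned short port)
{
    struct sockaddr_in peer;

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port = htons(port);
    peer.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    return connect(fd, (struct sockaddr *)&peer, sizeof(peer));
}
```

A caller would create a SOCK_DGRAM socket, call `connect_loopback_udp()`, then `dgram_disconnect()`, and compare `has_peer()` before and after. Whether the socket's address family can also change afterwards, which is what the message above asks about, is not something this sketch tests.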
From: Ulrich Drepper <drepper@...>
Subject: follow-up: discrepancy with POSIX
Date: Sep 19, 11:21 am 2007

As a follow up to my question from yesterday on the netdev list what I think is a real problem. Either in the kernel or in the POSIX spec.

The POSIX spec currently says this about SOCK_DGRAM sockets:

  If address is a null address for the protocol, the socket’s peer address shall be reset.

The term "null address" is not further specified but it will usually be read to allow the following scenario to work out:

  fd = socket(AF_INET6, ...)

  connect(fd, ...some IPv6 address...)

  struct sockaddr_in6 sin6 = { .sin6_family = AF_INET6 };
  connect(fd, &sin6, sizeof (sin6));

  connect(fd, ...some new IPv6 address...)

This does not work on Linux at the moment. The socket remains connected to the old IPv6 address but the second connect() call does succeed (this does not sound OK).

What does work is if the connect call to disassociate the address uses AF_UNSPEC instead of AF_INET6. The question is: do people here think this is a problem in the POSIX spec? Binding to :: and 0.0.0.0 isn't possible, so maybe the Linux implementation should allow this?

If you think the POSIX spec is wrong (and can point to other implementations doing the same as Linux) let me know and I'll work on getting the spec changed.

From: David Miller <davem@...>
Subject: Re: follow-up: discrepancy with POSIX
Date: Sep 19, 12:15 pm 2007

From: Ulrich Drepper <drepper@redhat.com>
Date: Wed, 19 Sep 2007 08:21:47 -0700

> If you think the POSIX spec is wrong (and can point to other
> implementations doing the same as Linux) let me know and I'll work on
> getting the spec changed.

The whole AF_UNSPEC thing I'm almost certain comes from BSD, which has behaved that way for centuries.

Someone needs to cull through Stevens' Volume 2 to verify this, I'm too busy at the moment to do so myself.

From: Alan Cox <alan@...>
Subject: Re: follow-up: discrepancy with POSIX
Date: Sep 19, 12:39 pm 2007

On Wed, 19 Sep 2007 09:15:10 -0700 (PDT) David Miller <davem@davemloft.net> wrote:

> From: Ulrich Drepper <drepper@redhat.com>
> Date: Wed, 19 Sep 2007 08:21:47 -0700
>
> > If you think the POSIX spec is wrong (and can point to other
> > implementations doing the same as Linux) let me know and I'll work on
> > getting the spec changed.
>
> The whole AF_UNSPEC thing I'm almost certain comes from BSD, which has
> behaved that way for centuries.

We got it from the 1003.4g draft socket specification if I remember rightly. It's entirely plausible that got it from 4BSD.

Alan

From: Andi Kleen <andi@...>
Subject: Re: follow-up: discrepancy with POSIX
Date: Sep 19, 11:47 am 2007

Ulrich Drepper <drepper@redhat.com> writes:

>
> fd = socket(AF_INET6, ...)
>
> connect(fd, ...some IPv6 address...)
>
> struct sockaddr_in6 sin6 = { .sin6_family = AF_INET6 };
> connect(fd, &sin6, sizeof (sin6));

The standard way to undo connect is to use AF_UNSPEC. Code to handle that for dgram sockets is there. It's the same code for v4 and v6.

-Andi

From: Ulrich Drepper <drepper@...>
Subject: Re: follow-up: discrepancy with POSIX
Date: Sep 19, 12:49 pm 2007

Andi Kleen wrote:
> The standard way to undo connect is to use AF_UNSPEC. Code to handle
> that for dgram sockets is there. It's the same code for v4 and v6.

I quoted the standard and it does not say anything about AF_UNSPEC. So you cannot simply make such broad statements.

I also don't say that this behavior should be removed. It's certainly useful, very much so in fact.

But the spec calls for a "null address" to be used and that's in my understanding something different from using AF_UNSPEC.

I looked through Stevens' TCP/IP Illustrated Vol 2 and it seems not to mention resetting the address at all. The POSIX spec certainly got this text from .1g.

I cannot test it on other systems. If somebody has access to some certified systems (and maybe others), write a bit of code which creates a DGRAM socket, connect to one address, call connect with a "null address", then connect to another address (which likely has to use a different interface since otherwise the connect will just succeed, it seems).

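The probe requested above could look roughly like the following C sketch. It is not code from the thread and it makes several assumptions. It uses IPv4 on the loopback interface rather than IPv6 so that it needs no external network. It reads "null address for the protocol" as a `sockaddr_in` with `sin_family` set to AF_INET and everything else zero. It leaves out the final connect to another address. The names `probe_null_address_v4` and `enum probe_result` are hypothetical. It only reports what the system does and does not assume any particular outcome.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/* Hypothetical result codes for the probe below. */
enum probe_result {
    PROBE_ERROR = -1,         /* setup failed or unexpected errno          */
    PROBE_PEER_RESET = 0,     /* getpeername() reported ENOTCONN           */
    PROBE_PEER_UNCHANGED = 1, /* socket still associated with first peer   */
    PROBE_PEER_CHANGED = 2    /* socket associated with some other address */
};

/* Connect a UDP socket to 127.0.0.1:port, then connect it to an
 * all-zero AF_INET address and report what happened to the peer. */
static enum probe_result probe_null_address_v4(unsigned short port)
{
    struct sockaddr_in first, null_addr, now;
    socklen_t len = sizeof(now);
    enum probe_result result;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);

    if (fd < 0)
        return PROBE_ERROR;

    /* Step 1: associate the socket with a first peer. */
    memset(&first, 0, sizeof(first));
    first.sin_family = AF_INET;
    first.sin_port = htons(port);
    first.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (connect(fd, (struct sockaddr *)&first, sizeof(first)) != 0) {
        close(fd);
        return PROBE_ERROR;
    }

    /* Step 2: connect to the "null address" of the socket's own family.
     * The return value is deliberately ignored; only the resulting
     * peer state is inspected. */
    memset(&null_addr, 0, sizeof(null_addr));
    null_addr.sin_family = AF_INET;
    (void)connect(fd, (struct sockaddr *)&null_addr, sizeof(null_addr));

    /* Step 3: ask the kernel which peer, if any, is now recorded. */
    memset(&now, 0, sizeof(now));
    if (getpeername(fd, (struct sockaddr *)&now, &len) != 0)
        result = (errno == ENOTCONN) ? PROBE_PEER_RESET : PROBE_ERROR;
    else if (now.sin_port == first.sin_port &&
             now.sin_addr.s_addr == first.sin_addr.s_addr)
        result = PROBE_PEER_UNCHANGED;
    else
        result = PROBE_PEER_CHANGED;

    close(fd);
    return result;
}
```

An IPv6 variant would follow the same steps with `struct sockaddr_in6` and `in6addr_loopback`. Results from different systems would be the kind of feedback asked for here.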
From: Andi Kleen <andi@...>
Subject: Re: follow-up: discrepancy with POSIX
Date: Sep 19, 1:26 pm 2007

On Wed, Sep 19, 2007 at 09:49:09AM -0700, Ulrich Drepper wrote:
> Andi Kleen wrote:
> > The standard way to undo connect is to use AF_UNSPEC. Code to handle
> > that for dgram sockets is there. It's the same code for v4 and v6.
>
> I quoted the standard and it does not say anything about AF_UNSPEC. So
> you cannot simply make such broad statements.

Ok "standard" was perhaps a poor choice of words. AF_UNSPEC was introduced long ago by Alan based on some early POSIX draft iirc.

Also incidentally it's a null address:

  include/linux/socket.h:#define AF_UNSPEC 0

> But the spec calls for a "null address" to be used and that's in my
> understanding something different from using AF_UNSPEC.

memset(&sockaddr, 0, sizeof(sockaddr)) should give you AF_UNSPEC

-Andi

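The two readings of "null address" being argued over can be contrasted in a few lines of C. This is only an illustrative sketch with hypothetical helper names. It assumes AF_UNSPEC is defined as 0, as in the Linux header quoted above.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: does this address carry the AF_UNSPEC family? */
static int family_is_unspec(const struct sockaddr *sa)
{
    return sa->sa_family == AF_UNSPEC;
}

/* Reading 1: a fully zero-filled sockaddr. Where AF_UNSPEC is 0,
 * the zeroed family field is AF_UNSPEC. */
static int zeroed_is_unspec(void)
{
    struct sockaddr_storage ss;

    memset(&ss, 0, sizeof(ss));
    return family_is_unspec((const struct sockaddr *)&ss);
}

/* Reading 2: a "null address for the protocol", i.e. the socket's own
 * family (AF_INET6 here) with a zero address and port. */
static int null_inet6_is_unspec(void)
{
    struct sockaddr_in6 sin6;

    memset(&sin6, 0, sizeof(sin6));
    sin6.sin6_family = AF_INET6;
    return family_is_unspec((const struct sockaddr *)&sin6);
}
```

Under that assumption, `zeroed_is_unspec()` returns 1 and `null_inet6_is_unspec()` returns 0. So the two readings hand `connect()` addresses with different families.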
From: Ulrich Drepper <drepper@...>
Subject: Re: follow-up: discrepancy with POSIX
Date: Sep 19, 1:46 pm 2007

Andi Kleen wrote:
>> But the spec calls for a "null address" to be used and that's in my
>> understanding something different from using AF_UNSPEC.
>
> memset(&sockaddr, 0, sizeof(sockaddr)) should give you AF_UNSPEC

But the spec calls for <quote>null address for the protocol</quote>.

That means the family for the null address is the same as the family of the socket.

From: Andi Kleen <andi@...>
Subject: Re: follow-up: discrepancy with POSIX
Date: Sep 19, 1:57 pm 2007

On Wed, Sep 19, 2007 at 10:46:54AM -0700, Ulrich Drepper wrote:
> Andi Kleen wrote:
> >> But the spec calls for a "null address" to be used and that's in my
> >> understanding something different from using AF_UNSPEC.
> >
> > memset(&sockaddr, 0, sizeof(sockaddr)) should give you AF_UNSPEC
>
> But the spec calls for <quote>null address for the protocol</quote>.
>
> That means the family for the null address is the same as the family of
> the socket.

Spec doesn't match traditional behaviour then.

IPv4 0.0.0.0 is traditionally a synonym for old style all broadcast (255.255.255.255) on UDP/RAW and it's certainly possible to connect() to that.

-Andi

From: Ulrich Drepper <drepper@...>
Subject: Re: follow-up: discrepancy with POSIX
Date: Sep 19, 2:02 pm 2007

Andi Kleen wrote:
> Spec doesn't match traditional behaviour then.

Well, determining whether that's the case is part of this exercise.

> IPv4 0.0.0.0 is
> traditionally a synonym for old style all broadcast (255.255.255.255)
> on UDP/RAW and it's certainly possible to connect() to that.

Where do you get this from? And where is this implemented? I don't doubt it but I have to convince people to change the standard and possibly introduce incompatibility.

From: Andi Kleen <andi@...>
Subject: Re: follow-up: discrepancy with POSIX
Date: Sep 19, 2:30 pm 2007

On Wed, Sep 19, 2007 at 11:02:00AM -0700, Ulrich Drepper wrote:
> > on UDP/RAW and it's certainly possible to connect() to that.
>
> Where do you get this from? And where is this implemented? I don't

Sorry it's actually loopback, not broadcast as implemented in Linux.

In Linux it's implemented in ip_route_output_slow(). Essentially converted to 127.0.0.1

I think it's traditional BSD behaviour but couldn't find it on a quick look in FreeBSD source (but haven't looked very intensively)

Admittedly port 0 is somewhat dodgy for UDP too, but at least in RAW context it might be valid.

-Andi

From: David Miller <davem@...>
Subject: Re: follow-up: discrepancy with POSIX
Date: Sep 19, 12:52 pm 2007

From: Ulrich Drepper <drepper@redhat.com>
Date: Wed, 19 Sep 2007 09:49:09 -0700

> But the spec calls for a "null address" to be used and that's in my
> understanding something different from using AF_UNSPEC.

It just occurred to me that AF_UNSPEC might be used simply because "all zeros" might be a valid real bindable address for some address family. And using AF_UNSPEC avoids that problem entirely.

From: Ulrich Drepper <drepper@...>
Subject: Re: follow-up: discrepancy with POSIX
Date: Sep 19, 1:04 pm 2007

David Miller wrote:
> It just occurred to me that AF_UNSPEC might be used simply
> because "all zeros" might be a valid real bindable address
> for some address family. And using AF_UNSPEC avoids that
> problem entirely.

Yes, but for IPv4/6 it's not an issue. Some implementations might handle all-zeros and the spec _currently_ calls for it. In this case an alignment would be good.

I guess I'll just go ahead and file a problem report with the spec. Maybe the Unix vendors will test their implementations and provide feedback.

Shhh....
"We've switched their man pages with an MSDN subscription. Let's see if they notice."
(just kidding!)
--
Program Intellivision and play Space Patrol!