Issue 1062014: fix for 764437 AF_UNIX socket special linux socket names (original) (raw)

Created on 2004-11-07 18:18 by irmen, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
socketmodule.patch irmen,2004-11-07 18:18 patch for src/Modules/socketmodule.c
linux_socket_abstract_namespace.diff arigo,2006-04-12 13:19 Another patch (try #2)
socket.diff nnorwitz,2006-04-14 05:01 nn v1
socket.diff nnorwitz,2006-04-19 06:29 nn v2
Messages (15)
msg47237 - (view) Author: Irmen de Jong (irmen) (Python triager) Date: 2004-11-07 18:18
The patch addresses two things: 1. the socket name (obtained using getsockname() or getpeername() ) will now be correct in case of the special linux socket names (starting with 0 byte) 2. the socket module now has a new constant: UNIX_PATH_MAX , that can be used to build correctly-sized path names.
msg47238 - (view) Author: Johannes Gijsbers (jlgijsbers) * (Python triager) Date: 2004-11-07 20:11
Logged In: YES user_id=469548 As you noted on IRC: this needs a new regression test.
msg47239 - (view) Author: Irmen de Jong (irmen) (Python triager) Date: 2004-11-08 11:05
Logged In: YES user_id=129426 Not only does it need a new regression test, but if it is accepted as-is, the socket module docs need to be updated too (mention the new constant for use with the special naming case)
msg47240 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-04-12 12:57
Logged In: YES user_id=4771 I have made an independent patch for this, based on another interpretation of "man 7 unix": all the names in the Linux abstract namespace don't have to be UNIX_PATH_MAX characters long; instead, their length is remembered by the system and available as 'addrlen', which is correctly passed around in socketmodule.c. The attached patch allows these special names of any length to be set and read, and the length is preserved. It does not expose UNIX_PATH_MAX because it's not an essential piece of information. I've written a small test too. I don't think the docs need updating for this patch, beyond a mention in the NEWS file.
msg47241 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-04-12 13:19
Logged In: YES user_id=4771 Added "#ifdef linux" in my patch, following the idea from the original patch.
msg47242 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-04-13 04:09
Logged In: YES user_id=33168 Armin, do you want someone to review this? Is it small/straightforward enough that it could be checked in? Just trying to figure out if there's anything I should do.
msg47243 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-04-13 12:10
Logged In: YES user_id=4771 Hi Neal. Although the patch is small I'd appreciate a quick review, yes. Also, Irmen, it would be nice to know if my solution -- trying to preserve the length of the addresses -- suits your original use case as well.
msg47244 - (view) Author: Irmen de Jong (irmen) (Python triager) Date: 2006-04-13 14:31
Logged In: YES user_id=129426 Armin, I don't have access to a Linux machine anymore so I can't really comment on this issue anymore, sorry. Also this patch was not made for a specific use case I had, it was a patch for an open bug filed by someone else (bug 764437). Previous discussion can be found there.
msg47245 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-04-14 05:01
Logged In: YES user_id=33168 Wow! Now I see why you wanted review. Modules/socketmodule.c: In the first #ifdef linux section (968+), why the calc for addrlen? Isn't the len just sizeof(a->sun_path) and that can be passed to FromStringAndSize()? In the second #ifdef linux section (1098+), should if (len > sizeof ...) be >= like the other check? It's probably easier to show what I mean in code. Attached is another patch. I didn't test it. There's a big comment that we should address in either patch about a negative length issue.
msg47246 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-04-14 08:33
Logged In: YES user_id=4771 You're missing the point of my patch. It is to preserve the originally specified length of the name. The names '\x00abc', '\x00abc\x00' and '\x00abc\x00\x00' are all different for the Linux IP stack, as far as I can tell. In operations like bind() we specify the length of the name by setting addrlen to the length of the header plus the number of bytes actually used in the sun_path array. The kernel code remembers this length, and passes it back in &addrlen to functions like getsockaddr(). That's why I compute the number of bytes that are really meaningful as the addrlen minus the headers, instead of just UNIX_PATH_MAX. About the second #ifdef, I really meant the > to be > and not >=, because it is acceptable to pass a name containing exactly UNIX_PATH_MAX bytes -- the abstract namespace names don't have to be followed by a terminating '\x00'. By contrast, the regular names must not be longer than UNIX_PATH_MAX-1 characters to allow for the extra '\x00'. In some sense, the Linux abstract namespace is rather strange from the point of view of C, but not if you think about it as you would think about Python strings. Indeed, apart from the requirement that they start with '\x00', these names have an intrinsic length which is not related to any embedded '\x00'. It is stored together with the name in some kernel structures, but outside the sun_path array of bytes. But enough theory: the patch you propose don't pass its own new test on my machine because of this :-) Does it pass on yours? I'm not too concerned about addrlen ending up shorter than the size of the headers. That would be a Linux kernel bug.
msg47247 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-04-18 05:42
Logged In: YES user_id=33168 My man page (man 7 unix) says: sun_family always contains AF_UNIX. sun_path contains the zero-terminated pathname of the socket in the file system. If sun_path starts with a zero byte it refers to the abstract namespace maintained by the Unix protocol module. The socket's address in this namespace is given by the rest of the bytes in sun_path. Note that names in the abstract namespace are not zero-terminated. I read that (rest of the bytes in sun_path) to mean that if the first byte is NUL, the address is all the remaining bytes in sun_path (ie, sun_path is treated as binary). Also since the abstract namespace is not NUL-terminated. I don't know if the man page is correct. Do you understand the manpage the way I do?
msg47248 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-04-18 11:52
Logged In: YES user_id=4771 The man page is indeed confusing. I was relying on small experiments. On my Linux box at least, the address does not have to occupy the whole of the sun_path array. You can really do the following: set sun_path to '\x00!\x00\x00\x00\x00\x00\x00...' then call bind() with an addrlen that covers only the first two characters '\x00!', and then call bind() on another socket with the same sun_path but an addrlen that covers the first three characters '\x00!\x00'. The two addresses don't conflict and are regarded as different. Conversely, if I try to bind() two sockets with the same addrlen and a sun_path that differs after the number of bytes included in the addrlen, I get an "Address already in use", thus showing that addrlen is really meaningful. Another hint is that if you let the system pick a default local address for your socket, as e.g. is done by s=socket(AF_UNIX); s.connect(...), then getsockname() will return a name that looks like '\x00000003'. More precisely, the addrlen is much shorter than the maximum that would correspond to the whole sun_path array. A more practical example: it is bogus to ignore the addrlen and always return the whole sun_path array, because if we did that, then the following kind of code would fail: >>> s1 = socket(AF_UNIX); s2 = socket(AF_UNIX) >>> s1.bind('\x00hello') >>> s2.connect(s1.getsockname()) because getsockname() would return a string that is 108 characters long, and the connect would fail. (Just try it out! It gives 'Connection refused'.)
msg47249 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-04-19 06:29
Logged In: YES user_id=33168 I guess I really need to test this on my box. I don't know if the man page is correct or if I'm reading it correct. BTW what linux are you running (kernel version/distro)? I'm 2.6.8-gentoo-r4.
msg47250 - (view) Author: Neal Norwitz (nnorwitz) * (Python committer) Date: 2006-04-19 06:33
Logged In: YES user_id=33168 Damn, I wasn't finished. I uploaded a new patch. The only difference is that I added 2 test cases. I ran this under valgrind and there were no errors. I was concerned about the boundaries, particularly given how poor the man page is. Are anonymous namespaces supported on other platforms? I'm fine with this going in, but it needs an update to Misc/NEWS and Doc/lib/libsocket.tex. Hmmm, there's nothing remotely close in the doc currently about AF_UNIX. I'll leave it to your best judgment Armin.
msg47251 - (view) Author: Armin Rigo (arigo) * (Python committer) Date: 2006-04-19 11:51
Logged In: YES user_id=4771 Checked in as r45560 with a Misc/NEWS entry. No update to the docs (they basically refer to the platform's socket documentation for this kind of details). I don't know if other platforms support something similar; the man page documents this extension as Linux-specific. I wouldn't worry about it unless somebody shows up and points to another platform...
History
Date User Action Args
2022-04-11 14:56:08 admin set github: 41144
2004-11-07 18🔞12 irmen create