Saturday, August 28, 2010

Two pieces of code someone really ought to write

Gripe mode on:

If you think about it, sockets are the elephant in the unix living room. Everything in unix is supposed to be a file. "Real" files are, of course, files. Remote files are files. Devices are "files". And with the /proc filesystem, even parts of the kernel are files. Everything is a file.

Except sockets.

Sockets aren't files, but they really should be, and it mystifies me that no one has written a FUSE file system that presents sockets as files. It seems like a no-brainer. Instead of having to call socket() and bind() and connect() and fill in sockaddr structures, I should just be able to call open("/fuse/sockets/", "r+") and get a descriptor for a TCP socket to port 80. It should be straightforward to implement this on top of FUSE.
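To make the idea concrete, here is a minimal Python 3 sketch of the path-to-connection mapping that such a file system would have to perform internally. It is not a FUSE file system, and the path layout (tcp/<host>/<port>) and the names parse_socket_path and open_socket_path are invented for illustration only.

```python
import socket


def parse_socket_path(path):
    """Split a hypothetical 'tcp/<host>/<port>' path into (host, port)."""
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "tcp":
        raise ValueError("expected tcp/<host>/<port>, got %r" % path)
    return parts[1], int(parts[2])


def open_socket_path(path, timeout=10.0):
    """Connect to the host and port named by path; return a file-like object."""
    host, port = parse_socket_path(path)
    sock = socket.create_connection((host, port), timeout=timeout)
    stream = sock.makefile("rwb")  # buffered binary reader/writer
    sock.close()  # the file object holds its own reference to the connection
    return stream
```

A caller would then treat the result like any other file: write a request, call flush() because the stream is buffered, and read the reply. The point of doing this in FUSE rather than in a library is that programs would get the same behavior from a plain open() without linking against anything.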

Item the second:

Apache's mod_proxy allows Apache to act as a forwarding proxy to other machines, or even the same machine via a TCP connection, but NOT to a local unix-domain socket.

So suppose that I want to have Apache as my front-end to handle security and serve static content, and I want it to proxy interactive pages to a custom standalone server app. No problem, you say: just pick a non-publicly-accessible TCP port for the custom app and point mod_proxy to that port for the relevant URLs.

But now suppose that I have a single machine with a single IP address hosting multiple virtual servers, and I want to replicate this setup for each virtual server, i.e. I want each virtual server to have its own instantiation of the custom server application. Now I have to *manually* assign *each* instantiation of the app to a separate TCP port number. If I have hundreds or thousands of virtual servers on the same machine (Oh? You think that's not reasonable? Can you say "multi-core architecture"?) that can become a serious administrative (to say nothing of security) nightmare.

Wouldn't it be better if, instead of assigning each custom server app to a TCP port number, I could assign each one to a unix domain socket? Unix domain sockets don't have numbers, they have *names*, so I can just name each socket after the virtual server that it serves. Voilà! No more manual assignment of servers to port numbers and associated administrative headaches.
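The backend half of this is easy to sketch. The following Python 3 fragment shows one way a custom server app could derive its socket name from the virtual server it serves. The directory layout, the ".sock" suffix, and the function names are assumptions made for the example, not anything Apache or any other server prescribes.

```python
import os
import socket


def socket_path_for(vhost, base_dir):
    """Derive a socket path from a virtual server name (hypothetical layout)."""
    if "/" in vhost or vhost in ("", ".", ".."):
        raise ValueError("unsafe virtual server name: %r" % vhost)
    return os.path.join(base_dir, vhost + ".sock")


def listen_for_vhost(vhost, base_dir, backlog=16):
    """Bind a listening unix domain socket named after the virtual server."""
    path = socket_path_for(vhost, base_dir)
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket left by a previous run
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    server.listen(backlog)
    return server
```

Because the name is computed from the virtual server name, there is no table of port assignments to maintain, and ordinary file permissions on the socket or its directory can restrict who may connect.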

Lighttpd can do this, but Apache can't (or at least couldn't the last time I checked). It's a conceptually simple thing to do, but actually making it work with the rest of the Apache infrastructure is not so easy for someone who isn't already intimately familiar with Apache's innards.

If it turns out that these things already exist I'd be most grateful for a pointer. If they indeed do not exist, I hope someone will write them. I'd be willing to pay for the development if anyone out there wants to send me a proposal.


Unknown said...

Unknown said...

Ah sorry to comment again. This feature used to not be enabled by default in Debian's bash (thus also not enabled in Ubuntu's). However it looks like it has changed in Ubuntu as of Bug #215034

Unknown said...

About sockets...

I seem to recall that Plan 9 implemented them as file objects.

Nobody's ported their socket arch. to BSD or linux yet, however.

Unknown said...

have you tried this from bash?

$ cat </dev/tcp/

Well, OK, it's not exactly what you wanted, but hey, _someone_ thought of that already and wrote the code :)

Ron said...

> /dev/tcp

That seems like a Horrible Hack (tm) to me.

I want to be able to access /dev/tcp (or whatever it ends up being called) from anywhere, not just bash. This really belongs in FUSE (or the kernel, but doing it in FUSE would be a lot easier, not to mention more portable).

> Plan 9 implemented them as file objects. Nobody's ported their socket arch. to BSD or linux yet, however.

Once again, this is why it ought to be done in FUSE.

Ron said...

> have you tried this from bash?

Yes. It looks slick at first, but it's not the Right Thing. This works:

[ron@mickey:~]$ cat < /dev/tcp/
55436 10-08-28 21:42:15 50 0 0 155.2 UTC(NIST) *

But this doesn't:

[ron@mickey:~]$ python
Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> open('/dev/tcp/')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 2] No such file or directory: '/dev/tcp/'

Michael said...

FastCGI can use unix sockets. It's even a little bit faster than TCP. :)

Anonymous said...

That's how everything works in Plan 9. Everything is a file, including your IP stack:

Unknown said...

I'm just gonna take this time to pimp my project:

And zeromq:


Cody said...

nginx can proxy via domain sockets.

YesThatTom said...

The Cheswick/Bellovin security book has a great library for making socket calls easier but I don't think they published the code. You should adopt their API. It is so clean I wish I had thought of it first!

Anonymous said...

If you're running a large, heavily loaded web server you really don't want to be using Apache anyway, so it seems to me that Apache's lack of the ability to proxy to Unix domain sockets isn't that big a deal.

For what particular reason did you want the "open TCP socket from filesystem" feature? It seems to me that one reason it may not exist is that there are already plenty of libraries (such as Curl) and programs (such as netcat) that make the job easy enough. (And it's not particularly hard to use the sockets interface directly, anyway--just a few minutes of coding once you're familiar with it.)

Ron said...

> If you're running a large, heavily loaded web server you really don't want to be using Apache anyway

Good to know. I thought it was the standard. What would you recommend using instead? (Note that I need to run WebDAV so nginx won't work.)

> For what particular reason did you want the "open TCP socket from filesystem" feature

So that I can take a server that has been running on a unix domain socket and move it to a different machine (and vice-versa) without having to rewrite the client.

Anonymous said...

Apache is the standard for lower-volume websites, where features are more important than performance, but it suffers badly at high volumes because of its "one thread or process per active request" architecture. nginx and lighttpd (which both use an event-based architecture) are the two major ones for high-traffic web sites. You'll want lighttpd, since it does have a WebDAV module.

Note, however, that if you currently have any code running within your Apache processes (e.g., mod_perl- or PHP-style stuff) you'll need to switch to FastCGI for those. This usually isn't such a big deal, but of course it all depends on the specific situation.

As for the TCP-via-filesystem thing, that makes perfect sense now. Another option for doing what you need would be to write a little proxy that listens on the old Unix domain socket and forwards the connection to the new server. Still, even though that's less code than the FUSE solution, it's still a good chunk of it, and it may be easier just to modify the client. Note that modifying the client to talk directly to the server, rather than go through a proxy (be it FUSE or the simpler option) will also be more efficient.
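For a sense of how much code the "little proxy" amounts to, here is a rough Python 3 sketch that accepts connections on a unix domain socket and relays each one to a TCP server. The function names are made up for the example, it uses one thread per direction rather than an event loop, and it omits logging, timeouts, and cleanup of the socket file.

```python
import socket
import threading


def pump(src, dst):
    """Copy bytes from src to dst until src reaches end of stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass  # treat a broken connection like end of stream
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end of stream along
        except OSError:
            pass


def handle(client, remote_addr):
    """Relay one client connection to the remote TCP server."""
    try:
        remote = socket.create_connection(remote_addr)
    except OSError:
        client.close()
        return
    back = threading.Thread(target=pump, args=(remote, client))
    back.start()
    pump(client, remote)
    back.join()
    client.close()
    remote.close()


def make_listener(unix_path, backlog=16):
    """Create the listening unix domain socket at the old, well-known path."""
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(unix_path)
    listener.listen(backlog)
    return listener


def serve(listener, remote_addr):
    """Accept clients forever, relaying each to remote_addr (host, port)."""
    while True:
        client, _ = listener.accept()
        threading.Thread(
            target=handle, args=(client, remote_addr), daemon=True
        ).start()
```

Running serve(make_listener(path), (host, port)) leaves existing clients untouched: they keep connecting to the old socket path, at the cost of one extra copy of every byte.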

cpc26 said...

Works with SBCL.

Ron said...

So I actually have taken a couple of whacks at doing the net filesystem myself using python-fuse. The problem is I can't get even the example code to run. I try the xmp file system example and all that happens is this:

[ron@mickey:~/devel/pkg/fuse-python-0.2]$ mkdir ~/foo1
[ron@mickey:~/devel/pkg/fuse-python-0.2]$ mkdir ~/foo2
[ron@mickey:~/devel/pkg/fuse-python-0.2]$ python example/ ~/foo1 ~/foo2
[ron@mickey:~/devel/pkg/fuse-python-0.2]$ ls ~/foo2
ls: /Users/ron/foo2: Input/output error

I get the same result with the bbfs file system code found here:

Took a look at cl-fuse, but it doesn't seem to have any documentation and I don't have time to reverse-engineer it.