Remember WebDAV? It was a similar concept, but never really found its footing and most of the implementations were pretty shaky. I always thought it was a good idea though.
The RFC is super dumb though. For instance, when handling a PROPFIND request (more or less listing files/folders), the server is not required to honor the Depth header (which controls how many levels are returned). There is also no mechanism for the server to advertise whether it honored the Depth header. That makes the Depth header useless: the client has no way to know whether the hierarchy really was only one level deep or the server simply ignored the header. Therefore, your only safe option is to always scan the full hierarchy, issuing a PROPFIND at each level.
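The forced level-by-level scan can be sketched like this. `propfind_depth1` is a hypothetical callback standing in for a real `PROPFIND` with `Depth: 1`; the point is that recursion on the client is the only portable strategy:

```python
# Since a server may silently ignore "Depth: infinity", the only portable
# strategy is one Depth-1 PROPFIND per collection, recursing on the client.
# `propfind_depth1(url)` is a hypothetical helper returning
# (files, subcollections) for a single level.

def walk(url, propfind_depth1):
    files, collections = propfind_depth1(url)
    all_files = list(files)
    for c in collections:
        all_files.extend(walk(c, propfind_depth1))
    return all_files

# Example against a fake server hierarchy:
tree = {
    "/":     ([],               ["/a/", "/b/"]),
    "/a/":   (["/a/x.txt"],     []),
    "/b/":   (["/b/y.txt"],     ["/b/c/"]),
    "/b/c/": (["/b/c/z.txt"],   []),
}
print(walk("/", lambda u: tree[u]))  # every file, via one request per folder
```

Note the cost: one round trip per collection, even when the server could have answered everything in a single `Depth: infinity` request.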
The RFC is full of those kinds of crazy gotchas, not to mention the overuse of "MAY" and "SHOULD", which will drive you crazy if you try to implement a client or server.
If you want to descend further into the insanity, just look at how crazily over-engineered the locking mechanism is. I have no words for it.
Unfortunately, even if you think you implemented the whole RFC correctly, your implementation will interoperate with almost nothing, because few implementations in the wild are any good. A useful WebDAV implementation must be full of vendor-specific workarounds.
There was an even worse protocol back then called CMIS. It was an attempt to define a standard API for content management which turned into this absolute enterprise monstrosity that would make SOAP blush. It was also impossible to implement.
It's interesting looking back but I think developing a standard has a higher chance of success coming from some dude's GitHub than it does with $1T of market cap behind it.
Limitations on depth makes sense, because the actual storage implementation can make recursive retrievals very costly. A folder could be an abstraction for a remote resource.
Of course it would still make sense for server to tell client about this (there are files/no files/I don’t know).
Every single major operating system ships with built-in WebDAV remote-filesystem support, and it works reasonably well. Subversion's HTTP access is built on WebDAV, and it can be wired (via autoversioning) to automatically commit changes these clients store.
Maybe it's on the decline, but I'd hardly put it as something that "never really found its footing". It is still a decent way to do fileshares over the public internet without sshfs, etc.
That's not really what the link is, though. It's an adaptation layer to turn a random network resource into something that looks like a filesystem.
At the time, I worked for a large porn company and we couldn't host stuff 'in the cloud' because they didn't allow porn there.
We invested a ton of money into an Isilon NAS to store our image/video content and the best way to get stuff off it over HTTP was via webdav. Unfortunately, there wasn't a good Java client.
So, I built a simple proxy that would accept regular GET requests and on the back end, use webdav to retrieve content from the Isilon. In front of that proxy was our CDN.
Since then, Sardine has been the basis for quite a few other projects.
I implemented the same for Ruby. With monkey patching, you write normal file-writing code, but it uses the WebDAV protocol to write files whose names start with http.
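The same trick translates to other languages. Here is an illustrative Python sketch (names like `webdav_put` and the dict-backed "server" are stand-ins, not a real client): patch the built-in `open` so that http-looking paths are routed to a WebDAV PUT while everything else behaves normally.

```python
import builtins

remote = {}  # stands in for a WebDAV server (hypothetical)

def webdav_put(url, body):
    # In real code this would issue an HTTP PUT; here we just record it.
    remote[url] = body

class WebDAVFile:
    """File-like object that buffers writes and PUTs them when closed."""
    def __init__(self, url):
        self.url, self.buf = url, []
    def write(self, data):
        self.buf.append(data)
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        webdav_put(self.url, "".join(self.buf))

_real_open = builtins.open

def open_with_webdav(path, mode="r", *args, **kwargs):
    # Route http(s) "paths" to WebDAV; everything else stays normal.
    if isinstance(path, str) and path.startswith(("http://", "https://")):
        return WebDAVFile(path)
    return _real_open(path, mode, *args, **kwargs)

builtins.open = open_with_webdav  # the monkey patch

# "Normal" file-writing code now transparently targets WebDAV:
with open("http://example.com/dav/note.txt", "w") as f:
    f.write("hello")

builtins.open = _real_open  # undo the patch when done
```

The charm (and the danger) is that calling code never knows it is talking to the network.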
A WebDAV file could be used as a WebFS Cell. The locking would allow emulation of the compare-and-swap functionality. The reference HTTP Cell: client and server are way less complicated than WebDAV though, as many here have alluded to.
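The compare-and-swap emulation could look roughly like this in-memory sketch (the `threading.Lock` plays the role of a WebDAV LOCK/UNLOCK pair; nothing here is a real client):

```python
import threading

class LockedCell:
    """In-memory stand-in for a WebDAV resource guarded by an exclusive
    lock. compare_and_swap emulates CAS: LOCK, GET, conditional PUT, UNLOCK."""
    def __init__(self, value):
        self.value = value
        self.lock = threading.Lock()  # plays the role of WebDAV LOCK/UNLOCK

    def compare_and_swap(self, expected, new):
        with self.lock:           # LOCK
            current = self.value  # GET
            if current != expected:
                return False      # someone else changed it; UNLOCK and fail
            self.value = new      # PUT under the lock token
            return True           # UNLOCK happens on exit

cell = LockedCell("v1")
assert cell.compare_and_swap("v1", "v2")      # succeeds
assert not cell.compare_and_swap("v1", "v3")  # stale expectation fails
```

With real WebDAV you pay two extra round trips (LOCK and UNLOCK) per swap, which is part of why the plain HTTP Cell ends up simpler.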
Someone else described my "httpdirlist" specification as "like WebDAV but better" or "like WebDAV but less messy" (actually I do not remember the exact wording).
Now I see that Wikipedia also lists several alternatives to WebDAV too, but I think httpdirlist is good.
I always thought WebDAV was sabotaged by different players for some reason (e.g. Apple's implementation caused loss of data). "Never attribute to malice", but support was so bad that almost no other conclusion was possible.
"Never attribute to malice what can be adequately explained by stupidity" has led me astray so many times in life I've come to largely disbelieve it is a useful mantra. It lets bad actors hide behind stupidity and cause chaos on purpose, while good people let it happen because "it's simply accidental, right"?
I think it is slightly more nuanced, in that bad decisions get made on accident and overlooked on purpose. If you want to sabotage something, put lots of people that make many accidental mistakes on a project and put your people in a position to overlook them.
2. Consider reducing your desktop resolution for webcasts. I can see that there are dialogues open. I can't for the life of me see what's presented in them.
I was not aware of Plan 9's WebFS. It looks like it presents websites in the local file system. IPFS can do something very similar with its content. I have used that before.
The name here comes from using web resources (referred to by url) as building blocks for a file system. I guess "Web for FS" rather than "Web as FS".
The notion of remote services accessed via local filesystem dynamics is pretty well established. Among implementations:
- NFS, particularly with the Solaris-originated concept of automounts over a /net mountpoint.
- Various virtual filesystems. Midnight Commander ("mc") offers several of these, including archive formats (tar, cpio, afio, rpm, deb) and remote (FTP, SSH).
- SMB/CIFS/Samba
- Various FUSE filesystems, including again ssh, ftp, and others. These generally require specifying in advance specific mountpoints.
The notion of on-demand access to remote resources over protocols (e.g., http/https, or others), under filesystem dynamics, is interesting -- you can use any general tool, utility, or application for access, mediated through the filesystem by way of drivers, rather than a specific application (e.g., Web browser, FTP client, etc.)
There are numerous issues. In particular, applications tend not to respond well to remote resources disappearing, changing, or failing to return from change requests -- NFS's behaviour with nonresponsive remote hosts is ... notorious.
Consistency, availability, and partition tolerance (CAP) are long-standing concerns, and there's no way to solve for all three. I'd add latency as another major consideration.
The general notion of managing and tracking changes locally and pushing them to a remote has merit. I wasn't aware of a "gitfs" ... though of course, one does exist, TIL: https://www.presslabs.com/code/gitfs/ Using git (or another versioning system) as a mediator for revisioned remote/local access seems promising. Obviously not viable for very-high-change-rate systems, but adequate for many occasionally-modified resources.
I'm not sure if you're looking at using your WebFS itself as a publishing mechanism, though in general I think I'd recommend against doing this. For small-n peer-to-peer distribution that's probably workable, but for large-scale provisioning-and-request systems, relying on HTTP or other established transports is likely more sensible.
One area I've recognised as being particularly fraught is the whole notion of security and privacy. Providing unfiltered local access to remote resources which may change arbitrarily is a great way for allowing malware onto local systems -- your transport layer should probably implement some level of security and mounts deny direct execution of content. The fact that remote content could be copied to an executable mountpoint remains, and would make numerous attacks possible.
Similarly: access, update, write, and/or publishing actions all leak considerable information which could be of concern to specific users or organisations. Hash-based indexing (already addressed in this thread) being only one of several such vectors.
SMB/CIFS/Samba are not meant to be used directly over the internet, i.e. without a tunnel. The best alternative right now is Dropbox unless you are a developer.
The point isn't whether or not these are protocols that are utilised on the naked Internet, but that they offer access to network services via filesystem semantics.
That is, rather than use a specific client or API to access remote content, or copying it locally as a separate step, you simply open a file in an existing application, or, within a program, using fopen() or equivalent operators. The networking is ... translucently ... handled in the background by the filesystem interface and/or driver(s).
The reasons SMB is not generally used or advised over the Internet are worth looking at, as this touches on many of the security / privacy concerns of any such service.
> If you have ever thought "x can probably be used as a file system"
...you might also want to take a look at Storage Combinators[1]. Not quite the same problem space, but abstracting away a bit from both concrete filesystems and other storage mechanisms to get to a composable abstraction.
Note: I am the primary author, and also taking a good look at WebFS for further inspiration... :-)
One issue that networked filesystems have is with mutations, especially with multiple writers. WebDAV, NFS, etc. try to address this with locking, but that doesn't allow simultaneous writes, which means it's dangerous to work offline. It's possible to solve this by using an OT or CRDT algorithm behind the scenes. This is what we are building into an extension of HTTP called braid (https://braid.news). It could be useful for a web-backed filesystem. It automates synchronization.
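The lock-free idea is easiest to see with a tiny CRDT. This is a generic last-writer-wins map sketch, not braid's actual algorithm: each key carries a (timestamp, replica) tag, and merge is order-independent, so two writers can diverge offline and converge later.

```python
# Minimal last-writer-wins map CRDT. Each entry carries a (timestamp,
# replica_id) tag; merge keeps the entry with the highest tag. Because
# merge is commutative and associative, replicas converge without locks.

def lww_set(state, key, value, ts, replica):
    state[key] = ((ts, replica), value)

def lww_merge(a, b):
    merged = dict(a)
    for key, (tag, value) in b.items():
        if key not in merged or tag > merged[key][0]:
            merged[key] = (tag, value)
    return merged

# Two replicas edit the same file metadata while offline:
alice, bob = {}, {}
lww_set(alice, "/notes.txt", "draft 1", ts=1, replica="alice")
lww_set(bob,   "/notes.txt", "draft 2", ts=2, replica="bob")

# Merging in either order yields the same converged state:
assert lww_merge(alice, bob) == lww_merge(bob, alice)
```

LWW silently drops the losing write, which is why richer CRDTs (or OT) matter for actual file contents; but the convergence-without-locking property is the same.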
I see very little docs and absolutely no fancy info like gifs or videos explaining what it's possible to do with this tool. I see some example folder, but I can't really say the difference between webfs and ipfs. (Just a little feedback, I mean no offence)
You will probably be interested in Peergos [0][1], which at its lowest level is an encrypted global filesystem also built on IPFS and also only using the block api.
Encryption keys are derived from a secret and the hash of the data. The secret is set per Volume. An empty secret gives plain convergent encryption, which may suit public data; a non-empty secret makes keys convergent only with other keys generated under the same secret. This amounts to deduplication within a Volume, plus privacy.
All a storage provider sees are many small encrypted blobs, so the size of large files is not leaked either.
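My reading of that key-derivation scheme, as a sketch: mix the per-Volume secret with the hash of the plaintext. HMAC-SHA-256 is one plausible construction here, not necessarily what is actually used.

```python
import hashlib
import hmac

def content_key(secret: bytes, data: bytes) -> bytes:
    # Key depends on the volume secret and the plaintext's hash, so
    # identical data within one volume derives the same key (dedup),
    # while different volumes derive unrelated keys (privacy).
    return hmac.new(secret, hashlib.sha256(data).digest(),
                    hashlib.sha256).digest()

data = b"same bytes"
assert content_key(b"volume-A", data) == content_key(b"volume-A", data)  # dedup
assert content_key(b"volume-A", data) != content_key(b"volume-B", data)  # isolation
# An empty secret degenerates to plain convergent encryption:
assert content_key(b"", data) == content_key(b"", data)
```

The trade-off is the usual convergent-encryption one: anyone who holds the secret and guesses the plaintext can confirm the guess.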
WebFS stores data in things it calls "Stores". Stores can be anything that takes data and gives back a key to retrieve the data. Right now we have an IPFS and HTTP store built. An FTP server could also work as a store.
If you or a friend run the HTTP or FTP server then it will persist the data for you. IPFS doesn't incentivize data persistence so if WebFS is working on top of IPFS it inherits that problem. You could run WebFS on top of one of the storage networks and persisting your blobs would be incentivized.
WebFS is storage layer agnostic. Give it a Store, and it will give you a file system.
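As described, the Store contract is tiny. A hedged sketch of what an implementation might look like (class and method names are my own, not WebFS's actual API):

```python
import hashlib

class MemoryStore:
    """Toy Store: put(data) -> key, get(key) -> data. An HTTP, FTP, or
    IPFS-backed store would implement the same two methods; only the
    transport behind them changes."""
    def __init__(self):
        self.blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()  # content-addressed key
        self.blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self.blobs[key]

store = MemoryStore()
key = store.put(b"hello webfs")
assert store.get(key) == b"hello webfs"
```

Content addressing is a natural fit because the key can be verified against the returned bytes, so the store itself never needs to be trusted.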
So if WebFS is running on a system with access to Tor SocksPorts, can Stores be onion URLs?
Edit: If not, one could presumably route WebFS through OnionCat's IPv6 /48. But that only works with v2 onions, which are deprecated. However, tinc works with v3 onions. And either of those gives you UDP transport.
Yes. WebFS doesn't actually use any of the file/directory functionality provided by IPFS, or any encryption features. We only use the get/put block functionality. Everything is encrypted in WebFS before being posted to a Store.
The data encryption keys are generated using a secret and the hash of the data being encrypted. That key is stored in the reference to that data. This continues recursively to the superblock which is not encrypted.
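The recursion bottoms out something like this sketch. The XOR "cipher" and the helper names are purely illustrative (a real system would use an authenticated cipher like AES-GCM); the point is that each reference carries both the storage id and the decryption key, all the way up to a plaintext superblock.

```python
import hashlib
import json

def derive_key(secret: bytes, data: bytes) -> bytes:
    # Key = H(secret || H(data)): convergent within one secret.
    return hashlib.sha256(secret + hashlib.sha256(data).digest()).digest()

def xor_encrypt(key: bytes, data: bytes) -> bytes:
    # Toy symmetric cipher for illustration only.
    stream = (key * (len(data) // len(key) + 1))[:len(data)]
    return bytes(a ^ b for a, b in zip(data, stream))

def store_put(store: dict, blob: bytes) -> str:
    bid = hashlib.sha256(blob).hexdigest()
    store[bid] = blob
    return bid

def put_encrypted(store: dict, secret: bytes, data: bytes) -> dict:
    key = derive_key(secret, data)
    blob_id = store_put(store, xor_encrypt(key, data))
    return {"id": blob_id, "key": key.hex()}  # the reference carries the key

secret, store = b"volume-secret", {}
leaf_ref = put_encrypted(store, secret, b"file contents")
# A directory is itself just data containing references, encrypted the
# same way; only the top-level superblock reference stays in plaintext.
dir_ref = put_encrypted(store, secret, json.dumps([leaf_ref]).encode())
superblock = {"root": dir_ref}
```

So the server only ever sees opaque blobs, while anyone holding the superblock reference can unwind the whole tree.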
What sort of content do you say Tor onions can't host?
"Tor onion" just means that a server is (ideally) only reachable as an onion URL, which is only accessible via the Tor network. There is the limitation that Tor only handles TCP. Otherwise, one can route anything over Tor. In my experience, that includes HTTP(S), FTP, Tahoe-LAFS, SSH, RDP, Mumble, OpenVPN and tinc. And others, if I spent more time remembering what I've played with.
What you're describing is a Tor hidden service. Hidden services are separate from the Tor relay network itself, which is what I thought you were referring to as "Tor onions".
Hidden services are optimized for confidentiality over performance. Using them for bulk data storage would place a lot of load on the relay network, and it's not clear what security problem this arrangement would solve.
As far as I know, "hidden service" is deprecated, with "onion service" the current term. And it does tend to get shortened to "onion". But I admit that it was confusing. Because relays used to be called "onion routers". Which is also more or less deprecated, I think.
The security problem is Stores being physically located and compromised, based on IP addresses found in traffic logs.
Well, I fear that even if the information is encrypted, once quantum computing breaks modern AES encryption standards that's going to be a yikes. So I'd be more comfortable with encryption as well as access controls.
This is a legitimate concern. WebFS is designed for the p2p storage use case. Persisting data with p2p storage means that it can live forever. All the secrets in WebFS are randomly generated and there are no user supplied (potentially weak) passwords.
w.r.t. quantum computing: it is possible for WebFS to use symmetric cryptography for all remote data. Although, many Cell implementations in the near term will likely use elliptic curves or RSA.
I guess. But access controls really just keep the punters out. Any serious adversary will just track down the stores. And even if they're on dedicated servers with FDE, keys can be obtained from RAM.
Just to clarify: All data is encrypted on the client, going after a server backing a Store will get you encrypted blobs. Encryption keys would not exist on the server in plaintext.
HTTP is basically a filesystem protocol that supports magical files -- not too unlike sharing named pipes over SMB. Not too unlike what it'd be like if one could open(2) AF_LOCAL sockets on Unix/POSIX systems instead of having to connect(2) to them -- if that had been so in 1982 in BSD, it would be true now, NFS would have supported the same, etc.
https://en.wikipedia.org/wiki/WebDAV