Improve preload mechanism to support random access via ipfs.cat · Issue #3510 · ipfs/js-ipfs

(Not sure if this should be here or in js-ipfs or elsewhere, feel free to move).

As I understand it, the current transport for sharing IPFS data between browsers relies on a 'preload' node, which 'preloads' all the blocks of a multihash in response to a /api/v0/refs API call. The refs call starts pulling all the blocks of the hash onto the preload node over the delegate websocket connection.

This makes sense for the default use case, where the entire hash should be shared. However, it is less than ideal when a large amount of data is being shared over IPFS and data should only be loaded on demand.

I've been able to implement the following 'alternative preload' setup, and wonder if this could be improved and/or supported by the API as an option, to make in-browser usage more scalable for large amounts of data, or if this is too specific a use case?

The goal is to load data via ipfs.cat() random access, and avoid preloading anything that is not actually needed.

In my setup, the root multihash contains several files all in the root dir.

1). Disable preload in the config, to avoid making the http /api/v0/refs call by default.

2). When new data is added, make an HTTP /api/v0/ls call to the preload node. This works since I'm sharing several files, all in the root directory.

3). When reading data from an IPFS hash in a different browser, also make an HTTP /api/v0/ls call to the preload node in place of /api/v0/refs. Then call ipfs.ls() locally in the browser to quickly fetch the list of files.
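The three steps above can be sketched roughly as follows. This is a minimal sketch rather than my exact code: `PRELOAD_API` is a placeholder address, `preloadLsUrl` and `preloadLs` are hypothetical helper names, and sending the request as a POST is an assumption about what the preload node accepts.

```javascript
// Placeholder address (assumption): substitute the API base of a real preload node.
const PRELOAD_API = 'https://preload.example.com'

// Step 1: options for IPFS.create() that turn off the default
// /api/v0/refs preload request.
const ipfsOptions = { preload: { enabled: false } }

// Hypothetical helper: build the /api/v0/ls URL for a root CID.
function preloadLsUrl (cid, base = PRELOAD_API) {
  return `${base}/api/v0/ls?arg=${encodeURIComponent(cid)}`
}

// Steps 2 and 3: ask the preload node to list the root directory, so it
// fetches only the directory block(s) and not every block under the root.
// fetchImpl is injectable so the helper can be exercised without a network.
async function preloadLs (cid, fetchImpl = globalThis.fetch) {
  // POST is an assumption; adjust if the node expects a different method.
  const res = await fetchImpl(preloadLsUrl(cid), { method: 'POST' })
  if (!res.ok) {
    throw new Error(`preload ls failed with status ${res.status}`)
  }
  return res.json()
}
```

The idea is to pass `ipfsOptions` to `IPFS.create()`, call `preloadLs(cid)` after adding data and again before reading it in the other browser, and then call `ipfs.ls(cid)` locally.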

To load the data, ideally I would just call ipfs.cat() with offset and length after calling ipfs.ls(), so that only the necessary blocks are fetched. Unfortunately, this doesn't seem to work currently. Instead, my current workaround is simply to call the HTTP /api/v0/cat endpoint on the preload node, and this works perfectly as expected! The preload node searches for the blocks necessary for cat and preloads only those, and only those are loaded from the browser on the other end as well, allowing for quick loads from a large IPFS hash!
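For reference, here is roughly what the two read paths look like. Again this is a sketch under assumptions: `PRELOAD_CAT_BASE` is a placeholder address, `preloadCatUrl`, `catViaPreload` and `catLocal` are hypothetical helper names, and the POST method is an assumption. The `offset` and `length` options are the ones ipfs.cat() and /api/v0/cat document for byte-range reads.

```javascript
// Placeholder address (assumption): substitute the API base of a real preload node.
const PRELOAD_CAT_BASE = 'https://preload.example.com'

// Hypothetical helper: build the /api/v0/cat URL for a byte range of a file.
function preloadCatUrl (path, offset, length, base = PRELOAD_CAT_BASE) {
  const params = new URLSearchParams({
    arg: path,
    offset: String(offset),
    length: String(length)
  })
  return `${base}/api/v0/cat?${params}`
}

// Workaround path: read the byte range through the preload node's HTTP API.
// fetchImpl is injectable so the helper can be exercised without a network.
async function catViaPreload (path, offset, length, fetchImpl = globalThis.fetch) {
  // POST is an assumption; adjust if the node expects a different method.
  const res = await fetchImpl(preloadCatUrl(path, offset, length), { method: 'POST' })
  if (!res.ok) {
    throw new Error(`preload cat failed with status ${res.status}`)
  }
  return new Uint8Array(await res.arrayBuffer())
}

// Ideal path: read the same range with the local node. ipfs.cat() yields
// Uint8Array chunks, which are concatenated into a single buffer here.
async function catLocal (ipfs, path, offset, length) {
  const chunks = []
  let total = 0
  for await (const chunk of ipfs.cat(path, { offset, length })) {
    chunks.push(chunk)
    total += chunk.length
  }
  const out = new Uint8Array(total)
  let pos = 0
  for (const chunk of chunks) {
    out.set(chunk, pos)
    pos += chunk.length
  }
  return out
}
```

In my setup `catViaPreload` returns the requested bytes, while the equivalent `catLocal` call is the one that does not work.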

To summarize, a few questions from this:

Does it make sense to support an alternate preload behavior, similar to the above? I'm not sure ls is the right command; perhaps there should be a way to specify which blocks should be preloaded, such as a non-recursive /api/v0/refs call, or something else?
Is this a generic enough use case that could make sense to other applications?
For some reason the final local ipfs.cat() call does not work, but calling /api/v0/cat on the preload node does. I would think that the local cat() would do discovery over the preload websocket connection, but it appears that it does not. Maybe I'm doing something wrong, or the discovery can't work this way?