How to convert JupyterLab S3 Contents Manager to use a custom API instead?

Question:

I’m working with a JupyterLab extension that currently uses AWS S3 for file storage via the AWS SDK. For security reasons, I want to replace the direct S3 access with API calls to my backend server, which will handle the S3 operations server-side.

Here’s my current approach:

In index.ts, I’ve modified the authFileBrowser plugin to use placeholder credentials:

const authFileBrowser: JupyterFrontEndPlugin<IS3Auth> = {
  id: 'jupydrive-s3:auth-file-browser',
  description: 'The default file browser auth/credentials provider',
  provides: IS3Auth,
  activate: (): IS3Auth => {
    return {
      factory: async () => {
        console.log('Setting up S3/R2 proxy via server API...');
        
        // Since we're using API endpoints for all S3 operations,
        // we just need a minimal configuration with the bucket name
        const config = {
          bucket: 'my-bucket', // This is just for display purposes
          root: '',
          config: {
            forcePathStyle: true,
            // These are placeholder values since actual S3 operations
            // will be handled by the server
            endpoint: 'https://api-proxy',
            region: 'auto',
            credentials: {
              accessKeyId: 'proxy-auth',
              secretAccessKey: 'proxy-auth'
            }
          }
        };
        
        console.log('S3/R2 proxy setup complete');
        return config;
      }
    };
  }
};

And in s3contents.ts, I’m replacing the S3 operations with API calls:

async get(
  path: string,
  options?: Contents.IFetchOptions
): Promise<Contents.IModel> {
  path = path.replace(this._name + '/', '');

  // format root the first time contents are retrieved
  if (!this._isRootFormatted) {
    this._root = await this.formatRoot(this._root ?? '');
    this._isRootFormatted = true;
  }

  try {
    // Use API endpoint instead of direct S3 access
    const response = await fetch(`/api/s3/contents?path=${encodeURIComponent(path)}&root=${encodeURIComponent(this._root)}`, {
      method: 'GET'
    });

    if (!response.ok) {
      throw new Error(`Failed to fetch contents: ${response.statusText}`);
    }

    const data = await response.json();
    Contents.validateContentsModel(data);
    return data;
  } catch (error) {
    console.error('Error fetching contents:', error);
    throw error;
  }
}

// Similar changes for other methods like save, delete, rename, etc.
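
For illustration, the other methods could follow the same pattern as get(). Here is a minimal sketch of how each operation might map to a request against the same /api/s3/contents endpoint. The names ContentsOp, ApiRequest, and buildRequest are hypothetical, and the choice of HTTP verbs is an assumption modeled on the Jupyter Server contents REST API (PUT to save, PATCH to rename, DELETE to delete); the backend would have to implement matching routes.

```typescript
// Hypothetical operation descriptors, one per contents-manager method.
type ContentsOp =
  | { kind: 'get'; path: string; content?: boolean }
  | { kind: 'save'; path: string; model: Record<string, unknown> }
  | { kind: 'rename'; path: string; newPath: string }
  | { kind: 'delete'; path: string };

// What the caller would pass on to fetch().
interface ApiRequest {
  method: 'GET' | 'PUT' | 'PATCH' | 'DELETE';
  url: string;
  body?: string;
}

// Endpoint taken from the get() example; everything else is an assumption.
const BASE = '/api/s3/contents';

function buildRequest(op: ContentsOp, root = ''): ApiRequest {
  // URLSearchParams handles the percent-encoding of path and root.
  const query = new URLSearchParams({ path: op.path, root });
  switch (op.kind) {
    case 'get':
      // Mirror IFetchOptions.content: skip file bodies when not requested.
      if (op.content === false) {
        query.set('content', '0');
      }
      return { method: 'GET', url: `${BASE}?${query}` };
    case 'save':
      return {
        method: 'PUT',
        url: `${BASE}?${query}`,
        body: JSON.stringify(op.model)
      };
    case 'rename':
      return {
        method: 'PATCH',
        url: `${BASE}?${query}`,
        body: JSON.stringify({ path: op.newPath })
      };
    case 'delete':
      return { method: 'DELETE', url: `${BASE}?${query}` };
  }
}
```

Keeping the request construction in one pure function means each drive method only has to call fetch() with the result and validate the response, and the mapping can be checked without a running server.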

My questions are:

1. Is this the correct approach to replace S3 SDK operations with API calls?
2. What specific API endpoints do I need to implement on my backend to fully support JupyterLab’s contents manager functionality?
3. Are there any special considerations for handling binary files, large files, or streaming content?
4. How should I handle authentication and authorization for these API calls?
5. Are there any examples or reference implementations of a custom API-based contents manager for JupyterLab?

I’m trying to maintain all the functionality of the S3 contents manager but with improved security by keeping S3 credentials on the server side.
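
On the binary-file question, one detail is worth noting: the Jupyter contents model carries binary file content as a base64 string, with format set to 'base64'. A minimal sketch of the conversion a JSON-based API would need on the client side follows. The helper names (bytesToBase64, base64ToBytes, toBinaryFileModel) and the BinaryFileModel interface are hypothetical, and the interface lists only a subset of the model's fields.

```typescript
// Subset of the contents model fields relevant to a binary file (assumed shape).
interface BinaryFileModel {
  name: string;
  path: string;
  type: 'file';
  format: 'base64';
  content: string;
}

// Encode raw bytes as base64 using the btoa() global.
function bytesToBase64(bytes: Uint8Array): string {
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Decode a base64 string back into raw bytes using the atob() global.
function base64ToBytes(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Wrap raw bytes in a model that can be sent as JSON to the backend.
function toBinaryFileModel(path: string, bytes: Uint8Array): BinaryFileModel {
  const name = path.split('/').pop() ?? path;
  return {
    name,
    path,
    type: 'file',
    format: 'base64',
    content: bytesToBase64(bytes)
  };
}
```

Base64 inflates the payload by roughly one third, so this approach is probably only comfortable for small and medium files; large files may need a different transfer path on the backend.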

Krish_C March 12, 2025, 10:42pm 2

jtp March 13, 2025, 7:31am 3

In this case, you may want to have a look at QuantStack/jupyter-drives on GitHub (Jupyter Server supporting JupyterLab IDrive), which can be configured on the backend.

@jtp Thank you for suggesting jupyter-drives. I’m currently using JupyterLite as a static application within my Rails app (in the public folder). While jupyter-drives looks promising, I have a few questions:

1. Since I’m using JupyterLite in a static way, can I still use jupyter-drives? I understand it requires a server component, but I don’t have a Python server running.
2. If I want to integrate it with my Rails backend instead of using the Python server component, what would be the best approach? Would I need to:

   - Modify the jupyter-drives frontend to communicate with my Rails endpoints?
   - Create my own custom extension that mimics jupyter-drives but works with Rails?
   - Or is there another recommended approach?

3. Are there any examples or documentation specifically about integrating jupyter-drives with non-Python backends or static JupyterLite deployments?

I’d appreciate any guidance on how to best implement drive functionality in my specific setup.
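
Whichever route is taken, a frontend that talks to Rails directly would need some request wrapper that carries the user's session and Rails' CSRF token. The following is a rough sketch under stated assumptions: the app is served from the same origin as the Rails backend, authentication uses the Rails session cookie, and the CSRF token is made available to the page somehow. createRailsClient and FetchLike are hypothetical names; the X-CSRF-Token header is the one Rails checks on non-GET requests.

```typescript
// Minimal shape of fetch() that the client needs; injectable for testing.
type FetchLike = (
  url: string,
  init: {
    method: string;
    headers: Record<string, string>;
    credentials: 'same-origin';
    body?: string;
  }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build a request function bound to a fetch implementation and a token source.
function createRailsClient(
  fetchImpl: FetchLike,
  getCsrfToken: () => string | null
) {
  return async function request(
    method: string,
    url: string,
    body?: unknown
  ): Promise<unknown> {
    const headers: Record<string, string> = { Accept: 'application/json' };
    const init: Parameters<FetchLike>[1] = {
      method,
      headers,
      // Send the session cookie, but only to the same origin.
      credentials: 'same-origin'
    };
    if (body !== undefined) {
      headers['Content-Type'] = 'application/json';
      init.body = JSON.stringify(body);
    }
    // Rails verifies the CSRF token on state-changing (non-GET) requests.
    const token = getCsrfToken();
    if (token && method !== 'GET') {
      headers['X-CSRF-Token'] = token;
    }
    const response = await fetchImpl(url, init);
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json();
  };
}
```

In the browser, the client could be created with window.fetch bound to window and a getter that reads the token from wherever the Rails layout exposes it. With this setup the S3 credentials never reach the browser, and authorization decisions stay in the Rails controllers.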

Krish_C March 26, 2025, 9:00pm 5

Any thoughts on this? I’m curious to know what the options are to make this work. @jtp