How to convert JupyterLab S3 Contents Manager to use a custom API instead?

Question:

I’m working with a JupyterLab extension that currently uses AWS S3 for file storage via the AWS SDK. For security reasons, I want to replace the direct S3 access with API calls to my backend server, which will handle the S3 operations server-side.

Here’s my current approach:

In index.ts, I’ve modified the authFileBrowser plugin to use placeholder credentials:

const authFileBrowser: JupyterFrontEndPlugin<IS3Auth> = {
  id: 'jupydrive-s3:auth-file-browser',
  description: 'The default file browser auth/credentials provider',
  provides: IS3Auth,
  activate: (): IS3Auth => {
    return {
      factory: async () => {
        console.log('Setting up S3/R2 proxy via server API...');
        
        // Since we're using API endpoints for all S3 operations,
        // we just need a minimal configuration with the bucket name
        const config = {
          bucket: 'my-bucket', // This is just for display purposes
          root: '',
          config: {
            forcePathStyle: true,
            // These are placeholder values since actual S3 operations
            // will be handled by the server
            endpoint: 'https://api-proxy',
            region: 'auto',
            credentials: {
              accessKeyId: 'proxy-auth',
              secretAccessKey: 'proxy-auth'
            }
          }
        };
        
        console.log('S3/R2 proxy setup complete');
        return config;
      }
    };
  }
};

And in s3contents.ts, I’m replacing the S3 operations with API calls:

async get(
  path: string,
  options?: Contents.IFetchOptions
): Promise<Contents.IModel> {
  path = path.replace(this._name + '/', '');

  // format root the first time contents are retrieved
  if (!this._isRootFormatted) {
    this._root = await this.formatRoot(this._root ?? '');
    this._isRootFormatted = true;
  }

  try {
    // Use API endpoint instead of direct S3 access
    const response = await fetch(`/api/s3/contents?path=${encodeURIComponent(path)}&root=${encodeURIComponent(this._root)}`, {
      method: 'GET'
    });

    if (!response.ok) {
      throw new Error(`Failed to fetch contents: ${response.statusText}`);
    }

    const data = await response.json();
    Contents.validateContentsModel(data);
    return data;
  } catch (error) {
    console.error('Error fetching contents:', error);
    throw error;
  }
}

// Similar changes for other methods like save, delete, rename, etc.
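
For the other methods, one possible shape is a small helper that turns each operation into a request against the same route used in get() above. This is only a sketch under assumptions: the `/api/s3/contents` route comes from the snippet above, while `buildContentsRequest`, `IProxyRequest`, and the verb mapping are hypothetical names chosen for illustration. The verbs mirror the ones Jupyter Server's own contents REST API uses (GET to read, PUT to save, PATCH to rename, DELETE to remove), but the backend is free to choose differently.

```typescript
// Route taken from the get() snippet above; everything else here is hypothetical.
const CONTENTS_ENDPOINT = '/api/s3/contents';

type ContentsOp = 'get' | 'save' | 'delete' | 'rename';

interface IProxyRequest {
  url: string;
  init: { method: string; headers?: Record<string, string>; body?: string };
}

// Assumed REST-style mapping of contents operations to HTTP verbs.
const METHODS: Record<ContentsOp, string> = {
  get: 'GET',
  save: 'PUT',
  delete: 'DELETE',
  rename: 'PATCH'
};

// Build the URL and fetch options for one contents operation.
// `payload` is JSON-encoded into the body when provided (e.g. the
// model to save, or the new path for a rename).
function buildContentsRequest(
  op: ContentsOp,
  path: string,
  root: string,
  payload?: unknown
): IProxyRequest {
  const query =
    `path=${encodeURIComponent(path)}&root=${encodeURIComponent(root)}`;
  const init: IProxyRequest['init'] = { method: METHODS[op] };
  if (payload !== undefined) {
    init.headers = { 'Content-Type': 'application/json' };
    init.body = JSON.stringify(payload);
  }
  return { url: `${CONTENTS_ENDPOINT}?${query}`, init };
}
```

Each method (save, delete, rename) could then call `fetch(request.url, request.init)` and check `response.ok` the same way get() does, so the error handling stays in one place.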

My questions are:

  1. Is this the correct approach to replace S3 SDK operations with API calls?
  2. What specific API endpoints do I need to implement on my backend to fully support JupyterLab’s contents manager functionality?
  3. Are there any special considerations for handling binary files, large files, or streaming content?
  4. How should I handle authentication and authorization for these API calls?
  5. Are there any examples or reference implementations of a custom API-based contents manager for JupyterLab?

I’m trying to maintain all the functionality of the S3 contents manager but with improved security by keeping S3 credentials on the server side.
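
On the binary-file question, the JupyterLab contents model already has a `format` field that can be `'base64'`, so one option is to send binary content through the same JSON endpoint as base64 text. The sketch below shows that round trip. The helper names (`bytesToBase64`, `base64ToBytes`, `toBinarySaveModel`) are hypothetical, and it assumes the backend decodes the base64 string before writing to S3. For large files, the roughly one-third size overhead of base64 may make a separate upload route preferable.

```typescript
// Encode raw bytes as base64 using the browser's btoa().
function bytesToBase64(bytes: Uint8Array): string {
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Decode a base64 string back into raw bytes using atob().
function base64ToBytes(encoded: string): Uint8Array {
  const binary = atob(encoded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Hypothetical helper: wrap binary content in the subset of the
// contents model fields that a save request would carry.
function toBinarySaveModel(path: string, bytes: Uint8Array) {
  return {
    type: 'file' as const,
    format: 'base64' as const,
    path,
    content: bytesToBase64(bytes)
  };
}
```

A save() implementation could pass the result of `toBinarySaveModel(path, bytes)` as the JSON body, and get() could call `base64ToBytes(model.content)` when the returned model has `format === 'base64'`.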

Krish_C March 12, 2025, 10:42pm 2

jtp March 13, 2025, 7:31am 3

In this case, you may want to have a look at QuantStack/jupyter-drives on GitHub ("Jupyter Server supporting JupyterLab IDrive"), which can be configured on the backend.

@jtp Thank you for suggesting jupyter-drives. I’m currently using JupyterLite as a static application within my Rails app (in the public folder). While jupyter-drives looks promising, I have a few questions:

  1. Since I’m using JupyterLite in a static way, can I still use jupyter-drives? I understand it requires a server component, but I don’t have a Python server running.
  2. If I want to integrate it with my Rails backend instead of using the Python server component, what would be the best approach? Would I need to:
  3. Are there any examples or documentation specifically about integrating jupyter-drives with non-Python backends or static JupyterLite deployments?

I’d appreciate any guidance on how to best implement drive functionality in my specific setup.

Krish_C March 26, 2025, 9:00pm 5

Any thoughts on this? I’m curious to know what options there are to make this work. @jtp