Salesforce - Unstructured

Connect Salesforce to your preprocessing pipeline, and use the Unstructured Ingest CLI or the Unstructured Ingest Python library to batch process all your documents and store structured outputs locally on your filesystem.

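The requirements are as follows.

To generate a private key, a certificate signing request, and a self-signed certificate that is valid for 365 days, you can run the following OpenSSL commands: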

openssl genrsa -out MyPrivateKey.pem -traditional  
openssl req -new -key MyPrivateKey.pem -out MyCertificateSigningRequest.csr  
openssl x509 -req -in MyCertificateSigningRequest.csr -signkey MyPrivateKey.pem -out MyCertificate.crt -days 365  

You can change the example filenames above as needed. Store the generated files in a secure location.
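One possible way to act on the advice about secure storage is to limit file permissions so that only the owning user can read the private key. The following is a minimal sketch, not part of the Unstructured tooling: the helper name `restrict_to_owner` is hypothetical, and it assumes a POSIX filesystem where permission bits are honored.

```python
import stat
from pathlib import Path


def restrict_to_owner(path):
    """Set a file's mode to 0o600 (owner read/write only).

    Hypothetical helper for illustration. Returns the resulting
    permission bits so that the caller can verify them.
    """
    key_file = Path(path)
    # S_IRUSR | S_IWUSR == 0o600: no access for group or others.
    key_file.chmod(stat.S_IRUSR | stat.S_IWUSR)
    return stat.S_IMODE(key_file.stat().st_mode)
```

For example, calling `restrict_to_owner("MyPrivateKey.pem")` after running the OpenSSL commands would tighten the permissions on the generated private key. File permissions are only one layer; a secrets manager or encrypted volume may be more appropriate for production use.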
To create a Salesforce connected app, do the following:

  1. Log in to your Salesforce account.
  2. In the top navigation bar, click the Quick Settings (gear) icon, and then click Open Advanced Setup.
  3. In the Home tab, under Platform Tools, expand Apps, and then click App Manager.
  4. Click New Connected App.
  5. With Create a Connected App selected, click Continue.
  6. At a minimum, fill in the following, and then click Save:
    • Connected App Name
    • API Name (can be the same as Connected App Name, but do not use spaces or punctuation)
    • Contact Email
    • Under API (Enable OAuth Settings), check Enable OAuth Settings.
    • For Callback URL, entering https://localhost is okay if you won’t be using this connected app for other special authentication scenarios.
    • Check Use digital signatures, click Choose File, and browse to and select your certificate (.crt) file.
    • For Selected OAuth Scopes, move the following entries from the Available OAuth Scopes list to the Selected OAuth Scopes list:
      * Manage user data via APIs (api)
      * Perform requests on your behalf at any time (refresh_token, offline_access)
    • Uncheck Require Proof Key for Code Exchange (PKCE) Extension for Supported Authorization Flows.
    • Leave Require Secret for Web Server Flow checked.
    • Leave Require Secret for Refresh Token Flow checked.
    • Check Enable Authorization Code and Credentials Flow.
  7. On the connected app’s details page, click Manage, click Edit Policies, set the following under OAuth Policies, and then click Save:
    • Set Permitted Users to All users may self-authorize.
    • Set IP Relaxation to Relax IP restrictions.
    • Set Refresh Token Policy to Refresh token is valid until revoked.
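
The authorization URL has the following format, where <client-id> is a placeholder for the connected app's OAuth consumer key (client ID):

https://login.salesforce.com/services/oauth2/authorize?response_type=code&client_id=<client-id>&redirect_uri=https%3A%2F%2Flocalhost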
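The URL above consists of the Salesforce authorize endpoint plus three query parameters, with the callback URL percent-encoded. As a rough sketch, it could be assembled in Python with the standard library. The function name `build_authorize_url` is hypothetical, and the default callback of https://localhost assumes you used that value when creating the connected app.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL above.
AUTHORIZE_ENDPOINT = "https://login.salesforce.com/services/oauth2/authorize"


def build_authorize_url(client_id, redirect_uri="https://localhost"):
    """Return the authorization URL for the given client ID.

    Hypothetical helper for illustration. urlencode percent-encodes
    the callback URL, so https://localhost becomes
    https%3A%2F%2Flocalhost.
    """
    query = urlencode(
        {
            "response_type": "code",
            "client_id": client_id,
            "redirect_uri": redirect_uri,
        }
    )
    return f"{AUTHORIZE_ENDPOINT}?{query}"
```

Building the query string with `urlencode` rather than by hand avoids mistakes when the client ID or callback URL contains characters that need escaping.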

The Salesforce connector dependencies:

pip install "unstructured-ingest[salesforce]"

You might also need to install additional dependencies, depending on your needs.

The following environment variables:

Now call the Unstructured Ingest CLI or the Unstructured Ingest Python library. The destination connector can be any of the ones supported. This example uses the local destination connector.

This example sends data to Unstructured for processing by default. To process data locally instead, see the instructions at the end of this page.

For the Unstructured Ingest CLI and the Unstructured Ingest Python library, you can use the --partition-by-api option (CLI) or partition_by_api (Python) parameter to specify where files are processed: