GitHub - dnanexus-archive/dx-cwl: Import and run CWL workflows on DNAnexus (alpha)

dx-cwl

Import and run CWL workflows on DNAnexus

THIS IS AN ALPHA-PHASE PROJECT. Please use at your own risk or contact DNAnexus if you are interested.

We have tested this implementation on a few practical workflows of varying complexity and are working towards more complete support of the specification. More tests, documentation, and improvements to the user experience to come shortly.

The motivation behind dx-cwl is to compile a CWL workflow definition to a DNAnexus workflow. This approach enables the user to execute a CWL workflow on DNAnexus and take advantage of the platform's many features, including secure execution on multiple regions/clouds. We use a reference CWL implementation and its data structures when possible to adhere maximally to the standard. CWL types are mapped directly to DNAnexus types when possible; when they cannot be, the structures are stored as general JSON data types within the platform.
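The type-mapping idea can be illustrated with a minimal sketch. This README does not show dx-cwl's actual mapping table, so the table and the helper name `dx_class_for` below are hypothetical; the sketch only shows the general shape of "map directly when possible, otherwise fall back to a JSON type" (`hash` is the DNAnexus class for arbitrary JSON).

```python
# Hypothetical mapping for illustration only; NOT dx-cwl's actual table.
CWL_TO_DX = {
    "int": "int",
    "long": "int",
    "float": "float",
    "double": "float",
    "string": "string",
    "boolean": "boolean",
    "File": "file",
}


def dx_class_for(cwl_type):
    """Return a DNAnexus class for a CWL type, falling back to 'hash' (JSON)."""
    # Simple CWL types are plain strings; records, arrays and other
    # structured types are dicts or lists and take the JSON fallback.
    if isinstance(cwl_type, str) and cwl_type in CWL_TO_DX:
        return CWL_TO_DX[cwl_type]
    return "hash"
```

For example, under these assumptions `dx_class_for("File")` gives `"file"`, while a CWL record definition passed as a dict gives `"hash"`.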

Run with DNAnexus CLI

Coming soon.

Install code in this repository

Pre-requisites

Executing dx-cwl directly

To compile a workflow, point dx-cwl at a local CWL workflow file and provide your authentication token and project name. The example below is a test CWL of a bcbio workflow.

python dx-cwl compile-workflow examples/test_bcbio_cwl/somatic/somatic-workflow/main-somatic.cwl --token $TOKEN --project $PROJECT
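If you prefer to drive the compile step from a script, the command can be assembled as an argument list. This is a minimal sketch: the helper name `build_compile_command` is hypothetical, and it uses only the `compile-workflow` subcommand and the `--token` and `--project` flags shown above.

```python
def build_compile_command(cwl_path, token, project):
    """Return the argv list for `dx-cwl compile-workflow` (hypothetical helper)."""
    # The README asks for both values, so refuse to build an incomplete command.
    if not token or not project:
        raise ValueError("both a token and a project name are required")
    return [
        "python", "dx-cwl", "compile-workflow", cwl_path,
        "--token", token,
        "--project", project,
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the directory containing dx-cwl. Reading the token from an environment variable such as `os.environ["TOKEN"]` keeps it out of the script itself.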

To execute a workflow much like you would with the reference implementation, simply upload the data files and CWL input file onto the platform and run this command on your local installation of dx-cwl.

python dx-cwl run-workflow main-somatic/main-somatic test_bcbio_cwl/somatic/somatic-workflow/main-somatic-samples.json

Here main-somatic is the workflow that was compiled to DNAnexus; it is contained in the main-somatic/ directory on the platform, along with the other applications and resources the workflow requires. test_bcbio_cwl/ is a verbatim copy of the files in that repository, uploaded to the DNAnexus platform.
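The relationship between the CWL file name and the compiled workflow's location can be sketched as follows. The naming rule is inferred from this single example (main-somatic.cwl compiled to main-somatic/main-somatic) and is an assumption, not documented behaviour; both helper names are hypothetical.

```python
import posixpath


def compiled_workflow_path(cwl_file):
    """Guess the platform path of a compiled workflow from its CWL file name.

    Assumption: <name>.cwl compiles to <name>/<name>, as in the example above.
    """
    name, ext = posixpath.splitext(posixpath.basename(cwl_file))
    if ext != ".cwl":
        raise ValueError("expected a .cwl file, got: " + cwl_file)
    return posixpath.join(name, name)


def build_run_command(workflow_path, inputs_path):
    """Return the argv list for `dx-cwl run-workflow`, mirroring the example."""
    return ["python", "dx-cwl", "run-workflow", workflow_path, inputs_path]
```

Both arguments to `build_run_command` are paths on the platform, not local paths, so the input JSON and data files must already be uploaded before the command is run.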

Note that the compiled workflow can be used directly as a typical workflow on DNAnexus as well.

Please see the ENCODE example for a more detailed walk-through.

External contributors