Overview

Metasmith provides a type system that describes bioinformatics data products by how they can be generated or consumed by computational tools. This enables a solver to generate Nextflow workflows from target types, given data, and available tools. Generated workflows can then be executed locally or on remote machines using containerization.

Install Metasmith

Usage in 5 steps

1 - Deploy an agent

Metasmith itself is not an agent. Rather, Metasmith spawns and deploys agents to machines on which you intend to execute workflows. Agent deployment is automated; you do not need to install Metasmith again if executing workflows on remote machines.

smith = Agent(...)
smith.Deploy()

2 - Register inputs

Metasmith uses a type system to model how data can be consumed by protocols. Inputs must be registered by assigning a type to them.

inputs = DataInstanceLibrary(...)
inputs.AddItem(
    <file path>,
    <data type>,
)
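To make the idea of registering inputs concrete, here is a minimal, self-contained sketch of a typed registry. `ToyLibrary`, `add_item`, the file names, and the type names are all hypothetical and exist only for illustration; this is not Metasmith's `DataInstanceLibrary` API.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ToyLibrary:
    """Hypothetical stand-in for a typed input library (not Metasmith's API)."""

    items: dict = field(default_factory=dict)

    def add_item(self, path: str, data_type: str) -> None:
        # Registering an input means recording which type the file carries.
        self.items[Path(path)] = data_type

    def types(self) -> set:
        # The set of types a solver could start from.
        return set(self.items.values())


library = ToyLibrary()
library.add_item("reads.fastq", "raw_reads")              # hypothetical file and type
library.add_item("reference.fasta", "reference_genome")   # hypothetical file and type
print(sorted(library.types()))
```

The point of the sketch is that a file path alone says nothing about how a tool can use it; the attached type is what a solver reasons over.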

3 - Generate workflow

Protocols describe transformations between data types. Given the inputs and a list of available transformations, the agent can generate a workflow that produces the target data types by chaining protocols together.

task = smith.GenerateWorkflow(
    <inputs>,
    <transforms>,
    <targets>,
)
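The chaining idea can be sketched with a toy planner. This is not Metasmith's solver: `Transform`, `plan`, and the type names are hypothetical, and the strategy shown (forward chaining followed by a backward pruning pass) is just one simple way to chain typed transformations.

```python
from typing import NamedTuple


class Transform(NamedTuple):
    """Hypothetical protocol: consumes some types and produces others."""

    name: str
    requires: frozenset
    produces: frozenset


def plan(given, transforms, targets):
    """Return transform names, in order, that produce every target type."""
    available = set(given)
    applied = []
    remaining = list(transforms)
    # Forward pass: apply every transform whose requirements are available.
    while not set(targets) <= available:
        runnable = [t for t in remaining if t.requires <= available]
        if not runnable:
            raise ValueError("targets unreachable from the given types")
        for t in runnable:
            available |= t.produces
            applied.append(t)
            remaining.remove(t)
    # Backward pass: keep only the steps that contribute to the targets.
    needed = set(targets)
    kept = []
    for t in reversed(applied):
        if t.produces & needed:
            kept.append(t.name)
            needed = (needed - t.produces) | t.requires
    return list(reversed(kept))


toy_transforms = [
    Transform("assemble", frozenset({"raw_reads"}), frozenset({"contigs"})),
    Transform("annotate", frozenset({"contigs"}), frozenset({"gene_calls"})),
    Transform("qc_report", frozenset({"raw_reads"}), frozenset({"read_qc"})),
]
steps = plan({"raw_reads"}, toy_transforms, {"gene_calls"})
print(steps)
```

In this toy run, `qc_report` is runnable but dropped by the pruning pass because nothing on the path to `gene_calls` depends on it.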

Tip

Browse the standard library of transforms.

4 - Execute workflow

Staging a workflow translates it into Nextflow’s syntax and prepares default configurations for various platforms. When ready, execution is delegated to Nextflow.

smith.StageWorkflow(task)
smith.RunWorkflow(task)
smith.CheckWorkflow(task)
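As a rough illustration of what translating a step into Nextflow's syntax can involve, the sketch below renders one step as a Nextflow DSL2 `process` block. The function `render_process`, the process name, and the `assembler` command are hypothetical; the text Metasmith actually generates is not shown in this overview and may look quite different.

```python
def render_process(name: str, input_name: str, output_name: str, command: str) -> str:
    """Render one step as a Nextflow DSL2 process block (illustrative only)."""
    return "\n".join([
        f"process {name} {{",
        "    input:",
        f"    path {input_name}",
        "",
        "    output:",
        f"    path '{output_name}'",
        "",
        "    script:",
        '    """',
        f"    {command}",
        '    """',
        "}",
    ])


# Hypothetical step: `assembler` is a placeholder command, not a real tool.
rendered = render_process(
    "ASSEMBLE", "reads", "contigs.fasta", "assembler ${reads} > contigs.fasta"
)
print(rendered)
```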

5 - Receive outputs

Outputs are presented as a structured data product but exist on disk as a simple folder, so they remain accessible to both humans and machines.

outputs = DataInstanceLibrary.LoadFrom(
    smith.GetResultSource(task),
    ...
)
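Because the results are an ordinary folder, they can also be inspected with standard tools. The sketch below builds a throwaway folder and indexes its files by extension. The folder layout and file names are assumptions made for illustration, and `index_outputs` is a hypothetical helper, not part of Metasmith.

```python
import tempfile
from pathlib import Path


def index_outputs(folder: Path) -> dict:
    """Group files in a results folder by extension (a stand-in for data types)."""
    index = {}
    for path in sorted(folder.rglob("*")):
        if path.is_file():
            index.setdefault(path.suffix, []).append(
                path.relative_to(folder).as_posix()
            )
    return index


with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp)
    # Hypothetical output files standing in for a real results folder.
    (results / "contigs.fasta").write_text(">c1\nACGT\n")
    (results / "genes.gff").write_text("##gff-version 3\n")
    found = index_outputs(results)

print(found)
```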

Documentation