Overview
Metasmith provides a type system that describes bioinformatics data products by how they can be generated or consumed by computational tools. Given target types, input data, and a set of available tools, a solver uses these types to generate a Nextflow workflow. Generated workflows can then be executed directly or on remote machines using containerization.
Usage in 5 steps
1 - Deploy an agent
Metasmith itself is not an agent. Rather, Metasmith spawns and deploys agents to the machines on which you intend to execute workflows. Agent deployment is automated, so you do not need to install Metasmith again on remote machines that will execute workflows.
smith = Agent(...)
smith.Deploy()
2 - Register inputs
Metasmith uses a type system to model how data can be consumed by protocols. Inputs must be registered by assigning a type to them.
inputs = DataInstanceLibrary(...)
inputs.AddItem(
    <file path>,
    <data type>,
)
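To make the idea of typed inputs concrete, here is a minimal, self-contained sketch in plain Python. It does not use Metasmith's API: `TypedInput`, `ToyLibrary`, the file paths, and the type names `raw_reads` and `reference_genome` are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass(frozen=True)
class TypedInput:
    """A file path tagged with a data type name (hypothetical, not a Metasmith class)."""
    path: Path
    data_type: str


@dataclass
class ToyLibrary:
    """Toy stand-in for an input library: a plain list of typed inputs."""
    items: list = field(default_factory=list)

    def add_item(self, path, data_type):
        # Registering an input means recording which type it carries.
        self.items.append(TypedInput(Path(path), data_type))

    def types(self):
        # The set of types present is what a solver could start from.
        return {item.data_type for item in self.items}

    def of_type(self, data_type):
        # Look up every registered path that carries the given type.
        return [item.path for item in self.items if item.data_type == data_type]


library = ToyLibrary()
library.add_item("reads/sample_1.fastq", "raw_reads")
library.add_item("refs/genome.fasta", "reference_genome")
```

The point of the sketch is that a registered input is a pair of a location and a type, and the types, rather than the file names, are what later steps reason about.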
3 - Generate workflow
Protocols describe transformations between data types. By providing the inputs and a list of available transformations, we can ask the agent to generate a workflow to produce target data types by chaining multiple protocols together.
task = smith.GenerateWorkflow(
    <inputs>,
    <transforms>,
    <targets>,
)
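The chaining described above can be illustrated with a toy planner. This is a sketch under stated assumptions, not Metasmith's solver: the `Transform` class, the `plan` function, and every transform and type name below are hypothetical. It uses simple forward chaining over type names, then a backward pass that drops steps the targets do not depend on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transform:
    """A hypothetical protocol: consumes some types, produces others."""
    name: str
    consumes: frozenset
    produces: frozenset


def plan(available, transforms, targets):
    """Return an ordered list of transform names yielding all targets, or None."""
    have = set(available)
    targets = set(targets)
    steps = []
    progress = True
    # Forward pass: apply any transform whose inputs exist and that adds a new type.
    while not targets <= have and progress:
        progress = False
        for t in transforms:
            if t.consumes <= have and not t.produces <= have:
                steps.append(t)
                have |= t.produces
                progress = True
    if not targets <= have:
        return None  # some target type cannot be produced
    # Backward pass: keep only the steps that the targets actually depend on.
    needed = set(targets)
    kept = []
    for t in reversed(steps):
        if t.produces & needed:
            kept.append(t)
            needed |= t.consumes
    return [t.name for t in reversed(kept)]


transforms = [
    Transform("trim", frozenset({"raw_reads"}), frozenset({"clean_reads"})),
    Transform("assemble", frozenset({"clean_reads"}), frozenset({"contigs"})),
    Transform("annotate", frozenset({"contigs"}), frozenset({"gene_annotations"})),
    Transform("align", frozenset({"clean_reads", "reference_genome"}), frozenset({"alignments"})),
]

annotation_plan = plan({"raw_reads"}, transforms, {"gene_annotations"})
alignment_plan = plan({"raw_reads", "reference_genome"}, transforms, {"alignments"})
missing_plan = plan({"raw_reads"}, transforms, {"alignments"})
```

Here `annotation_plan` chains three transforms, `alignment_plan` skips the assembly branch because the target does not need it, and `missing_plan` is `None` because no reference genome was supplied.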
Tip: Browse the standard library of transforms.
4 - Execute workflow
Staging a workflow translates it into Nextflow’s syntax and prepares default configurations for various platforms. When ready, execution is delegated to Nextflow.
smith.StageWorkflow(task)
smith.RunWorkflow(task)
smith.CheckWorkflow(task)
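As a rough picture of what translating a step into Nextflow's syntax involves, the sketch below renders a single step as the text of a Nextflow DSL2 process. It is not how Metasmith stages workflows: `render_process`, its parameters, and the placeholder `cat` command are assumptions made for illustration.

```python
def render_process(name, input_var, output_file, command):
    """Render one step as the text of a Nextflow DSL2 process definition."""
    lines = [
        f"process {name.upper()} {{",
        "    input:",
        f"    path {input_var}",
        "",
        "    output:",
        f"    path '{output_file}'",
        "",
        "    script:",
        '    """',
        f"    {command}",
        '    """',
        "}",
    ]
    return "\n".join(lines)


process_text = render_process(
    name="trim",
    input_var="reads",
    output_file="trimmed.fastq",
    # Placeholder shell command, not a real trimming tool.
    command="cat ${reads} > trimmed.fastq",
)
print(process_text)
```

The printed text declares one input path, one output path, and a script body, which is the basic shape of a process that Nextflow executes.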
5 - Receive outputs
Outputs are presented as a structured data product, but they exist on disk as a simple folder so that both humans and machines can access them easily.
outputs = DataInstanceLibrary.LoadFrom(
    smith.GetResultSource(task),
    ...
)
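Because the results are a plain folder, they can also be inspected with ordinary file tools. The sketch below inventories a folder using only the Python standard library. The folder layout and file names are made up for the example and say nothing about how Metasmith actually arranges its outputs; `list_outputs` is a hypothetical helper.

```python
import tempfile
from pathlib import Path


def list_outputs(folder):
    """Return the relative paths of all files under a folder, sorted."""
    root = Path(folder)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


# Build a throwaway folder standing in for a results folder (layout is invented).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "annotations").mkdir()
    (Path(tmp) / "annotations" / "genes.gff").write_text("##gff-version 3\n")
    (Path(tmp) / "contigs.fasta").write_text(">contig_1\nACGT\n")
    found = list_outputs(tmp)
```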