data2code/tamarind

tamarind - Python wrapper and command line tools for interacting with Tamarind.bio

Installation

pip install .

Configuration

To use tamarind, we first need a Tamarind API key. Set it in the TAMARIND_API_KEY environment variable (substitute the real key for the placeholder below).

export TAMARIND_API_KEY=01234567-8901-2345-6789-012345678901
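Before submitting jobs from a script, it can help to confirm that the key is actually visible to Python. The following is a minimal sketch, not part of the tamarind package: get_api_key and PLACEHOLDER are hypothetical names, and how the wrapper itself reads the variable is not shown in this README.

```python
import os

# The example value from this README; it is not a usable key.
PLACEHOLDER = "01234567-8901-2345-6789-012345678901"


def get_api_key(env=None):
    """Return the key stored in TAMARIND_API_KEY, or raise a clear error.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    key = env.get("TAMARIND_API_KEY", "").strip()
    if not key:
        raise RuntimeError("TAMARIND_API_KEY is not set; export it first.")
    if key == PLACEHOLDER:
        raise ValueError("TAMARIND_API_KEY still holds the README placeholder.")
    return key
```

Calling get_api_key() at the top of a script fails early with a readable message instead of a failed API call later.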

Command Line Tools

The following commands (under /bin in the package) should have been installed into PATH:

tmrrun - run a model from the command line
tmrmonitor - check jobs under our account; used to monitor submitted jobs until they complete
tmrdownload - download prediction results
tmrdeljob - delete job entries; for batch jobs, the associated upload folder is deleted as well
tmrdelfile - delete files/folders

Use -h to list the syntax.

Frequently Used Commands:

List all my jobs:

tmrmonitor -l
	List all the jobs, grouped by their status, such as Completed, Stopped, etc.
	Entries ending with "/*" are batch submissions.
	-l -e: expand batch submissions into their sub jobs.
	-o job.csv: will save job entries into a .csv file

Monitor the progress of running jobs (or any jobs)

tmrmonitor -t alphafold
	Monitor alphafold jobs only
tmrmonitor myjob
	Monitor job entry myjob
tmrmonitor mybatch
	Monitor batch job mybatch
tmrmonitor 
	Monitor all jobs/batches as a whole

Download results

tmrdownload -o out myjob
tmrdownload -o out mybatch
tmrdownload -o out myjob mybatch
	Download results into designated output folder
	Multiple jobs/batches can be provided.
tmrdownload --all
	Download results for all jobs/batches.

If a model can produce a metrics file, i.e., if the tamarind.model.MyModel.results method is defined, results.MyModel.csv file(s) are generated and placed into the corresponding output folder(s).

Delete Jobs/Batches

tmrdeljob myjob
tmrdeljob mybatch
tmrdeljob myjob mybatch
tmrdeljob --all

tmrdeljob deletes only the job entries and, for batch submissions, the associated user-uploaded files. Results generated by the system are not touched by tmrdeljob; they are deleted automatically after 5 days.

Delete User Uploaded Files

tmrdelfile mybatch
tmrdelfile --all

Run a Model

If tamarind/model/MyModel.py is provided, we can use it to run a model in batch mode.

tmrrun list
	List all models that support command line execution

Run AlphaFold

tmrrun alphafold -h
	List syntax for running a model

tmrrun alphafold -n myrun -o output_folder input.csv
	All arguments after the model name are passed to MyModel.py, which can also be used as a command,
	i.e., behind the scenes, the command "tamarind/model/MyModel.py [options]" is executed.

In this specific example, the command applies the alphafold model to input.csv, monitors progress, and saves the results into output_folder. input.csv is a CSV file containing at least two columns, name and sequence:
	name: required column, must be unique within the input; avoid using special characters
	sequence: required column; ":" is used to concatenate chains
	template: optional, custom template file in .cif format
This command enters monitoring mode and waits until the job is completed.
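Since a malformed input.csv is only detected after submission, a local check can save a round trip. The following is a minimal sketch based on the column rules above; check_input_csv is a hypothetical helper, not part of tamarind. It does not test for special characters in names, because this README does not define which characters are problematic.

```python
import csv
import io

REQUIRED = ("name", "sequence")


def check_input_csv(handle):
    """Return a list of problems found in an input.csv-style text handle."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing required column(s): {', '.join(missing)}"]

    problems = []
    seen = set()
    for row_no, row in enumerate(reader, start=1):  # counts data rows only
        name = (row["name"] or "").strip()
        sequence = (row["sequence"] or "").strip()
        template = (row.get("template") or "").strip()
        if not name:
            problems.append(f"row {row_no}: empty name")
        elif name in seen:
            problems.append(f"row {row_no}: duplicate name {name!r}")
        seen.add(name)
        if not sequence:
            problems.append(f"row {row_no}: empty sequence")
        if template and not template.lower().endswith(".cif"):
            problems.append(f"row {row_no}: template {template!r} is not a .cif file")
    return problems


if __name__ == "__main__":
    # Made-up rows for illustration; ":" separates the chains of a complex.
    sample = "name,sequence\ncomplex1,MKTAYIAK:GSHMLEDP\nmono1,MKVLAAGI\n"
    print(check_input_csv(io.StringIO(sample)) or "input looks fine")
```

For a real file, open it with newline="" as the csv module recommends, e.g. check_input_csv(open("input.csv", newline="")).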

tmrdeljob myrun
	If the results look good and we do not need the record any more, this deletes the batch job.

We may use -W to avoid waiting: the submission then exits without monitoring. In that case we use tmrmonitor, tmrdownload, and tmrdeljob to manage the submission manually.
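The manual sequence after a -W submission can be scripted. The following is a sketch under stated assumptions: detached_workflow and run_workflow are hypothetical helpers, the position of -W among the other options is a guess (check "tmrrun alphafold -h"), and tmrmonitor is assumed to return once the batch has finished. Only commands and flags shown in this README are used.

```python
import shlex
import subprocess


def detached_workflow(model, batch, input_csv, out_dir):
    """Return argv lists for a submit / monitor / download / delete cycle."""
    return [
        # Submit without waiting. Assumption: -W may sit next to -n and -o.
        ["tmrrun", model, "-W", "-n", batch, "-o", out_dir, input_csv],
        # Follow the batch until it completes.
        ["tmrmonitor", batch],
        # Fetch the results into the output folder.
        ["tmrdownload", "-o", out_dir, batch],
        # Remove the job entries once the results are safely downloaded.
        ["tmrdeljob", batch],
    ]


def run_workflow(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in steps:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)  # stop at the first failure


if __name__ == "__main__":
    steps = detached_workflow("alphafold", "myrun", "input.csv", "output_folder")
    run_workflow(steps)  # dry run: only prints the commands
```

The dry-run default keeps the script harmless until the printed commands have been reviewed; pass dry_run=False to execute them.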

Run Boltz

Models should support two additional arguments, --debug and --setting. We use Boltz as the example.

tmrrun boltz --debug
	The --debug option will enable the API calls to print out the response text

tmrrun boltz --setting='{"version":"1.0.0"}' -n myrun -o output_folder input.csv
	Each model has default settings; we can override them by providing --setting with a JSON string.
	Here the default boltz model is version 2, but we can override it to use boltz-1.
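Hand-writing JSON inside shell quotes is error-prone, so the --setting argument can be generated instead. The following is a minimal sketch: setting_argument and merge_settings are hypothetical helpers, the default values in the example are invented for illustration, and the shallow key-by-key merge is an assumption about how overrides are applied, not a description of the actual implementation.

```python
import json
import shlex


def setting_argument(overrides):
    """Build a shell-safe --setting=... argument from a Python dict."""
    compact = json.dumps(overrides, separators=(",", ":"))
    return "--setting=" + shlex.quote(compact)


def merge_settings(defaults, setting_json=None):
    """Overlay the keys of a --setting JSON string on a dict of defaults."""
    merged = dict(defaults)
    if setting_json:
        overrides = json.loads(setting_json)
        if not isinstance(overrides, dict):
            raise ValueError("--setting must be a JSON object")
        merged.update(overrides)  # assumed: overrides win, other keys are kept
    return merged


if __name__ == "__main__":
    print(setting_argument({"version": "1.0.0"}))
    # Hypothetical defaults, for illustration only.
    print(merge_settings({"version": "2", "other": 1}, '{"version":"1.0.0"}'))
```

For the dict {"version": "1.0.0"}, setting_argument reproduces the quoted form used in the command above.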

Developer

We may use tamarind/model/alphafold.py as an example to learn how to use tamarind/tamarind.py to interact with Tamarind. The command line tools are also good examples to learn the main methods.

To support a new model, we clone alphafold.py and modify it. Let us use boltz.py as the example (it is simpler than alphafold.py). The .py file must be named exactly after the model name that matches job_type, and the class must be named App. The default parameters (settings) can be obtained from the API documentation by picking the model from the API drop-down. AlphaFold supports custom template files, so alphafold.py contains extra logic to handle file upload: we upload the unique set of user files into a folder named after the batch_name, and tmrdeljob and tmrdelfile delete these files by batch name. We may also implement a results() method that compiles one merged metrics file per batch. This file is useful for finding the best predicted model for each job entry.
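The naming rules (file named after job_type, class named App) are the kind of convention that lets a dispatcher find a model by name alone. The following sketch illustrates that idea; load_app is a hypothetical function, and whether tmrrun resolves models this way is an assumption, not something this README confirms.

```python
import importlib


def load_app(job_type, package="tamarind.model"):
    """Import <package>.<job_type> and return the App class it defines."""
    module = importlib.import_module(f"{package}.{job_type}")
    try:
        return module.App
    except AttributeError:
        raise ImportError(
            f"{module.__name__} does not define a class named App"
        ) from None
```

With tamarind installed, load_app("boltz") would return the App class from tamarind/model/boltz.py. A misspelled file name or class name surfaces as an ImportError, which is why both names must match exactly.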

Python Scripting

We can use the model code directly from Python:

from tamarind.model.alphafold import App as AlphaFold
app = AlphaFold()
# S_name, S_seq, S_template: lists of names, sequences, and (optionally) custom templates
# wait controls whether we wait until all predictions are completed
app.batch("mybatch", S_name, S_seq, S_template, output_folder=".", wait=True)
# if we do not want to keep the batch job entries, delete them
app.delete()

# if we need to download the results
app.download_batch("mybatch", output_folder="out_mybatch")
# if the model supports merging metrics for the batch
AlphaFold.results(output_folder="out_mybatch")

About

A prototype of Tamarind.bio API wrapper
