UnitTestConstruction
The purpose of putting several unit tests in place is to guarantee the consistency and accuracy of the Python bindings we have written over time.
We have to rely on the original GTFold shell utility to generate the expected output, since auxiliary programs like ViennaRNA and RNAStructure use significantly different methods to generate their folding data. The command line specification for GTFold is found in this online documentation.
We use the cmake-enabled build of GTFold forked here.
$ ./bin/gtmfe -v -d2 -m -dS --exactintloop -m --prefilter 2 ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.seq
GTfold: A Scalable Multicore Code for RNA Secondary Structure Prediction
(c) 2007-2011 D.A. Bader, C.E. Heitsch, S.C. Harvey
Georgia Institute of Technology
Checking for environ variable 'GTFOLDDATADIR', found
Run Configuration:
+ enabled terminal mismatch calculations
+ running with prefilter value = 2
- thermodynamic parameters: /Users/mschmidt34/GTDMMBSoftware/GTFoldPython/Python/Testing/ExtraGTFoldThermoData/GTFoldTurner99/
- input file: ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.seq
- sequence length: 120
- output file: E.coli.fa.ct
Computing minimum free energy structure...
Done.
Results:
- Minimum Free Energy: -55.0000 kcal/mol
- MFE runtime: 0.006120 seconds
UGCCUGGCGGCCGUAGCGCGGUGGUCCCACCUGACCCCAUGCCGAACUCAGAAGUGAAACGCCGUAGCGCCGAUGGUAGUGUGGGGUCUCCCCAUGCGAGAGUAGGGAACUGCCAGGCAU
((((((((((.........((((....)))).((((((((((.....(((....)))...(((((.......))))).))))))))))..(((.(((....))))))..)))))))))).
MFE structure saved in .ct format to E.coli.fa.ct
NOTE: Specifying the historical --rnafold option to the GTFold gtmfe utility changes the MFE results slightly. The Python bindings do not set this option by default, so comparing against reference output generated with --rnafold can produce failed unit tests unless the corresponding option is set up in the test runners first.
$ ./bin/gtmfe -v -d2 -m -dS --exactintloop -m --prefilter 2 ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.seq -c ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.cons
GTfold: A Scalable Multicore Code for RNA Secondary Structure Prediction
(c) 2007-2011 D.A. Bader, C.E. Heitsch, S.C. Harvey
Georgia Institute of Technology
- Running with constraints
Run Configuration:
+ enabled terminal mismatch calculations
+ running with prefilter value = 2
- using constraint file: ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.cons
- thermodynamic parameters: /Users/mschmidt34/GTDMMBSoftware/gtfold/gtfold-mfe/data/Turner99/
- input file: ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.seq
- sequence length: 120
- output file: E.coli.fa.ct
Computing minimum free energy structure...
Done.
Results:
- Minimum Free Energy: -55.0000 kcal/mol
- MFE runtime: 0.010924 seconds
UGCCUGGCGGCCGUAGCGCGGUGGUCCCACCUGACCCCAUGCCGAACUCAGAAGUGAAACGCCGUAGCGCCGAUGGUAGUGUGGGGUCUCCCCAUGCGAGAGUAGGGAACUGCCAGGCAU
((((((((((.........((((....)))).((((((((((.....(((....)))...(((((.......))))).))))))))))..(((.(((....))))))..)))))))))).
...............PPPPPPPP....................................PPPPPP.PP....................................................
MFE structure saved in .ct format to E.coli.fa.ct
Verifying that structure fulfills constraint criteria... OK
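The reference values in the transcripts above (the MFE and the dot-bracket structure) can be pulled out of the verbose output programmatically rather than copied by hand. The following is a minimal sketch, assuming the output layout is exactly as printed above: a "Minimum Free Energy" line, followed later by an uppercase RNA sequence line and a structure line of equal length. The function name parse_gtmfe_output is hypothetical and is not part of GTFold or GTFoldPython.

```python
import re

# Patterns are assumptions based only on the sample transcripts above.
_MFE_RE = re.compile(r"Minimum Free Energy:\s*(-?\d+(?:\.\d+)?)\s*kcal/mol")
_SEQ_RE = re.compile(r"[ACGU]+")
_STRUCT_RE = re.compile(r"[().]+")


def parse_gtmfe_output(text):
    """Return (mfe, sequence, dot_bracket) parsed from verbose gtmfe output."""
    match = _MFE_RE.search(text)
    if match is None:
        raise ValueError("no 'Minimum Free Energy' line found")
    mfe = float(match.group(1))

    lines = [line.strip() for line in text.splitlines()]
    # The structure is printed on the line directly below the sequence.
    for seq, struct in zip(lines, lines[1:]):
        if (_SEQ_RE.fullmatch(seq) and _STRUCT_RE.fullmatch(struct)
                and len(seq) == len(struct)):
            return mfe, seq, struct
    raise ValueError("no sequence/structure pair found")
```

The constraint mask line (the one containing P characters) is ignored by this sketch because it does not consist solely of parentheses and dots.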
The next section describes how to run only selected unit tests. The procedure below performs the time-consuming task of running ALL of the unit tests.
From the main GTFoldPython directory, issue the following commands (this will take a while):
cd Python
make test
Successful unit test output should resemble the following:
|| GTFOLD-PYTHON UNIT TEST INFO: (test1_16S_K00421)
## 1
>> Test Purpose: Basic MFE calculation without constraints
>> Organism: d.16.a.H.volcanii.bpseq
>> Base Sequence: [#1474] AUUCCGGU ... CUGGAUCACCUCCUG
>> Constraints: None enabled
////
... TEST PASSED! [OK]
.
|| GTFOLD-PYTHON UNIT TEST INFO: (test2_5S_EColiFa)
## 2
>> Test Purpose: Basic MFE calculation without constraints
>> Organism: E. coli.fs (Native structure) -- 5S, rRNA
>> Base Sequence: [#120] UGCCUGGC ... GGAACUGCCAGGCAU
>> Constraints: None enabled
////
... TEST PASSED! [OK]
.
|| GTFOLD-PYTHON UNIT TEST INFO: (test2_5S_EColiFa_withcons_RNAfold)
## 3
>> Test Purpose: Basic MFE calculation *with* constraints
>> Organism: E. coli.fs (Native structure) -- 5S, rRNA
>> Base Sequence: [#120] UGCCUGGC ... GGAACUGCCAGGCAU
>> Constraints: 8 Total, 0 Forced, 8 Prohibited
////
... TEST PASSED! [OK]
.
|| GTFOLD-PYTHON UNIT TEST INFO: (test3_tRNA_yeastFa)
## 4
>> Test Purpose: Basic MFE calculation without constraints
>> Organism: tRNA(asp), yeast.fa (Native structure) ENERGY = -34.3
>> Base Sequence: [#75] GCCGUGAU ... CCCGUCGCGGCGCCA
>> Constraints: None enabled
////
... TEST PASSED! [OK]
.
|| GTFOLD-PYTHON UNIT TEST INFO: (test3_tRNA_yeastFa_withcons_RNAfold)
## 5
>> Test Purpose: Basic MFE calculation *with* constraints
>> Organism: tRNA(asp), yeast.fa (Native structure) ENERGY = -34.3
>> Base Sequence: [#75] GCCGUGAU ... CCCGUCGCGGCGCCA
>> Constraints: 325 Total, 7 Forced, 318 Prohibited
////
... TEST PASSED! [OK]
.
|| GTFOLD-PYTHON UNIT TEST INFO: (test4_other_humanFa)
## 6
>> Test Purpose: Basic MFE calculation without constraints
>> Organism: Telomerase pseudoknot, human.fa (Native structure)
>> Base Sequence: [#47] GGGCUGUU ... ACAAAAAAAGUCAGC
>> Constraints: None enabled
////
... TEST PASSED! [OK]
.
|| GTFOLD-PYTHON UNIT TEST INFO: (test4_other_humanFa_withcons_RNAfold)
## 7
>> Test Purpose: Basic MFE calculation *with* constraints
>> Organism: Telomerase pseudoknot, human.fa (Native structure)
>> Base Sequence: [#47] GGGCUGUU ... ACAAAAAAAGUCAGC
>> Constraints: 5 Total, 3 Forced, 2 Prohibited
////
... TEST PASSED! [OK]
.
|| GTFOLD-PYTHON UNIT TEST INFO: (test5_other_PSyringae)
## 8
>> Test Purpose: Basic MFE calculation without constraints
>> Organism: ENERGY = -45.2 F Sensing RS (Native structure)
>> Base Sequence: [#66] GCAUUGGA ... GAUGAUGCCUACAGA
>> Constraints: None enabled
////
... TEST PASSED! [OK]
.
|| GTFOLD-PYTHON UNIT TEST INFO: (test5_other_PSyringae_withcons_RNAfold)
## 9
>> Test Purpose: Basic MFE calculation *with* constraints
>> Organism: ENERGY = -45.2 F Sensing RS (Native structure)
>> Base Sequence: [#66] GCAUUGGA ... GAUGAUGCCUACAGA
>> Constraints: 6 Total, 6 Forced, 0 Prohibited
////
... TEST PASSED! [OK]
.
----------------------------------------------------------------------
Ran 9 tests in 20.284s
OK
It is possible to run only a subset of the unit tests. The following enumeration lists the flag values, which can be combined with a bitwise OR, that select the subclasses of test cases to run:
NO_TESTS = 0
MFE_BASE_TESTS = 1
MFE_ROGERS_RNADB_TESTS = 2
MFE_CUSTOM_PARAMS_TESTS = 4
PFUNC_TESTS = 8
SUBOPT_TESTS = 16
NOT_PASSING = 32
ALL_TESTS = 0xffff
We export a hexadecimal integer string to the environment variable GTFP_TEST_SUITE indicating which test types to run, and then run only those tests as follows:
# ... TO RUN ALL TESTS:
$ export GTFP_TEST_SUITE=0xffff
$ make && make test
$ unset GTFP_TEST_SUITE
For example, to run only the unit tests that are not passing (convenient for development work), run the following sequence of commands:
$ export GTFP_TEST_SUITE=0x0020
$ make && make test
$ unset GTFP_TEST_SUITE
Here, we note that 32 (base-10) is written as the 16-bit hexadecimal value 0x0020.
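To select several test classes at once, the flag values are OR'ed together and the result is written as a four-digit hexadecimal string. The sketch below shows that arithmetic and one way a runner might decode the variable. The flag values are copied from the enumeration above; the helper names format_suite, selected_suites and is_enabled are hypothetical, and the fallback to ALL_TESTS when the variable is unset is an assumption, since the parsing code used by the actual test runner is not shown here.

```python
import os

# Flag values copied from the enumeration listed above.
NO_TESTS = 0
MFE_BASE_TESTS = 1
MFE_ROGERS_RNADB_TESTS = 2
MFE_CUSTOM_PARAMS_TESTS = 4
PFUNC_TESTS = 8
SUBOPT_TESTS = 16
NOT_PASSING = 32
ALL_TESTS = 0xFFFF


def format_suite(mask):
    """Render a flag mask as a 16-bit hexadecimal string, e.g. 32 -> '0x0020'."""
    return "0x{:04x}".format(mask)


def selected_suites(environ=None):
    """Decode GTFP_TEST_SUITE (hypothetical helper; unset is assumed to mean ALL_TESTS)."""
    environ = os.environ if environ is None else environ
    raw = environ.get("GTFP_TEST_SUITE", format_suite(ALL_TESTS))
    return int(raw, 16)  # base 16 accepts an optional '0x' prefix


def is_enabled(flag, environ=None):
    """True when every bit of a non-zero flag is set in the selected mask."""
    return flag != 0 and (selected_suites(environ) & flag) == flag


if __name__ == "__main__":
    # Base MFE tests plus partition function tests: 1 | 8 = 9.
    print(format_suite(MFE_BASE_TESTS | PFUNC_TESTS))
```

The printed string is what would be exported, in the same way as the 0x0020 example above.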
We have the following source code in the GTFoldPython unit test class to pre-configure special options before calling the GTFold routines:
def setUp(self):
    GTFoldPython._ConstructLibGTFold()
    GTFoldPython.Config(debugging = False)
    #GTFoldPython.Config(verbose=1, debugging=1)
    GTFoldPython.EnableTerminalMismatch()
    GTFoldPython.SetThermodynamicParametersFromDefaults("Turner99")
    self._testPurpose = ""
    self._testConstraints = []
    self._orgName = ""
    self._orgAccNo = ""
    self._orgBaseSeq = ""
    self._testResultExpectedMFE = None
    self._testResultExpectedMFEStruct = None
    ##
These correspond to setting the historical (command line utility syntax) option -m (for terminal mismatch) and setting the default energy model to the Turner99 model data. Note that to set this precise energy model using the command line utility variants of GTFold, it may be necessary to override the default data location by exporting an environment variable in the terminal first:
$ export GTFOLDDATADIR=$(greadlink -f ~/GTFoldPython/Python/Testing/ExtraGTFoldThermoData/GTFoldTurner99)
# Then call: gtmfe [...]
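When reference output is regenerated from a script instead of an interactive shell, the same override can be applied to a single gtmfe process without changing the shell environment. The following is a rough sketch, assuming the flags and the ./bin/gtmfe path used in the examples above; the function names build_gtmfe_call and run_gtmfe are hypothetical, and os.path.realpath stands in for greadlink -f.

```python
import os
import subprocess

# Flags taken from the example invocations above.
GTMFE_FLAGS = ["-v", "-d2", "-m", "-dS", "--exactintloop", "--prefilter", "2"]


def build_gtmfe_call(seq_path, data_dir, cons_path=None, gtmfe="./bin/gtmfe"):
    """Return (argv, env) for one gtmfe run with GTFOLDDATADIR overridden."""
    argv = [gtmfe] + GTMFE_FLAGS + [seq_path]
    if cons_path is not None:
        argv += ["-c", cons_path]  # constraint file, as in the second example
    # Resolve '~' and symlinks, mirroring $(greadlink -f ...).
    resolved = os.path.realpath(os.path.expanduser(data_dir))
    env = dict(os.environ, GTFOLDDATADIR=resolved)
    return argv, env


def run_gtmfe(seq_path, data_dir, cons_path=None, gtmfe="./bin/gtmfe"):
    """Run gtmfe and return its standard output as text."""
    argv, env = build_gtmfe_call(seq_path, data_dir, cons_path, gtmfe)
    result = subprocess.run(argv, env=env, capture_output=True,
                            text=True, check=True)
    return result.stdout
```

Building the argument list separately from running it keeps the command easy to inspect or log before the utility is actually invoked.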
Some of the options passed to gtmfe in the above examples are algorithmic in nature (cf. -d2 -dS --exactintloop -m --prefilter 2; the -v switch only enables verbose output). That is, they change how the underlying implementation runs, but the results produced do not vary when these options are passed to the utility.