UnitTestConstruction

Maxie D. Schmidt edited this page Dec 19, 2021 · 1 revision

GTFoldPython library unit tests

Construction of the unit tests

The unit tests are in place to check the consistency and accuracy of the Python bindings as they evolve over time. The expected values have to come from the output of the original GTFold shell utility, because auxiliary programs such as ViennaRNA and RNAStructure use significantly different methods to generate their folding data. The command-line specification for GTFold is found in the GTFold online documentation. We use a cmake-enabled fork of GTFold.

Commands used to generate the testing data

Unit test 1 (no constraints)

$ ./bin/gtmfe -v -d2 -m -dS --exactintloop -m --prefilter 2 ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.seq 
GTfold: A Scalable Multicore Code for RNA Secondary Structure Prediction
(c) 2007-2011  D.A. Bader, C.E. Heitsch, S.C. Harvey
Georgia Institute of Technology

Checking for environ variable 'GTFOLDDATADIR', found 
Run Configuration:
+ enabled terminal mismatch calculations
+ running with prefilter value = 2
- thermodynamic parameters: /Users/mschmidt34/GTDMMBSoftware/GTFoldPython/Python/Testing/ExtraGTFoldThermoData/GTFoldTurner99/
- input file: ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.seq
- sequence length: 120
- output file: E.coli.fa.ct

Computing minimum free energy structure...
Done.

Results:
- Minimum Free Energy:     -55.0000 kcal/mol
- MFE runtime:  0.006120 seconds


UGCCUGGCGGCCGUAGCGCGGUGGUCCCACCUGACCCCAUGCCGAACUCAGAAGUGAAACGCCGUAGCGCCGAUGGUAGUGUGGGGUCUCCCCAUGCGAGAGUAGGGAACUGCCAGGCAU
((((((((((.........((((....)))).((((((((((.....(((....)))...(((((.......))))).))))))))))..(((.(((....))))))..)))))))))).

MFE structure saved in .ct format to E.coli.fa.ct

NOTE: Passing the historical --rnafold option to the GTFold gtmfe utility changes the MFE results slightly. The Python bindings do not set this option by default, so reference data generated with --rnafold can cause unit tests to fail unless the matching option is first enabled in the test runners.

Unit test 2 (with constraints)

$ ./bin/gtmfe -v -d2 -m -dS --exactintloop -m --prefilter 2 ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.seq -c ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.cons
GTfold: A Scalable Multicore Code for RNA Secondary Structure Prediction
(c) 2007-2011  D.A. Bader, C.E. Heitsch, S.C. Harvey
Georgia Institute of Technology

- Running with constraints
Run Configuration:
+ enabled terminal mismatch calculations
+ running with prefilter value = 2
- using constraint file: ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.cons
- thermodynamic parameters: /Users/mschmidt34/GTDMMBSoftware/gtfold/gtfold-mfe/data/Turner99/
- input file: ../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.seq
- sequence length: 120
- output file: E.coli.fa.ct

Computing minimum free energy structure...
Done.

Results:
- Minimum Free Energy:     -55.0000 kcal/mol
- MFE runtime:  0.010924 seconds


UGCCUGGCGGCCGUAGCGCGGUGGUCCCACCUGACCCCAUGCCGAACUCAGAAGUGAAACGCCGUAGCGCCGAUGGUAGUGUGGGGUCUCCCCAUGCGAGAGUAGGGAACUGCCAGGCAU
((((((((((.........((((....)))).((((((((((.....(((....)))...(((((.......))))).))))))))))..(((.(((....))))))..)))))))))).
...............PPPPPPPP....................................PPPPPP.PP....................................................

MFE structure saved in .ct format to E.coli.fa.ct
Verifying that structure fulfills constraint criteria... OK
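
The reference values in the two runs above (the MFE and the dot-bracket structure) can be pulled out of captured gtmfe output with a short script. The following is a minimal sketch, assuming the verbose output format shown above; parse_gtmfe_output and is_balanced are hypothetical helper names and are not part of GTFoldPython:

```python
import re

# Matches the energy line, e.g. "- Minimum Free Energy:     -55.0000 kcal/mol"
MFE_LINE = re.compile(r"Minimum Free Energy:\s*(-?\d+(?:\.\d+)?)\s*kcal/mol")
# A dot-bracket line consists only of '(', ')' and '.'
STRUCT_LINE = re.compile(r"^[().]+$")


def parse_gtmfe_output(text):
    """Return (mfe, structure) from captured `gtmfe -v` output.

    Hypothetical helper for illustration; not part of GTFoldPython.
    """
    match = MFE_LINE.search(text)
    if match is None:
        raise ValueError("no 'Minimum Free Energy' line found")
    mfe = float(match.group(1))
    for line in text.splitlines():
        line = line.strip()
        # The sequence line (letters) and the constraint line (contains 'P')
        # do not match, so the first hit is the predicted structure.
        if line and STRUCT_LINE.match(line):
            return mfe, line
    raise ValueError("no dot-bracket structure line found")


def is_balanced(structure):
    """Check that every '(' in a dot-bracket string has a matching ')'."""
    depth = 0
    for ch in structure:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0
```

Applied to either transcript above, this sketch should yield an MFE of -55.0 together with the 120-character structure line, which can then be compared against the values returned by the Python bindings.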

Running the unit tests (ALL possible tests)

The next section describes how to run only selected unit tests. The procedure below performs the time-consuming task of running ALL of the unit tests. From the main GTFoldPython directory, issue the following commands (expect this to take a while):

$ cd Python
$ make test

Successful unit test output should resemble the following:

|| GTFOLD-PYTHON UNIT TEST INFO: (test1_16S_K00421)
    ## 1
    >> Test Purpose:    Basic MFE calculation without constraints
    >> Organism:        d.16.a.H.volcanii.bpseq
    >> Base Sequence:   [#1474] AUUCCGGU ... CUGGAUCACCUCCUG
    >> Constraints:     None enabled
////
 ... TEST PASSED! [OK]
.


|| GTFOLD-PYTHON UNIT TEST INFO: (test2_5S_EColiFa)
    ## 2
    >> Test Purpose:    Basic MFE calculation without constraints
    >> Organism:         E. coli.fs (Native structure) -- 5S, rRNA
    >> Base Sequence:   [#120] UGCCUGGC ... GGAACUGCCAGGCAU
    >> Constraints:     None enabled
////
 ... TEST PASSED! [OK]
.

|| GTFOLD-PYTHON UNIT TEST INFO: (test2_5S_EColiFa_withcons_RNAfold)
    ## 3
    >> Test Purpose:    Basic MFE calculation *with* constraints
    >> Organism:         E. coli.fs (Native structure) -- 5S, rRNA
    >> Base Sequence:   [#120] UGCCUGGC ... GGAACUGCCAGGCAU
    >> Constraints:     8 Total, 0 Forced, 8 Prohibited
////

 ... TEST PASSED! [OK]
.


|| GTFOLD-PYTHON UNIT TEST INFO: (test3_tRNA_yeastFa)
    ## 4
    >> Test Purpose:    Basic MFE calculation without constraints
    >> Organism:        tRNA(asp), yeast.fa (Native structure) ENERGY = -34.3
    >> Base Sequence:   [#75] GCCGUGAU ... CCCGUCGCGGCGCCA
    >> Constraints:     None enabled
////
 ... TEST PASSED! [OK]
.

|| GTFOLD-PYTHON UNIT TEST INFO: (test3_tRNA_yeastFa_withcons_RNAfold)
    ## 5
    >> Test Purpose:    Basic MFE calculation *with* constraints
    >> Organism:        tRNA(asp), yeast.fa (Native structure) ENERGY = -34.3
    >> Base Sequence:   [#75] GCCGUGAU ... CCCGUCGCGGCGCCA
    >> Constraints:     325 Total, 7 Forced, 318 Prohibited
////

 ... TEST PASSED! [OK]
.


|| GTFOLD-PYTHON UNIT TEST INFO: (test4_other_humanFa)
    ## 6
    >> Test Purpose:    Basic MFE calculation without constraints
    >> Organism:        Telomerase pseudoknot, human.fa (Native structure)
    >> Base Sequence:   [#47] GGGCUGUU ... ACAAAAAAAGUCAGC
    >> Constraints:     None enabled
////
 ... TEST PASSED! [OK]
.

|| GTFOLD-PYTHON UNIT TEST INFO: (test4_other_humanFa_withcons_RNAfold)
    ## 7
    >> Test Purpose:    Basic MFE calculation *with* constraints
    >> Organism:        Telomerase pseudoknot, human.fa (Native structure)
    >> Base Sequence:   [#47] GGGCUGUU ... ACAAAAAAAGUCAGC
    >> Constraints:     5 Total, 3 Forced, 2 Prohibited
////

 ... TEST PASSED! [OK]
.


|| GTFOLD-PYTHON UNIT TEST INFO: (test5_other_PSyringae)
    ## 8
    >> Test Purpose:    Basic MFE calculation without constraints
    >> Organism:        ENERGY = -45.2  F Sensing RS (Native structure)
    >> Base Sequence:   [#66] GCAUUGGA ... GAUGAUGCCUACAGA
    >> Constraints:     None enabled
////
 ... TEST PASSED! [OK]
.

|| GTFOLD-PYTHON UNIT TEST INFO: (test5_other_PSyringae_withcons_RNAfold)
    ## 9
    >> Test Purpose:    Basic MFE calculation *with* constraints
    >> Organism:        ENERGY = -45.2  F Sensing RS (Native structure)
    >> Base Sequence:   [#66] GCAUUGGA ... GAUGAUGCCUACAGA
    >> Constraints:     6 Total, 6 Forced, 0 Prohibited
////

 ... TEST PASSED! [OK]
.
----------------------------------------------------------------------
Ran 9 tests in 20.284s

OK

Running the unit tests (selecting test groups by type)

It is possible to run only a subset of the unit tests. Each group of test cases is identified by a bit flag, and several groups are selected at once by combining their flags with a logical OR:

    NO_TESTS                = 0
    MFE_BASE_TESTS          = 1
    MFE_ROGERS_RNADB_TESTS  = 2
    MFE_CUSTOM_PARAMS_TESTS = 4
    PFUNC_TESTS             = 8
    SUBOPT_TESTS            = 16
    NOT_PASSING             = 32
    ALL_TESTS               = 0xffff

Export the selected value as a hexadecimal integer string in the environment variable GTFP_TEST_SUITE, then run the tests as follows:

# ... TO RUN ALL TESTS: 
$ export GTFP_TEST_SUITE=0xffff
$ make && make test
$ unset GTFP_TEST_SUITE

For example, to run only the unit tests that are not passing (convenient for development work), run the following sequence of commands:

$ export GTFP_TEST_SUITE=0x0020
$ make && make test
$ unset GTFP_TEST_SUITE

Here, the decimal value 32 (NOT_PASSING) is written as the 16-bit hexadecimal integer 0x0020.
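
Combining several test groups into one GTFP_TEST_SUITE value is plain bitwise arithmetic. The sketch below copies the flag values from the enumeration above; the helper names (format_suite, parse_suite, group_enabled) and the fallback to ALL_TESTS when the variable is unset are assumptions made for illustration, not a description of how the GTFoldPython test runner reads the variable:

```python
import os

# Flag values copied from the enumeration listed above.
NO_TESTS = 0
MFE_BASE_TESTS = 1
MFE_ROGERS_RNADB_TESTS = 2
MFE_CUSTOM_PARAMS_TESTS = 4
PFUNC_TESTS = 8
SUBOPT_TESTS = 16
NOT_PASSING = 32
ALL_TESTS = 0xFFFF


def format_suite(mask):
    """Format a flag mask as a four-digit hexadecimal string, e.g. 0x0020."""
    return "0x{:04x}".format(mask & ALL_TESTS)


def parse_suite(value, default=ALL_TESTS):
    """Parse a GTFP_TEST_SUITE string (assumed fallback: `default` if unset)."""
    if not value:
        return default
    return int(value, 16)


def group_enabled(mask, group):
    """Return True when every bit of `group` is set in `mask`."""
    return group != 0 and (mask & group) == group


# Select the base MFE tests together with the partition function tests.
selected = MFE_BASE_TESTS | PFUNC_TESTS
print(format_suite(selected))  # 0x0009

# Check whether the currently exported value enables the NOT_PASSING group.
mask = parse_suite(os.environ.get("GTFP_TEST_SUITE"))
print(group_enabled(mask, NOT_PASSING))
```

The printed string can be exported directly, for example `export GTFP_TEST_SUITE=0x0009`.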

Notes: Equivalent command line GTFold switches used to run the unit tests

The GTFoldPython unit test class pre-configures several options in its setUp method before calling the GTFold routines:

    def setUp(self):
        GTFoldPython._ConstructLibGTFold()
        GTFoldPython.Config(debugging = False)
        #GTFoldPython.Config(verbose=1, debugging=1)
        GTFoldPython.EnableTerminalMismatch()
        GTFoldPython.SetThermodynamicParametersFromDefaults("Turner99")
        self._testPurpose = ""
        self._testConstraints = []
        self._orgName = ""
        self._orgAccNo = ""
        self._orgBaseSeq = ""
        self._testResultExpectedMFE = None
        self._testResultExpectedMFEStruct = None
    ##

These calls correspond to passing the historical command-line option -m (terminal mismatch) and selecting the Turner99 data as the default energy model. To use this exact energy model with the GTFold command-line utilities, it may be necessary to override the default data location by exporting the GTFOLDDATADIR environment variable in the terminal first:

$ export GTFOLDDATADIR=$(greadlink -f ~/GTFoldPython/Python/Testing/ExtraGTFoldThermoData/GTFoldTurner99)
# Then call: gtmfe [...]
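
The same setup can be scripted when regenerating reference data from Python. This is a minimal sketch under stated assumptions: gtmfe_environment and gtmfe_command are hypothetical helpers, the default ./bin/gtmfe location is taken from the transcripts above, and os.path.realpath stands in for greadlink -f (GNU readlink as typically installed on macOS; plain readlink -f on Linux):

```python
import os


def gtmfe_environment(data_dir, base_env=None):
    """Return a copy of the environment with GTFOLDDATADIR set.

    Hypothetical helper; the path is made absolute and symlinks are
    resolved, similar in spirit to `greadlink -f`.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GTFOLDDATADIR"] = os.path.realpath(os.path.expanduser(data_dir))
    return env


def gtmfe_command(seq_file, cons_file=None, gtmfe="./bin/gtmfe"):
    """Build the argument list used for the reference runs shown earlier.

    The switches are the ones from the transcripts (the repeated -m is
    listed once); the constraint file is appended only when given.
    """
    cmd = [gtmfe, "-v", "-d2", "-m", "-dS", "--exactintloop",
           "--prefilter", "2", seq_file]
    if cons_file is not None:
        cmd += ["-c", cons_file]
    return cmd


env = gtmfe_environment(
    "~/GTFoldPython/Python/Testing/ExtraGTFoldThermoData/GTFoldTurner99")
cmd = gtmfe_command("../GTFoldPython/Python/Testing/TestData/5S/E.coli.fa.seq")
print(env["GTFOLDDATADIR"])
print(" ".join(cmd))
```

Passing both values to `subprocess.run(cmd, env=env, capture_output=True, text=True)` would then run the utility with the overridden data directory, provided the gtmfe binary has been built at that location.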

Some of the options passed to gtmfe in the examples above are algorithmic in nature (cf. -d2 -dS --exactintloop -m --prefilter 2; the -v switch only enables verbose output). That is, they change which code paths run, but the results produced do not vary when these options are passed to the utility.