
[STDLIB_STATS] need to upgrade stdlib_stats codes about compilation efficiency #438

Closed
@zoziha

Overview: Compilation time is too long.

When compiling, I found that stdlib_stats uses a lot of computer resources, especially RAM. This is related to the high-rank array interfaces defined in stdlib_stats, which make building stdlib much less efficient and increase its overall compilation time.

It took my computer (CPU: Intel i5-8250U) more than two hours to compile stdlib completely.
With RANK=15, the compiled size of stdlib reached 747 MB.

I took a quick look at the source code and think there may be a better alternative to the current polymorphic interface, which is specialized for such a large number of multi-dimensional array arguments.
(see high-dimensional matrix dimensions)
(see RANK)

My understanding is: the design needs rethinking and should be more flexible.

In Fortran, the length of a single dimension can in theory grow without limit, but the number of dimensions (the rank) has to be written out explicitly in the source.
In the future, we will also build a large number of functions that operate on matrices. The current implementation of stdlib_stats does not scale or adapt well and needs to be improved (see stdlib_stats_moment.fypp).

stdlib_stats predefines a set of ranks to form a polymorphic interface, and adds multiple conditional checks (see condition judgments) on the dimension being processed. The result is slower compilation and a heavier compilation load.
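
To make the scaling concrete, here is a rough sketch of how the number of generated specific procedures grows when one is emitted per kind and per rank. It is written in Python purely as an illustration. It is not taken from stdlib's templates, and the kind list, the naming scheme, and both helper functions are hypothetical.

```python
from itertools import product

# Hypothetical kind names, chosen only for illustration; the real list
# used by stdlib's fypp templates may differ.
KINDS = ["sp", "dp", "qp", "int8", "int16", "int32", "int64"]


def specific_names(generic, kinds, max_rank):
    """List one specific-procedure name per (kind, rank) combination."""
    return [
        f"{generic}_{rank}_{kind}"
        for kind, rank in product(kinds, range(1, max_rank + 1))
    ]


def count_specifics(kinds, max_rank, variants=1):
    """Count specifics when each (kind, rank) pair is emitted `variants` times."""
    return len(kinds) * max_rank * variants


if __name__ == "__main__":
    # The count grows linearly with the maximum rank for every generic name.
    for max_rank in (4, 7, 15):
        print(max_rank, count_specifics(KINDS, max_rank))
```

Under these assumptions, every generic procedure costs `len(KINDS) * max_rank` specifics, and any extra variant (for example an optional mask or dimension argument) multiplies that again, which would be consistent with the build time and size reported above.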

#281
#283

My proposed solution is: set up a matrix parser, or use a one-dimensional array algorithm.

Unless an operation needs communication between different dimensions, we can get the same result by providing only a one-dimensional (column vector) routine and leaving the dimension-specific handling to the user, which would improve the versatility and flexibility of stdlib.
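
A minimal sketch of that idea, again in Python for illustration only: the library would ship a single rank-1 kernel, and the caller flattens higher-rank data before calling it. The names `mean_1d` and `flatten` are hypothetical and are not part of stdlib.

```python
def mean_1d(values):
    """Mean of a flat sequence: the only kernel the library would provide."""
    if not values:
        raise ValueError("mean_1d needs at least one value")
    return sum(values) / len(values)


def flatten(nested):
    """Flatten arbitrarily nested lists into one flat list (the caller's job)."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


if __name__ == "__main__":
    # A rank-3 "array" represented as nested lists.
    cube = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]
    print(mean_1d(flatten(cube)))
```

In Fortran the flattening step would roughly correspond to something like `reshape(x, [size(x)])`. This sketch only covers whole-array statistics. A reduction along one particular dimension is exactly the case of communication between dimensions mentioned above, and the caller would have to slice the data first.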

Alternatively, we could use the solution described in the stdlib wiki: set up a matrix parser and transform the array when necessary, to meet the polymorphic needs of multi-dimensional arrays.

I have seen another library, and its solution is also good: muesli!


I don't know much more about stdlib_stats, so my idea may have limitations. However, I think the polymorphic interface for multi-dimensional arrays in stdlib_stats needs to be improved.
I hope this starts a discussion. Thank you all! 😍

Labels: bug, build: cmake, documentation
