Unable to load shared library 'CpuMathNative' or one of its dependencies. #5299

Closed
emilylawton opened this issue Jul 10, 2020 · 3 comments
Labels
  • AutoML.NET: Automating various steps of the machine learning process
  • P2: Priority of the issue for triage purpose: Needs to be fixed at some point.

@emilylawton

System information

  • .NET Version (eg., dotnet --info): 3.1

Issue

  • What did you do?
    Ran the following command from a published AML experiment pipeline:
    maml.exe TrainTest test=inputs/test.tsv tr=LogisticRegression scorer=BinaryClassifierScorer eval=BinaryClassifierEvaluator norm=No cache=+ dout=outputs/pred.tsv loader=TextLoader{col=Name:TX:3 col=Features:R4:4-222 col=Label:R4:0 header=+} data=inputs/train.tsv out=outputs/model.zip seed=137

  • What happened?

(1) Unexpected exception: One or more errors occurred. (Unable to load shared library 'CpuMathNative' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: libCpuMathNative: cannot open shared object file: No such file or directory) [the same inner message repeats verbatim for each remaining inner exception], 'System.AggregateException'
   at System.Threading.Tasks.TaskReplicator.Run[TState](ReplicatableUserAction`1 action, ParallelOptions options, Boolean stopOnFirstFailure)
   at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Func`4 bodyWithLocal, Func`1 localInit, Action`1 localFinally)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.Tasks.Parallel.ThrowSingleCancellationExceptionOrOtherException(ICollection exceptions, CancellationToken cancelToken, Exception otherException)
   at System.Threading.Tasks.Parallel.ForWorker[TLocal](Int32 fromInclusive, Int32 toExclusive, ParallelOptions parallelOptions, Action`1 body, Action`2 bodyWithState, Func`4 bodyWithLocal, Func`1 localInit, Action`1 localFinally)
   at System.Threading.Tasks.Parallel.For(Int32 fromInclusive, Int32 toExclusive, Action`1 body)
   at Microsoft.ML.Trainers.LbfgsTrainerBase`3.DifferentiableFunctionMultithreaded(VBuffer`1& xDense, VBuffer`1& gradient, IProgressChannel pch) in /machinelearning/src/Microsoft.ML.StandardTrainers/Standard/LogisticRegression/LbfgsPredictorBase.cs:line 698
   at Microsoft.ML.Trainers.LbfgsTrainerBase`3.DifferentiableFunction(VBuffer`1& x, VBuffer`1& gradient, IProgressChannelProvider progress) in /machinelearning/src/Microsoft.ML.StandardTrainers/Standard/LogisticRegression/LbfgsPredictorBase.cs:line 641
   at Microsoft.ML.Numeric.L1Optimizer.L1OptimizerState.EvalCore(VBuffer`1& input, VBuffer`1& gradient, IProgressChannelProvider progress) in /machinelearning/src/Microsoft.ML.StandardTrainers/Optimizer/L1Optimizer.cs:line 119
   at Microsoft.ML.Numeric.Optimizer.OptimizerState.Init() in /machinelearning/src/Microsoft.ML.StandardTrainers/Optimizer/Optimizer.cs:line 241
   at Microsoft.ML.Numeric.L1Optimizer.MakeState(IChannel ch, IProgressChannelProvider progress, DifferentiableFunction function, VBuffer`1& initial) in /machinelearning/src/Microsoft.ML.StandardTrainers/Optimizer/L1Optimizer.cs:line 59
   at Microsoft.ML.Numeric.Optimizer.Minimize(DifferentiableFunction function, VBuffer`1& initial, ITerminationCriterion term, VBuffer`1& result, Single& optimum) in /machinelearning/src/Microsoft.ML.StandardTrainers/Optimizer/Optimizer.cs:line 611
   at Microsoft.ML.Trainers.LbfgsTrainerBase`3.TrainCore(IChannel ch, RoleMappedData data) in /machinelearning/src/Microsoft.ML.StandardTrainers/Standard/LogisticRegression/LbfgsPredictorBase.cs:line 573
   at Microsoft.ML.Trainers.LbfgsTrainerBase`3.TrainModelCore(TrainContext context) in /machinelearning/src/Microsoft.ML.StandardTrainers/Standard/LogisticRegression/LbfgsPredictorBase.cs:line 433
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.Microsoft.ML.ITrainer<Microsoft.ML.IPredictor>.Train(TrainContext context) in /machinelearning/src/Microsoft.ML.Data/Training/TrainerEstimatorBase.cs:line 100
   at Microsoft.ML.Data.TrainUtils.TrainCore(IHostEnvironment env, IChannel ch, RoleMappedData data, ITrainer trainer, RoleMappedData validData, IComponentFactory`1 calibrator, Int32 maxCalibrationExamples, Nullable`1 cacheData, IPredictor inputPredictor, RoleMappedData testData) in /machinelearning/src/Microsoft.ML.Data/Commands/TrainCommand.cs:line 280
   at Microsoft.ML.Data.TrainTestCommand.RunCore(IChannel ch, String cmd) in /machinelearning/src/Microsoft.ML.Data/Commands/TrainTestCommand.cs:line 186
   at Microsoft.ML.Data.TrainTestCommand.Run() in /machinelearning/src/Microsoft.ML.Data/Commands/TrainTestCommand.cs:line 108
   at Microsoft.ML.Tools.Maml.MainCore(IHostEnvironment env, String args, Boolean alwaysPrintStacktrace) in /machinelearning/src/Microsoft.ML.Maml/MAML.cs:line 142
  • What did you expect?
    We expect a trained model, not this error.

Source code / logs

Docker file:

FROM mcr.microsoft.com/dotnet/core/sdk:3.1 AS builder

RUN apt-get update 
RUN apt-get install -y git cmake clang-3.9 libomp-dev
RUN git clone https://github.com/dotnet/machinelearning.git

RUN cd /machinelearning &&\
    git submodule update --init &&\
    bash build.sh -release


RUN mkdir /mlnet_ &&\
    dotnet publish -c Release --no-build  machinelearning/src/Microsoft.ML.Console --output mlnet_ --self-contained false

RUN mkdir /mlnet &&\
    cp -RL /mlnet_/* /mlnet/

# RUN cp -r /mlnet /mlnet_all

RUN rm -rf /mlnet/runtimes/osx-x64 &&\
    rm -rf /mlnet/runtimes/win &&\
    rm -rf /mlnet/runtimes/win-x64 &&\
    rm -rf /mlnet/runtimes/win-x86



FROM mcr.microsoft.com/azureml/o16n-sample-user-base/ubuntu-miniconda

RUN apt-get update &&\
    apt-get install -y apt-transport-https fuse

RUN wget -qO- https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.asc.gpg
RUN mv microsoft.asc.gpg /etc/apt/trusted.gpg.d/
RUN chown root:root /etc/apt/trusted.gpg.d/microsoft.asc.gpg

RUN wget -q https://packages.microsoft.com/config/debian/10/prod.list
RUN mv prod.list /etc/apt/sources.list.d/microsoft-prod.list
RUN chown root:root /etc/apt/sources.list.d/microsoft-prod.list

RUN apt-get update &&\
    apt-get install -y dotnet-runtime-2.1

COPY --from=builder /mlnet /mlnet/.

RUN ldconfig -n /mlnet
ENV LD_LIBRARY_PATH=/mlnet/runtimes/linux-x64/native:/mlnet/runtimes/unix/lib/netcoreapp2.0:$LD_LIBRARY_PATH

frank-dong-ms-zz added the P1 (Priority of the issue for triage purpose: Needs to be fixed soon.) and AutoML.NET (Automating various steps of the machine learning process) labels Jul 10, 2020
@frank-dong-ms-zz
Contributor

@emilylawton Thanks for using ML.NET. I'm not sure what the following means; are you using ML.NET built directly from source, or from NuGet?

a published AML experiment pipeline

If you are using the NuGet package, which version of ML.NET are you using? Please check out the related issues below:
#4870
#4483

@frank-dong-ms-zz
Contributor

frank-dong-ms-zz commented Jul 10, 2020

I discussed this issue offline with Emily. It seems the Microsoft.ML.Console publish output doesn't contain the CpuMathNative dependency; I'm not sure whether this is intentional or a bug.

There are a couple of things to try to mitigate this issue:

  1. Use .NET Core 3.1 so that you no longer depend on CpuMathNative. To do that, change the configuration from "Release" to "Release-netcoreapp3_1" in both the build and publish stages:
    bash build.sh -release --> bash build.sh -Release-netcoreapp3_1
    dotnet publish -c Release --no-build machinelearning/src/Microsoft.ML.Console --output mlnet_ --self-contained false --> dotnet publish -c Release-netcoreapp3_1 --no-build machinelearning/src/Microsoft.ML.Console --output mlnet_ --self-contained false
  2. Add the Native path to the LD_LIBRARY_PATH environment variable by changing the last line of the Dockerfile to:
    ENV LD_LIBRARY_PATH=/machinelearning/bin/x64.Release/Native:/mlnet/runtimes/linux-x64/native:/mlnet/runtimes/unix/lib/netcoreapp2.0:$LD_LIBRARY_PATH

@emilylawton please let me know whether these suggestions work for you, thanks.
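
Before or after trying either suggestion, it can help to confirm whether any directory on LD_LIBRARY_PATH in the final image actually contains the native library. The following is a minimal sketch, not part of ML.NET: the function name `find_native_lib` is hypothetical, and it assumes the library file on Linux is named `libCpuMathNative.so` (inferred from the `libCpuMathNative` name in the error message). It only inspects the directories it is given and does not reproduce the full probing logic of the .NET runtime or the dynamic loader.

```shell
#!/bin/sh
# find_native_lib NAME [SEARCH_PATH]   (hypothetical helper, for illustration)
#
# Prints the full path of every lib<NAME>.so found in the colon-separated
# SEARCH_PATH, which defaults to $LD_LIBRARY_PATH.
# Exit status: 0 if at least one match was found, 1 otherwise.
#
# The body runs in a subshell (parentheses instead of braces) so that
# changing IFS and setting variables does not leak into the caller.
find_native_lib() (
    name=$1
    search_path=${2:-${LD_LIBRARY_PATH:-}}
    status=1

    IFS=:
    for dir in $search_path; do
        # Skip empty entries such as the one produced by a trailing colon.
        [ -n "$dir" ] || continue
        if [ -f "$dir/lib$name.so" ]; then
            printf '%s\n' "$dir/lib$name.so"
            status=0
        fi
    done

    exit "$status"
)
```

Run inside the container, `find_native_lib CpuMathNative || echo "libCpuMathNative.so is not on LD_LIBRARY_PATH"` reports whether the paths set by the Dockerfile's ENV line point at a directory that really holds the library in that image. For more detail, the hint in the error message applies: running the failing command with `LD_DEBUG=libs` set makes the glibc loader print each path it tries.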

frank-dong-ms-zz self-assigned this Jul 11, 2020
frank-dong-ms-zz added the P2 label (Priority of the issue for triage purpose: Needs to be fixed at some point.) and removed the P1 label (Priority of the issue for triage purpose: Needs to be fixed soon.) Jul 13, 2020
@emilylawton
Author

emilylawton commented Jul 14, 2020

@frank-dong-ms, a modification of suggestion 2 did the trick. Thank you!

@ghost ghost locked as resolved and limited conversation to collaborators Mar 18, 2022