Troubleshooting

The content here is a collection of issues commonly experienced by all of us at some point. Please use the navigation list on the right to begin with the section you are experiencing issues with. This document has been designed so that you will 'jump' between sections, each pertinent to fixing the previous one.

Discussion Forum

If your issue cannot be solved here, please submit your question to the MOOSE Discussion forum for help!

Conda Issues

Conda issues can be the root cause for just about any issue on this page. Scroll through this section for what may look familiar, and follow those instructions:

  • 404 error, The channel is not accessible or is invalid.

    If you are receiving this, you may be a victim of us changing the channel name out from underneath you (Sorry!). Remove the offending channel(s):

    
    conda config --remove channels https://mooseframework.org/conda/moose
    conda config --remove channels https://mooseframework.com/conda/moose
    conda config --remove channels https://mooseframework.inl.gov/conda/moose
    

    If you receive errors about a channel not being present (CondaKeyError), please ignore them. You most likely will not have all three 'old' channels. Next, add the correct channel:

    
    conda config --add channels https://conda.software.inl.gov/public
    

    When you're finished, a conda config --show channels should resemble the following:

    
    $ conda config --show channels
    channels:
      - https://conda.software.inl.gov/public
      - conda-forge
      - defaults
    

  • Download error, Timeouts

    The Conda packages we produce can be quite large, and can exceed the default download timeout imposed by Conda's download routines. You can increase this timeout in the following way:

    
    conda config --set remote_read_timeout_secs new_timeout
    

    Where new_timeout is an integer greater than 60 (the default in seconds).
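
    As a small sketch of the rule above, the following shell function checks that a candidate value is a whole number greater than 60 before it is handed to conda. The function name valid_timeout is hypothetical and is not part of Conda.

    ```shell
    # Hypothetical helper (not part of Conda): succeeds only when $1 is a
    # whole number greater than 60, the default timeout in seconds.
    valid_timeout() {
      case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
      esac
      [ "$1" -gt 60 ]
    }
    ```

    For example, valid_timeout 120 && conda config --set remote_read_timeout_secs 120 would only change the setting when the value passes the check. The value 120 is an arbitrary illustration, not a recommendation.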

  • command not found: conda

    You have yet to install conda, or your path to it is incorrect or not set. You will need to recall how you installed conda. Our instructions ask you to install Miniforge to your home directory (~/miniforge), which requires you to set your PATH accordingly:

    
    export PATH=~/miniforge/bin:$PATH
    

    With PATH set, try again to run whatever command you were initially attempting.
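
    The PATH fix can also be written so that it is safe to repeat, for instance from a shell startup file. This is a minimal sketch that assumes Miniforge lives in ~/miniforge as described above; the function name prepend_path is hypothetical.

    ```shell
    # Hypothetical helper: prepend directory $1 to PATH unless it is already listed.
    prepend_path() {
      case ":$PATH:" in
        *":$1:"*) ;;               # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
      esac
      export PATH
    }

    # Assumes Miniforge was installed to ~/miniforge.
    prepend_path "$HOME/miniforge/bin"

    # Report whether the shell can now find conda.
    if command -v conda >/dev/null 2>&1; then
      echo "conda found at: $(command -v conda)"
    else
      echo "conda still not found; check where Miniforge was installed"
    fi
    ```

    Checking for an existing entry first keeps PATH from growing each time the snippet runs.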

  • conda activate moose

    If conda activate moose is failing like so:

    
    Run 'conda init --all' to be able to run conda activate/deactivate
    and start a new shell session. Or use conda to activate/deactivate.
    

    ...it's possible you have yet to perform a conda init --all.

    It could also mean you have an older version of Conda, or that the environment you are trying to activate is somewhere other than where conda thinks it should be, or is simply missing or not yet created. Unfortunately, much of what can be diagnosed is beyond the scope of this document and better left to the support teams at Conda. What we can attempt is to create a new environment and go from there:

    
    conda create -n testing -q -y
    

    The above should create an empty environment. Try and activate it:

    
    conda activate testing
    

    If the command continues to ask you to perform a conda init --all, or if it or the conda create command before it failed, the error likely stems from how Conda was first installed (perhaps with sudo rights, or as another user). You should look into removing this installation of conda and starting over with our Getting Started instructions. Failures of this nature can also mean your conda resource file (~/.condarc) is in bad shape. We have no way of diagnosing this in a troubleshooting fashion, as this file can contain more than just moose-related configs. For reference, the bare minimum should resemble the following:

    
    channels:
      - https://conda.software.inl.gov/public
      - conda-forge
      - defaults
    

  • conda init

    If conda init --all is failing, or appears to do nothing, it is possible that Conda simply does not support the shell you are operating in. To figure out which shell you are operating in, run the following:

    
    echo $0
    

    Whatever this returns is the type of shell you are operating in. Please verify this is a shell that Conda supports.
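
    The value printed by echo $0 is not always a bare shell name: login shells are commonly reported with a leading dash (for example -bash), and some systems report a full path (for example /bin/zsh). The sketch below reduces such values to the plain name so it can be compared against Conda's list of supported shells. The function name shell_name is hypothetical.

    ```shell
    # Hypothetical helper: reduce a value such as "-bash" or "/bin/zsh"
    # to "bash" or "zsh".
    shell_name() {
      name="${1#-}"        # drop the leading dash used for login shells
      name="${name##*/}"   # drop any directory part
      printf '%s\n' "$name"
    }

    # Print the normalized name of the shell running this snippet.
    shell_name "$0"
    ```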

  • Your issue not listed

    The quick fix attempt is to delete the faulty environment and re-install it:

    conda activate base
    conda env remove -n moose
    conda create -n moose moose-dev=2024.03.15
    conda activate moose
    

    If the above re-install method ultimately failed, it is time to submit your errors to the discussion forum.

Build Issues

Build issues are commonly caused by an invalid environment, an update to your repository (leading to a mismatch between MOOSE and your application), or one of our MOOSE Conda packages being out of date.

  • Verify the Conda Environment is active and up to date, with the latest version of our moose packages:

    conda activate base
    conda env remove -n moose
    conda create -n moose moose-dev=2024.03.15
    conda activate moose
    

    If conda activate moose failed, see Conda Issues above.

    Note

    Whenever an update is performed in Conda, an update should also be performed on your MOOSE repository, and vice versa. It is important to keep both of these in sync.

  • Verify the MOOSE repository is up to date, with the correct vetted version of libMesh:

    Warning

    Be sure you have committed/saved your work. The following commands will delete untracked files!

    
    cd moose
    git checkout master
    git clean -xfd
    
    <output snipped>
    
    git fetch upstream
    git pull
    git submodule update --init
    

  • Verify you either have no moose directory set, or it is set correctly.

    
    [~] > echo $MOOSE_DIR
    
    [~] >
    

    The above should return nothing, or it should point to the correct moose repository.

    Note

    Most users do not use or set MOOSE_DIR. If the above command returns something, and you are not sure why, just unset it:

    
    unset MOOSE_DIR
    

  • Try building a simple hello world example (there is more text than what is visible, be sure to copy it all):

    
    cd /tmp
    cat << EOF > hello.C
    #include <mpi.h>
    #include <stdio.h>
    
    int main(int argc, char** argv) {
      // Initialize the MPI environment
      MPI_Init(NULL, NULL);
    
      // Get the number of processes
      int world_size;
      MPI_Comm_size(MPI_COMM_WORLD, &world_size);
    
      // Get the rank of the process
      int world_rank;
      MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
    
      // Get the name of the processor
      char processor_name[MPI_MAX_PROCESSOR_NAME];
      int name_len;
      MPI_Get_processor_name(processor_name, &name_len);
    
      // Print off a hello world message
      printf("Hello world from processor %s, rank %d out of %d processors\n",
             processor_name, world_rank, world_size);
    
      // Finalize the MPI environment.
      MPI_Finalize();
    }
    EOF
    
    mpicxx -fopenmp hello.C
    

    If the above build fails, and you have the correct Conda environment loaded (conda activate moose), then something is failing beyond the scope of this document, and you should now contact us via the discussion forum.

    If the build was successful, attempt to execute the hello world example:

    
    mpiexec -n 4 /tmp/a.out
    

    You should receive a response similar to the following:

    
    Hello world from processor my_hostname, rank 0 out of 4 processors
    Hello world from processor my_hostname, rank 1 out of 4 processors
    Hello world from processor my_hostname, rank 3 out of 4 processors
    Hello world from processor my_hostname, rank 2 out of 4 processors
    

  • Sometimes a build will fail due to running out of memory. When a build fails in this way, it is not always apparent. The compiler will simply die without explaining why:

    
    make -j 8
    ...<trimmed>
    Compiling C++ (in opt mode) moose/framework/contrib/exodiff/FileInfo.C...
    Compiling C++ (in opt mode) moose/framework/contrib/exodiff/stringx.C...
    Compiling C++ (in opt mode) moose/framework/contrib/exodiff/iqsort.C...
    (standard input): Assembler message:
    (standard input): 488982: Warning: end of file not at end of a line; newline inserted
    x86_64-conda-linux-gnu-c++: fatal error: Killed signal terminated program cc1plus
    compilation terminated.
    make: *** [moose/framework/build.mk:150: moose/test/build/unity_src/object.x86_64-conda-linux-gnu.opt.lo] Error 1
    make: *** Waiting for unfinished jobs....
    

    If you are receiving a similar result, try reducing how many cores you are compiling with (try make -j 4 instead). Each core consumes roughly 2GB of RAM (more if you are compiling in debug mode). Errors of this type are common among users who may be running on some form of virtual machine, or when operating within an HPC cluster with strict resource availability guidelines.
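
    The rule of thumb above can be turned into a quick estimate. This is only a sketch built on the rough figure of 2GB per job; the function name suggest_jobs is hypothetical, and the memory and core counts are supplied by hand rather than detected.

    ```shell
    # Hypothetical helper: suggest a make -j value from available memory
    # (whole GB, $1) and core count ($2), assuming roughly 2 GB per job.
    suggest_jobs() {
      mem_gb="$1"
      cores="$2"
      n=$(( mem_gb / 2 ))
      if [ "$n" -gt "$cores" ]; then n="$cores"; fi   # never exceed the core count
      if [ "$n" -lt 1 ]; then n=1; fi                 # always allow one job
      echo "$n"
    }

    # Example: 8 GB of RAM on an 8-core machine.
    suggest_jobs 8 8
    ```

    With those example numbers the function prints 4, which corresponds to make -j 4. Debug builds need more memory per job, so treat the result as an upper bound.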

  • If all of the above has succeeded, you should attempt to rebuild MOOSE or your application again. If you've made it this far, and the above is working, but MOOSE fails to build, then it is time to ask us why on the discussion forum.

Failing Tests

If many or all tests are failing, there is a good chance the fix is simple. Follow these steps to narrow down the possible cause.

First, run a test that should always pass:


cd moose/test
make -j 8
./run_tests -i always_ok -p 2

Note: did make -j 8 fail?

If make -j 8 fails, please proceed to Build Issues above. This is most likely why all your tests are failing.

This test proves that the TestHarness is available, that libMesh is built, and that the TestHarness has a working MOOSE framework available to it. If it passes, the cause of your failing test may be beyond the scope of this troubleshooting guide. However, do continue to read through the situations listed below. If the error is not listed, please submit your failed test results to the MOOSE Discussion forum for help.

Some tests are SKIPPED. This is normal, as some tests are specific to available resources or some other constraint your machine does not satisfy. If you see failures, or you see MAX FAILURES, that's a problem, and it needs to be addressed before continuing:

  • Supply a report of the actual failure (scroll up a ways). For example the following snippet does not give the full picture (created with ./run_tests -i always_bad):

    
    Final Test Results:
    --------------------------------------------------------------------------------
    tests/test_harness.always_ok .................... FAILED (Application not found)
    tests/test_harness.always_bad .................................. FAILED (CODE 1)
    --------------------------------------------------------------------------------
    Ran 2 tests in 0.2 seconds. Average test time 0.0 seconds, maximum test time 0.0 seconds.
    0 passed, 0 skipped, 0 pending, 2 FAILED
    

    Instead, you need to scroll up and report the actual error:

    
    tests/test_harness.always_ok: Working Directory: /Users/me/projects/moose/test/tests/test_harness
    tests/test_harness.always_ok: Running command:
    tests/test_harness.always_ok:
    tests/test_harness.always_ok: ####################################################################
    tests/test_harness.always_ok: Tester failed, reason: Application not found
    tests/test_harness.always_ok:
    tests/test_harness.always_ok .................... FAILED (Application not found)
    tests/test_harness.always_bad: Working Directory: /Users/me/projects/moose/test/tests/test_harness
    tests/test_harness.always_bad: Running command: false
    tests/test_harness.always_bad:
    tests/test_harness.always_bad: ###################################################################
    tests/test_harness.always_bad: Tester failed, reason: CODE 1
    tests/test_harness.always_bad:
    tests/test_harness.always_bad .................................. FAILED (CODE 1)
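
One way to collect the detail above without scrolling is to save the TestHarness output to a file and search it afterwards. The sketch below assumes the output was saved with a standard tool such as tee; the function name failure_reasons and the log file name are hypothetical, and the search string is taken from the sample output above.

```shell
# Hypothetical helper: print every "Tester failed" line from a saved log file.
failure_reasons() {
  grep 'Tester failed, reason:' "$1"
}

# Possible usage, run from moose/test (left as comments so nothing runs here):
#   ./run_tests -i always_bad 2>&1 | tee run_tests.log
#   failure_reasons run_tests.log
```

When reporting a failure, the surrounding lines for that test (working directory and command) are worth including as well.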
    

If the test did fail, chances are your test and our test are failing for the same reason:

  • Environment variables may be instructing the TestHarness to use improper paths. Try each of the following and re-run your test. You may find you receive a different error each time. Simply continue troubleshooting using that new error, and work your way down. If the error is not listed here, then it is time to ask the MOOSE Discussion forum for help:

    • check if echo $METHOD returns anything. If it does, try unsetting it with unset METHOD

    • check if echo $MOOSE_DIR returns anything. If it does, try unsetting it with unset MOOSE_DIR

    • check if echo $PYTHONPATH returns anything. If it does, try unsetting it with unset PYTHONPATH

      Note: METHOD and MOOSE_DIR

      If these were set, it will be necessary to perform a rebuild. See Build Issues above.
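
    The three checks above can be run in one pass. The following is a minimal sketch; the function name list_set_vars is hypothetical, and it needs to be defined and run in the same shell you build and test from so that it sees the same variables.

    ```shell
    # Hypothetical helper: print the name of each listed variable that is
    # set to a non-empty value in the current shell.
    list_set_vars() {
      for name in METHOD MOOSE_DIR PYTHONPATH; do
        eval "value=\${$name-}"   # look up the variable whose name is in $name
        if [ -n "$value" ]; then
          echo "$name"
        fi
      done
    }

    list_set_vars
    ```

    No output means none of the three variables is set. Any name that is printed is a candidate for unset, as described above.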

  • Failed to import hit:

    • Verify you have activated the conda moose environment: echo $CONDA_DEFAULT_ENV. This command should return 'moose'. If not, see Conda Issues above.

  • Application not found

    • Your Application has not yet been built. You need to successfully perform a make. If make is failing, please see Build Issues above.

    • Perhaps you have specified invalid arguments to run_tests? See TestHarness More Options. Specifically for help with:

      • --opt

      • --dbg

      • --oprof

  • gethostbyname failed, localhost (errno 3)

    • This is a fairly common occurrence which happens when your internal network stack or route is not correctly configured for the local loopback device. Thankfully, there is an easy fix:

      • Obtain your hostname:

        
        hostname
        
        mycoolname
        

      • Linux & Macintosh : Add the results of hostname to your /etc/hosts file. Like so:

        
        sudo vi /etc/hosts
        
        127.0.0.1  localhost
        
        # The following lines are desirable for IPv6 capable hosts
        ::1        localhost ip6-localhost ip6-loopback
        ff02::1    ip6-allnodes
        ff02::2    ip6-allrouters
        
        127.0.0.1  mycoolname  # <--- add this line to the end of your hosts file
        

        Everyone's hosts file is different, but the result of adding the line described above will be the same.

      • Macintosh only, 2nd method:

        
        sudo scutil --set HostName mycoolname
        

        We have received reports where the second method sometimes does not work.
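
      To confirm whether the hosts file already names your machine, before or after editing it, a check along the following lines may help. It is a sketch only: the function name hosts_has_name is hypothetical, and it simply looks for the name on any non-comment line after the address field.

      ```shell
      # Hypothetical helper: succeed if hosts-format file $1 has a
      # non-comment entry naming host $2.
      hosts_has_name() {
        awk -v name="$2" '
          { sub(/#.*/, "") }   # strip comments
          { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
          END { exit(found ? 0 : 1) }
        ' "$1"
      }

      if hosts_has_name /etc/hosts "$(hostname)"; then
        echo "hostname is listed in /etc/hosts"
      else
        echo "hostname is missing from /etc/hosts"
      fi
      ```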

  • TIMEOUT

    • If your tests fail due to timeout errors, it is most likely you have a good installation, but a slow machine (or slow filesystem). You can adjust the amount of time that the TestHarness allows a test to run before timing out, by adding a parameter to your test file:

      
      [Tests]
        [./timeout]
          type = RunApp
          input = my_input_file.i
          max_time = 300   # time in seconds before a timeout occurs; 300 is the default for all tests
        [../]
      []
      

  • CRASH

    • A crash indicates the TestHarness executed your application (correctly), but then your application exited with a non-zero return code. See Build Issues above for a possible solution.

  • EXODIFF

    • An exodiff indicates the TestHarness executed your application, and your application exited correctly. However, the generated results differ from the supplied gold file. If this test passes on some machines and fails on others, you may have applied too tight a tolerance to the acceptable error values for that specific machine. We call this phenomenon machine noise.

  • CSVDIFF

    • A different file format following the same error checking paradigm as an exodiff test.