My mpi4py (3.1.5) installation with Open MPI (4.1.4) on Python 3.8 and Ubuntu 20.04 suddenly stopped working today. Whenever I run anything that imports mpi4py
in Python, I get the following error:
[juanMS:15643] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ess_singleton_module.c at line 572
[juanMS:15643] [[INVALID],INVALID] ORTE_ERROR_LOG: A system-required executable either could not be found or was not executable by this user in file ess_singleton_module.c at line 172
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):
  orte_ess_init failed
  --> Returned value A system-required executable either could not be found or was not executable by this user (-126) instead of ORTE_SUCCESS
--------------------------------------------------------------------------
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems. This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):
  ompi_mpi_init: ompi_rte_init failed
  --> Returned "A system-required executable either could not be found or was not executable by this user" (-126) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init_thread
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
[juanMS:15643] Local abort before MPI_INIT completed completed successfully, but am not able to aggregate error messages, and not able to guarantee that all other processes were killed!
This is really frustrating because I did not make any system changes or package updates. I have tried removing the MPI packages from my system and mpi4py from my Python venv:
sudo apt purge --autoremove libopenmpi-dev libopenmpi3 mpich openmpi-bin openmpi-common
pip uninstall mpi4py
I have tried this multiple times, and the same error keeps coming back. As far as I can tell there is nothing wrong with Open MPI itself, since a simple test like this works fine:
mpirun -np 4 hostname
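Since mpirun works from the shell but the failure only shows up inside Python, one further check is whether the Python process sees the same MPI executables. Below is a minimal sketch of that check. The helper name locate_mpi_tools and the list of tool names are assumptions made for illustration: the log does not say which executable is missing, and orted is only a guess based on the ess_singleton_module.c reference. The script only imports the mpi4py package, not mpi4py.MPI, so it should not trigger MPI_Init.

```python
import os
import shutil
import sys


# Hypothetical helper; the tool names are assumptions, not taken from the log.
def locate_mpi_tools(names=("mpirun", "mpiexec", "orted")):
    """Map each executable name to its resolved path on PATH, or None."""
    found = {}
    for name in names:
        path = shutil.which(name)
        # Resolve symlinks (e.g. Debian/Ubuntu alternatives) to the real file.
        found[name] = os.path.realpath(path) if path else None
    return found


if __name__ == "__main__":
    print("python:", sys.executable)
    for name, path in locate_mpi_tools().items():
        print(f"{name}: {path or 'NOT FOUND on PATH'}")
    try:
        # Importing the package alone does not initialize MPI;
        # only "from mpi4py import MPI" does.
        import mpi4py
        print("mpi4py build config:", mpi4py.get_config())
    except ImportError:
        print("mpi4py is not installed in this environment")
```

Running this with the venv's interpreter and comparing the output with `which mpirun` in the shell would show whether the two environments resolve to different MPI installations.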
I have found virtually no help online, so I'm hoping someone here can guide me in the right direction!