One of the classic problems when running a python script that imports another module is not being able to find the module.
ModuleNotFoundError: No module named 'xxx'
There are 2 fundamentals to remember.
- A python module is – simply – a file
- A Python package is a folder containing a
__init__.py
file. A folder of modules.
ModuleNotFoundError
The python docs describe a ModuleNotFoundError
as:
A subclass of ImportError which is raised by import when a module could not be located. It is also raised when None is found in sys.modules. – source
How does Python look for modules?
If the module is not a __builtin__
– that comes part of python’s standard library – then python will look for modules in sys.path
.
The python docs talk about the import path:
A list of locations that are searched by the path based finder for modules to import. During import, this list of locations usually comes from
sys.path
, but for subpackages it may also come from the parent package’s__path__
attribute. source
Viewing the Path
One suggestion is to see the paths where the ModuleNotFoundError
arises is to substitute the import with:
import sys
print(sys.path)
This will return a list of import paths:
['./projects/my_package/random_data',..., '/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages']
External Packages
Oftentimes when this error is seen – the package you expect is not on the path. This can often be because the virtual environment has not been activated or the package has not been installed (usually from pypi but it can be installed from elsewhere).
For example a script that expects the httpx
package on the path:
ModuleNotFoundError: No module named 'httpx'
This error can be fixed by using the correct python environment and installing httpx
:
python -m pip install httpx
Adding a Path
When the package or module is not an external module…
There are 3 methods to add a path:
- Use the
PYTHONPATH
environment variable - Insert the path at runtime
- Make your the package installable and install it
PYTHONPATH
PYTHONPATH
is an environment variable to augment / add to the default search import paths.
Usage:
PYTHONPATH=.. python3 my_file.py
or
PYTHONPATH=.. python3 -m my_file
This adds the parent folder to the path and python can now find the module it needs. This is good for a quick adhoc run.
Insert the path at runtime
Before you import xxxx
your module. Add the code:
import pathlib
current_dir = pathlib.Path(__file__).resolve().parent
sys.path.insert(0, str(current_dir.parent))
import xxxx
You can use
os.path.dirname(os.path.realpath(__file__))
if you do not want to usepathlib
This is deemed a hacky solution by some. Also a good quick fix.
Make the package installable
This one is longer winded. Python packaging is a topic on its own but boiling it down to the essentials it:
Moves your code into the python path – so that it can be referenced through the top-level package name.
To do this the directory structure has to be:
some-root-folder/
my_package/
my_top_level_module.py
my_sub_package/
__init__.py
my_sub_module.py
__init__.py
setup.py
In setup.py
add:
from setuptools import setup, find_packages
setup(
name='my_package', version='1.0', packages=find_packages()
)
Now we must install the package (in editable mode):
pip3 install -e .
You can then verify it is working with the python repl:
$ python3
>>> import my_package
>>> my_package.__name__
'my_package'
>>> my_package.__path__
_NamespacePath(['./projects/my-package/my_package', './projects/my-package/my_package'])
Hmmm….but now within the package itself – references are made with the top-level package name…
Is that right? Yes. Since you will be installing the package and then testing it – the package will be available on the path.
So there are questions to revisit:
- Should you refer to other modules in a python package with the package name?
- Where should tests live? Inside the module or outside?
1. Refering to Package Modules
To answer number these questions – I looked at 3 python packages I like: django
, httpx
and peewee
Django and peewee refer to other modules in the package with the package namespace.
In peewee/playhouse/dataset.py
:
from playhouse.db_url import connect
In django/db/backends/base/creation.py
:
from django.db import router
Httpx uses relative imports.
In httpx/_urls.py
:
from ._utils import primitive_value_to_str
2. Where Should Tests Live?
All 3 packages: django
, httpx
and peewee
have their test folder outside the package directory.
In django
– tests lived outside the package – in the root folder. This makes sense as the tests are not needed for the package to work and are useful at development and at build / integration time.
The instructions django gives for running tests:
cd tests
python -m pip install -e ..
python -m pip install -r requirements/py3.txt
./runtests.py
A lot of setup and path manipulation is done. Tests import the package they are testing from the package namespace as you would expect as they are not in the package folder.
Httpx uses pytest and normal discovery.
Peewee also has a specialised run_tests.py
.
Relative Imports
One opinion is you shouldn’t ever need relative imports.
You are free to choose to use them but they aren’t required and can make things difficult.