0. Set up your environment
In this notebook I will
- Suggest how to set up your environment. You might find the checks below useful to confirm that your environment has been set up correctly for the following notebooks to run.
- Explain where the code expects to find data and where it saves media files by default.
Model files are not saved by these notebooks. Models are re-run for each notebook, so the notebooks will work if run in any order.
See also the Notebooks README in this folder for information about how to set the project_root
variable.
Set notebook to reload functions every time a cell is run
This is useful if you make any changes to any underlying code.
# Reload functions every time
%load_ext autoreload
%autoreload 2
Check that the patientflow package has been installed
try:
import patientflow
print(f"✓ patientflow {patientflow.__version__} imported successfully")
except ImportError:
print("❌ patientflow not found - please install using one of the following methods:")
print(" From PyPI: pip install patientflow")
print(" For development: pip install -e '.[test]'")
except Exception as e:
print(f"❌ Error: {e}")
✓ patientflow 0.2.0 imported successfully
Set project_root
variable
The variable called project_root
tells the notebooks where the patientflow repository resides on your computer. All paths in the notebooks are set relative to project_root
. There are various ways to set it, which are described in the notebooks README.
from patientflow.load import set_project_root
project_root = set_project_root()
Inferred project root: /Users/zellaking/Repos/patientflow
Set file paths
Now that you have set the project root, you can specify where the data will be loaded from, where images and models are saved, and where to load the config file from. By default, a function called set_file_paths()
sets these as shown here.
# Basic checks
print(f"patientflow version: {patientflow.__version__}")
print(f"Repository root: {project_root}")
# Verify data access
data_folder_name = 'data-synthetic'
data_file_path = project_root / data_folder_name
if data_file_path.exists():
print("✓ Synthetic data found")
else:
print("Synthetic data not found - check repository structure")
patientflow version: 0.2.0
Repository root: /Users/zellaking/Repos/patientflow
✓ Synthetic data found
Theset_file_paths
function will set file paths to default values within the patientflow
folder, as shown below. File paths for saving media and models are derived from the name of the data folder.
In the notebooks that follow, no trained models are saved by default. All notebooks load data from data_file_path
and train models from scratch. However, you may want to make use of model_file_path
to save a model locally, especially they are time-consuming to run in your environment.
The config.yaml file will be loaded from the root directory. It specifies training, validation and test set dates, and some other parameters that will be discussed later.
from patientflow.load import set_file_paths
data_file_path, media_file_path, model_file_path, config_path = set_file_paths(project_root,
data_folder_name=data_folder_name)
Configuration will be loaded from: /Users/zellaking/Repos/patientflow/config.yaml
Data files will be loaded from: /Users/zellaking/Repos/patientflow/data-synthetic
Trained models will be saved to: /Users/zellaking/Repos/patientflow/trained-models/synthetic
Images will be saved to: /Users/zellaking/Repos/patientflow/trained-models/synthetic/media
Summary
In this notebook I have shown:
- How to configure your environment to run these notebooks
- Where the notebooks expect to find data, and where they will save media file, by default