Set up the Python environment for triplet embeddings
setup_python_env.Rd
Call this function once, the very first time you use the embedding pipeline. It will:
Check whether Miniconda/Anaconda is available on the system.
Create a self-contained conda environment named envname.
Install all required Python packages listed in requirements.txt into that environment.
Activate the environment for the current R session.
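The four steps above can be sketched roughly as follows. This is an illustration only, not the package's actual implementation: it assumes the reticulate package is used to drive conda, and the names `read_requirements()` and `sketch_setup_env()` are hypothetical.

```r
# Hypothetical helper: read package names from a requirements file,
# skipping blank lines and comment lines.
read_requirements <- function(path) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines[nzchar(lines) & !startsWith(lines, "#")]
}

# Hypothetical sketch of the setup steps (assumes reticulate is installed).
sketch_setup_env <- function(envname = "triplet-embeddings", requirements) {
  # 1. Locate a conda binary; reticulate raises an error if none is found.
  conda <- reticulate::conda_binary()

  # 2. Create the environment only if it does not exist yet.
  if (!envname %in% reticulate::conda_list(conda = conda)$name) {
    reticulate::conda_create(envname, conda = conda)
  }

  # 3. Install the packages listed in the requirements file.
  reticulate::conda_install(
    envname,
    packages = read_requirements(requirements),
    conda = conda
  )

  # 4. Bind the current R session to the environment.
  reticulate::use_condaenv(envname, conda = conda, required = TRUE)
  invisible(envname)
}
```

Only `read_requirements()` runs without conda; calling `sketch_setup_env()` would create a real environment on the machine.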
Arguments
- envname
Name of the conda environment to create. Defaults to "triplet-embeddings". Change this only if you need to keep multiple isolated environments on the same machine.
- requirements
Path to a requirements.txt file listing the Python packages to install. Defaults to the copy bundled with the package (inst/requirements.txt).
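As an illustration of the two arguments, the snippet below locates the bundled requirements file and passes it explicitly together with a non-default environment name. The environment name "triplet-embeddings-dev" is a made-up example, and the lookup assumes the file is installed at the top level of the tripletTools package, which is where files under inst/ normally end up.

```r
# Locate the bundled requirements file. system.file() returns "" when the
# package or the file cannot be found.
bundled <- system.file("requirements.txt", package = "tripletTools")

if (FALSE) { # not run: this would create a new conda environment
  setup_python_env(
    envname = "triplet-embeddings-dev", # hypothetical second environment
    requirements = bundled
  )
}
```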
Details
In future R sessions you do not need to call this function again.
Loading the package with library() is sufficient: the environment
is detected and activated automatically at that point.
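To confirm by hand that the environment from the one-time setup is still present, a check along these lines may help. It is a sketch that assumes reticulate is installed; `env_ready()` is a hypothetical helper, not a function exported by the package.

```r
# Hypothetical helper: TRUE if a conda environment with this name exists,
# FALSE if it does not, or if reticulate or conda is unavailable.
env_ready <- function(envname = "triplet-embeddings") {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    return(FALSE)
  }
  isTRUE(tryCatch(
    reticulate::condaenv_exists(envname),
    error = function(e) FALSE
  ))
}
```

If `env_ready()` returns FALSE, running setup_python_env() again would be the natural next step.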
Python dependencies
The following packages are installed into the conda environment:
numpy, pandas, torch, scikit-learn,
scipy, and skorch. PyTorch is installed via conda from the
pytorch channel; all other packages come from conda-forge. No pip installs
are used, which ensures DLL compatibility on Windows. PyTorch is a large
download (~300–800 MB depending on platform), so the first-time
installation may take several minutes.
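After installation, one way to verify that the dependencies can be imported is sketched below. It assumes reticulate is bound to the environment, and `import_names` and `check_modules()` are hypothetical names. Note that scikit-learn is imported in Python as sklearn.

```r
# Package names from the list above, mapped to their Python import names.
import_names <- c(
  numpy = "numpy",
  pandas = "pandas",
  torch = "torch",
  "scikit-learn" = "sklearn",
  scipy = "scipy",
  skorch = "skorch"
)

# Hypothetical helper: named logical vector, TRUE where the module imports.
# Calling it initializes Python in the current R session.
check_modules <- function(modules = import_names) {
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    stop("The reticulate package is required for this check.")
  }
  vapply(modules, reticulate::py_module_available, logical(1))
}
```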
Examples
if (FALSE) { # \dontrun{
# Run once after installing the package:
setup_python_env()
# On all subsequent sessions just load the package as normal:
library(tripletTools)
results <- run_embeddings(
  input_file = "triplets.csv",
  additional_data_file = "item_labels.csv",
  output_dir = "embeddings_output"
)
} # }