Basic Example of Using the Self-Organizing Map (SOM) Class
This script shows how to use the SOM class to train and evaluate a Self-Organizing Map (SOM) with fixed, predetermined hyperparameters. It then calculates fit metrics and visualizes the results.
Key Features:
Predetermined Hyperparameters:
The SOM is configured with a fixed set of hyperparameters, including scaling method, grid dimensions, topology, neighborhood function, and number of epochs.
Fit Metric Calculation:
The Percent Variance Explained (PVE) and the topographic error are calculated to evaluate the quality of the SOM fit.
Visualization:
Component planes for individual features are generated to show their distribution across the SOM grid.
Categorical data distributions are visualized on the SOM grid to examine relationships between numerical and categorical variables.
Considerations:
This script serves as a basic example and is not intended for rigorous analysis or hyperparameter tuning. For more complex workflows, such as automated hyperparameter optimization, refer to scripts specifically designed for grid search or other tuning techniques.
1. Imports
[1]:
# Standard imports
import os
import sys
# Third party imports
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
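# Local imports
# Add the directory two levels above the working directory to sys.path
# so that the SOM package can be imported.
notebook_dir = os.getcwd()
parent_dir = os.path.abspath(os.path.join(notebook_dir, '..', '..'))
sys.path.append(parent_dir)
from SOM.utils.som_utils import SOM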
2. Load Data
[2]:
# Load the training data and the categorical data used later for plotting
train_dat_path = os.path.join(parent_dir, 'SOM', 'data', 'iris_training_data.csv')
other_dat_path = os.path.join(parent_dir, 'SOM', 'data', 'iris_categorical_data.csv')
train_dat = pd.read_csv(train_dat_path)
other_dat = pd.read_csv(other_dat_path)
3. Train SOM
[3]:
# Configure the SOM with a fixed set of hyperparameters
som = SOM(
    train_dat=train_dat,
    other_dat=other_dat,
    scale_method="minmax",
    x_dim=5,
    y_dim=2,
    topology="hexagonal",
    neighborhood_fnc="gaussian",
    epochs=17,
)
# Train the map
som.train_map()
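The `scale_method="minmax"` setting above presumably refers to min-max scaling, which rescales every feature to the range [0, 1] so that no single feature dominates the distance calculations. The following is a minimal sketch of that technique in NumPy. The function name `minmax_scale` and the sample values are illustrative assumptions, and the SOM class may implement its scaling differently.

```python
import numpy as np

def minmax_scale(values):
    """Rescale each column of a 2-D array to the range [0, 1]."""
    values = np.asarray(values, dtype=float)
    col_min = values.min(axis=0)
    col_range = values.max(axis=0) - col_min
    # Constant columns have zero range; divide by 1 instead so they map to 0.
    col_range[col_range == 0] = 1.0
    return (values - col_min) / col_range

# Illustrative values only (not taken from the iris files used above).
sample = np.array([[5.1, 3.5], [4.9, 3.0], [6.3, 3.3]])
scaled = minmax_scale(sample)
print(scaled)
```

After scaling, each column has a minimum of 0 and a maximum of 1, which puts features measured in different units on a comparable footing.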
4. Get Fit Metrics
[4]:
# Get fit metrics
pve = som.calculate_percent_variance_explained()
topographic_error = som.calculate_topographic_error()
print(f"Percent variance explained = {pve}%")
print(f"Topographic error = {topographic_error}")
Percent variance explained = 92.39063619063629%
Topographic error = 0.5533333333333333
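To make the two metrics more concrete, here is a minimal sketch of how they are commonly defined. PVE compares the squared distance between each sample and its best-matching unit (BMU) with the total variance of the data. Topographic error is the fraction of samples whose BMU and second-best unit are not adjacent on the grid. The function names, the toy data, and the adjacency rule (node positions at most one unit apart) are assumptions for illustration, and the SOM class may compute these values differently.

```python
import numpy as np

def best_two_units(data, weights):
    """Return the closest and second-closest node index for each sample."""
    # dists[i, j] is the squared Euclidean distance from sample i to node j.
    dists = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    order = np.argsort(dists, axis=1)
    return order[:, 0], order[:, 1]

def percent_variance_explained(data, weights):
    """100 * (1 - quantization sum of squares / total sum of squares)."""
    bmu, _ = best_two_units(data, weights)
    residual = ((data - weights[bmu]) ** 2).sum()
    total = ((data - data.mean(axis=0)) ** 2).sum()
    return 100.0 * (1.0 - residual / total)

def topographic_error(data, weights, coords):
    """Fraction of samples whose two best units are not grid neighbors."""
    bmu, second = best_two_units(data, weights)
    # Assumption: neighboring nodes sit exactly one unit apart in `coords`.
    grid_dist = np.linalg.norm(coords[bmu] - coords[second], axis=1)
    return float(np.mean(grid_dist > 1.0 + 1e-9))

# Toy example: three nodes on a line, one feature per sample.
data = np.array([[0.1], [0.9], [2.1]])
weights = np.array([[0.0], [1.0], [2.0]])
coords = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])

pve_sketch = percent_variance_explained(data, weights)
te_sketch = topographic_error(data, weights, coords)
print(pve_sketch, te_sketch)
```

Under this common definition, a topographic error of about 0.55 would mean that roughly 55% of samples have a best and second-best unit that are not neighbors on the grid, while a value of 0 indicates that the grid ordering is fully preserved.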
5. Plot SOMs (see output directory for figures)
[ ]:
# Plot component planes
som.plot_component_planes(output_dir="output/iris")
# Plot categorical data on the SOM grid
som.plot_categorical_data(output_dir="output/iris")
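As a rough illustration of what these two plots represent, the sketch below builds the underlying arrays without drawing anything. A component plane is the slice of the node weights for a single feature, arranged on the grid. A categorical map counts how many samples of each category land on each node. The weight layout (one row per node, row-major order), the random weights, and the helper names are assumptions for illustration and do not reflect the internals of the SOM class.

```python
from collections import Counter

import numpy as np

x_dim, y_dim, n_features = 5, 2, 4
rng = np.random.default_rng(0)
# Hypothetical trained weights: one row per node, one column per feature.
weights = rng.random((x_dim * y_dim, n_features))

def component_plane(weights, feature_index, x_dim, y_dim):
    """Arrange one feature's node weights on a (y_dim, x_dim) grid."""
    return weights[:, feature_index].reshape(y_dim, x_dim)

def category_counts_per_node(bmu_indices, labels):
    """Count the category labels of the samples mapped to each node."""
    counts = {}
    for node, label in zip(bmu_indices, labels):
        counts.setdefault(node, Counter())[label] += 1
    return counts

plane = component_plane(weights, 0, x_dim, y_dim)
counts = category_counts_per_node(
    [0, 0, 3], ["setosa", "setosa", "virginica"]
)
print(plane.shape)
print(counts)
```

A heatmap of `plane` (for example with `matplotlib.pyplot.imshow`) would give a rectangular approximation of a component plane. A hexagonal layout like the one configured above needs an offset for alternate rows.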