PyCCEA is an open-source package developed as part of ongoing doctoral research. It provides cooperative co-evolutionary strategies tailored for feature selection in large-scale and high-dimensional problems. The framework adopts a modular, decomposition-based approach and is intended for researchers and practitioners tackling complex feature selection tasks.
Note: PyCCEA is a work in progress. Stay tuned for improvements and new algorithm implementations.
To install the package directly from PyPI, use the following command:
pip install pyccea
Alternatively, to install the latest version directly from GitHub:
pip install git+https://github.com/pedbrgs/pyccea.git
Ensure you have pip installed and an active internet connection to download dependencies.
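To confirm that the installation succeeded, one option is to query the installed version from Python. The sketch below uses only the standard library; the helper name installed_version is hypothetical and not part of PyCCEA.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str = "pyccea"):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"pyccea version: {found}" if found else "pyccea is not installed")
```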
This quickstart demonstrates how to use the CCFSRFG1 algorithm, a cooperative co-evolutionary algorithm (CCEA) variant with random feature grouping, to perform feature selection on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset.
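For readers new to random feature grouping, the general idea is to shuffle the feature indices and split them into disjoint subcomponents that are then optimized cooperatively. The following is a minimal NumPy sketch of that idea only, not PyCCEA's actual implementation; the function name random_feature_grouping and the choice of three subcomponents are assumptions made for illustration (WDBC has 30 features).

```python
import numpy as np


def random_feature_grouping(n_features: int, n_subcomps: int, seed=None):
    """Randomly partition feature indices into n_subcomps disjoint groups."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_features)
    # array_split also handles feature counts that do not divide evenly
    return [np.sort(group) for group in np.array_split(shuffled, n_subcomps)]


# Illustrative values: 30 features split into 3 subcomponents
groups = random_feature_grouping(n_features=30, n_subcomps=3, seed=0)
for i, group in enumerate(groups):
    print(f"Subcomponent {i}: {group.tolist()}")
```

Each feature lands in exactly one subcomponent, so the groups together cover the full feature set without overlap.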
In this example, you will:
- Load the dataset using the DataLoader utility.
- Configure the dataset and algorithm from .toml files.
- Run the optimization process.
import toml
import importlib.resources
from pyccea.coevolution import CCFSRFG1
from pyccea.utils.datasets import DataLoader
# Load dataset parameters
with importlib.resources.open_text("pyccea.parameters", "dataloader.toml") as toml_file:
    data_conf = toml.load(toml_file)
# Initialize the DataLoader with the specified dataset and configuration
data = DataLoader(dataset="wdbc", conf=data_conf)
# Prepare the dataset for the algorithm (e.g., preprocessing, splitting)
data.get_ready()
# Load algorithm-specific parameters
with importlib.resources.open_text("pyccea.parameters", "ccfsrfg.toml") as toml_file:
    ccea_conf = toml.load(toml_file)
# Initialize the cooperative co-evolutionary algorithm
ccea = CCFSRFG1(data=data, conf=ccea_conf, verbose=False)
# Start the optimization process
ccea.optimize()
The best feature subset found is stored in the attribute best_context_vector, a binary array where 1 indicates a selected feature and 0 indicates an unselected one.
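A binary mask of this kind can be turned into column indices and used to reduce a feature matrix. The sketch below shows one way to do that with NumPy; the mask and the matrix X are made-up stand-ins, since the real values depend on the run and on how the data is held in your own code.

```python
import numpy as np

# Hypothetical binary mask standing in for ccea.best_context_vector
best_context_vector = np.array([1, 0, 0, 1, 1, 0])

# Hypothetical feature matrix: 3 samples, one column per feature
X = np.arange(18).reshape(3, 6)

# Indices of the features marked with 1
selected_idx = np.flatnonzero(best_context_vector)

# Keep only the selected columns
X_selected = X[:, selected_idx]

print("Selected feature indices:", selected_idx.tolist())
print("Reduced shape:", X_selected.shape)
```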
If you use this code in any way, please cite the source:
@Misc{PyCCEA,
title = {PyCCEA: A Python package of cooperative co-evolutionary algorithms for feature selection in high-dimensional data},
author = {Pedro Vinicius A. B. Venancio},
howPublished = {\url{https://github.com/pedbrgs/PyCCEA}},
year = {2024}
}
Please submit any bug reports, questions, or suggestions directly in the repository.