
An SVM Classifier for Large-Scale Biological Applications

Support Vector Machines (SVMs) are a supervised classification method widely used in bioinformatics. As of today (February 2007), a Google search for the keywords SVM + Support Vector Machine + Biology returns more than 1 million results.

Possible applications in bioinformatics include:

  • diagnosis by gene expression profile,
  • discriminant analysis of microarray data,
  • subcellular localization of proteins,
  • analysis of regulatory regions,

as well as prediction of protein-protein interaction, protein structure, protein function, and protein stability.


Our SVM implementation is aimed at large-scale data sets; it is therefore based on parallel computing and on highly optimized linear algebra kernels for cache-based architectures. We implemented an MPI-based algorithm for distributed-memory machines, with effective parallel I/O and a high-performance matrix-matrix product.
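To give a rough idea of how a matrix-matrix product can be divided among the processes of a distributed-memory machine, here is a minimal sketch in plain Python. It is not our implementation: the row-block partitioning, the choice of a linear kernel matrix K = X * X^T, and the names row_blocks, gram_block and gram_matrix are all assumptions made for illustration, and the "workers" run one after another rather than under MPI.

```python
# Illustrative sketch only. The partitioning scheme and all function names
# are assumptions; the real code uses MPI and optimized linear algebra kernels.

def row_blocks(n_rows, n_workers):
    """Split range(n_rows) into n_workers contiguous, near-equal slices."""
    base, extra = divmod(n_rows, n_workers)
    blocks, start = [], 0
    for rank in range(n_workers):
        # The first `extra` workers take one additional row each.
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def gram_block(X, rows):
    """Compute the given rows of K = X * X^T (the share of one worker)."""
    return [
        [sum(a * b for a, b in zip(X[i], X[j])) for j in range(len(X))]
        for i in rows
    ]


def gram_matrix(X, n_workers):
    """Assemble K from the per-worker row blocks (serially, without MPI)."""
    K = []
    for rows in row_blocks(len(X), n_workers):
        K.extend(gram_block(X, rows))
    return K


# Three samples with two features each, split between two workers.
X = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]
K = gram_matrix(X, n_workers=2)
```

Each worker needs only its own block of output rows, so the blocks can be computed independently; in a real distributed run the per-block inner loops would be replaced by calls to an optimized matrix-matrix routine.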

Benchmarks on a medium-scale biomedical dataset show performance around 86% of theoretical peak and parallel efficiency around 80% on 16 CPUs, with an improving trend for larger datasets.
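For readers unfamiliar with the term, parallel efficiency is conventionally defined as the speedup divided by the number of CPUs. The short sketch below, assuming that standard definition is the one used here, shows what an efficiency of about 80% on 16 CPUs implies; the function names are hypothetical and no timings from the actual benchmark are reproduced.

```python
def speedup(t_serial, t_parallel):
    """Ratio of single-CPU run time to parallel run time."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial, t_parallel, n_cpus):
    """Speedup divided by the number of CPUs (1.0 means ideal scaling)."""
    return speedup(t_serial, t_parallel) / n_cpus


def implied_speedup(efficiency, n_cpus):
    """Speedup implied by a given efficiency on n_cpus processors."""
    return efficiency * n_cpus


# About 80% efficiency on 16 CPUs corresponds to a speedup of roughly 12.8,
# i.e. the run is about 12.8 times faster than on a single CPU.
s = implied_speedup(0.80, 16)
```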

For further information on this software tool and on how to gain access to this resource, please contact Fabio Maggio at the CRS4 Bioinformatics Lab, maggio@crs4.it

If you are interested in examples of SVM applications in bioinformatics, see: Recent SVM applications to bioinformatics: a very partial bibliography
