2017-05-23 18:35:57 - Updated on 2017-05-24 19:48:42

Harpia - Hierarchical Classification Framework

I am making some structural changes to the source code and will soon open it to everyone interested in contributing to its development. The source code will be available in a GitHub repository.

Harpia is an open-source Java library for developing machine learning algorithms that learn from hierarchically labeled examples. It is named after the harpy eagle, Harpia harpyja (picture below).

The harpy eagle is the largest and most powerful raptor found in the Americas, and among the largest extant species of eagles in the world. It usually inhabits the upper (emergent) canopy layer of tropical lowland rainforests, such as the Amazon rainforest in Brazil. Find out more about the harpy eagle.

The task of learning from hierarchically labeled examples consists of building a classifier that assigns each example to a set of classes consistent with a predefined taxonomy organized hierarchically (Figure 1). In other words, the classes in the taxonomy are related to each other through generalization and specialization relationships. A growing number of applications are especially suited to classifying objects into hierarchical structures. This is the nature of many real-world problems, such as music genre classification, semantic annotation of images and video, web page categorization and functional genomics, to mention just a few.



Figure 1: Glass taxonomy example. 
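To make the idea of a label set being "consistent with the taxonomy" concrete, here is a minimal sketch in plain Java. It is not Harpia's data model. The class `TaxonomySketch`, its methods, and the class names used in the glass example (`glass`, `window`, `non-window`, `float`, `non-float`) are assumptions made for illustration only. The sketch assumes a tree-shaped taxonomy, where every class except the root has exactly one parent.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Illustrative tree-shaped taxonomy (hypothetical, not part of Harpia). */
class TaxonomySketch {
    // child -> parent; the root is the only class without an entry
    private final Map<String, String> parentOf = new HashMap<>();

    /** Registers 'child' as a specialization of 'parent'. */
    public void addClass(String child, String parent) {
        parentOf.put(child, parent);
    }

    /** Returns the label followed by all its generalizations, up to the root. */
    public Set<String> withAncestors(String label) {
        Set<String> path = new LinkedHashSet<>();
        for (String c = label; c != null; c = parentOf.get(c)) {
            path.add(c);
        }
        return path;
    }

    /** A label set is consistent if it contains every ancestor of each of its labels. */
    public boolean isConsistent(Set<String> labels) {
        for (String label : labels) {
            if (!labels.containsAll(withAncestors(label))) {
                return false;
            }
        }
        return true;
    }

    /** Builds a small glass taxonomy; the class names are assumed for the example. */
    public static TaxonomySketch glassExample() {
        TaxonomySketch t = new TaxonomySketch();
        t.addClass("window", "glass");
        t.addClass("non-window", "glass");
        t.addClass("float", "window");
        t.addClass("non-float", "window");
        return t;
    }

    public static void main(String[] args) {
        TaxonomySketch t = glassExample();
        System.out.println(t.withAncestors("float"));
        System.out.println(t.isConsistent(new HashSet<>(Arrays.asList("glass", "float"))));
    }
}
```

In this sketch, the set {glass, float} is rejected because it skips the intermediate class `window`: assigning a specialized class implies assigning all of its generalizations.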

The framework is built on the Weka machine learning API. Currently, the library includes only one local method with a top-down class-prediction strategy, which we call Hierarchical Binary Relevance, or simply HBR. We are working on new algorithms and will make them available soon.
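The general idea behind a local method with top-down prediction can be sketched as follows. This is not Harpia's implementation of HBR, and it does not use the Weka API. It assumes one binary decision per class, represented here by a `java.util.function.Predicate` standing in for a trained local classifier. The class `TopDownSketch`, its methods, the class names and the thresholds are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

/** Illustrative top-down prediction with one binary model per class (hypothetical). */
class TopDownSketch {
    private final Map<String, List<String>> childrenOf = new HashMap<>();
    // Stand-ins for trained local binary classifiers, one per non-root class
    private final Map<String, Predicate<double[]>> localModel = new HashMap<>();

    /** Adds 'child' under 'parent' together with its local binary model. */
    public void addClass(String parent, String child, Predicate<double[]> model) {
        childrenOf.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
        localModel.put(child, model);
    }

    /** Predicts a label set by walking the taxonomy from the root downwards. */
    public Set<String> predict(String root, double[] x) {
        Set<String> predicted = new LinkedHashSet<>();
        descend(root, x, predicted);
        return predicted;
    }

    // A child is visited only if its parent was accepted, so a rejected
    // class blocks its whole subtree and the result stays consistent.
    private void descend(String node, double[] x, Set<String> predicted) {
        List<String> children = childrenOf.getOrDefault(node, Collections.<String>emptyList());
        for (String child : children) {
            if (localModel.get(child).test(x)) {
                predicted.add(child);
                descend(child, x, predicted);
            }
        }
    }

    /** Toy model with hand-written rules in place of trained classifiers. */
    public static TopDownSketch toyModel() {
        TopDownSketch m = new TopDownSketch();
        m.addClass("glass", "window", x -> x[0] > 0.5);
        m.addClass("glass", "non-window", x -> x[0] <= 0.5);
        m.addClass("window", "float", x -> x[1] > 0.5);
        m.addClass("window", "non-float", x -> x[1] <= 0.5);
        return m;
    }

    public static void main(String[] args) {
        // prints [window, non-float]
        System.out.println(toyModel().predict("glass", new double[] {0.9, 0.2}));
    }
}
```

Because each class is decided independently, an example can receive several sibling classes or none at all. The top-down walk only guarantees that no class is predicted without its parent.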

The hierarchical classification evaluation module is already implemented, with a variety of evaluation measures in three categories: example-based, label-based and level-based. Micro- and macro-averaged measures are also available.
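As an illustration of how micro and macro averaging differ, the sketch below computes label-based precision both ways in plain Java. It uses the standard textbook definitions, not necessarily the ones Harpia implements. The class `AveragingSketch`, its methods and the toy data are hypothetical. Label sets are assumed to already include ancestor classes, and a class that is never predicted contributes a precision of 0, which is a convention chosen for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative micro- and macro-averaged precision over labels (hypothetical). */
class AveragingSketch {
    // Toy data: true and predicted label sets for three examples (assumed values)
    static final List<String> LABELS = Arrays.asList("window", "float", "non-float", "non-window");
    static final List<Set<String>> TRUTH = Arrays.asList(
            set("window", "float"), set("window", "non-float"), set("non-window"));
    static final List<Set<String>> PREDICTED = Arrays.asList(
            set("window", "float"), set("window", "float"), set("window"));

    static Set<String> set(String... labels) {
        return new HashSet<>(Arrays.asList(labels));
    }

    /** Returns {true positives, false positives} for one label. */
    static int[] counts(String label, List<Set<String>> truth, List<Set<String>> predicted) {
        int tp = 0;
        int fp = 0;
        for (int i = 0; i < truth.size(); i++) {
            if (predicted.get(i).contains(label)) {
                if (truth.get(i).contains(label)) tp++; else fp++;
            }
        }
        return new int[] {tp, fp};
    }

    /** Micro average: pool the counts of all labels, then divide once. */
    static double microPrecision(List<String> labels, List<Set<String>> truth,
                                 List<Set<String>> predicted) {
        int tp = 0;
        int fp = 0;
        for (String label : labels) {
            int[] c = counts(label, truth, predicted);
            tp += c[0];
            fp += c[1];
        }
        return (tp + fp == 0) ? 0.0 : (double) tp / (tp + fp);
    }

    /** Macro average: compute precision per label, then take the plain mean. */
    static double macroPrecision(List<String> labels, List<Set<String>> truth,
                                 List<Set<String>> predicted) {
        double sum = 0.0;
        for (String label : labels) {
            int[] c = counts(label, truth, predicted);
            int total = c[0] + c[1];
            sum += (total == 0) ? 0.0 : (double) c[0] / total; // never predicted: 0 by convention
        }
        return labels.isEmpty() ? 0.0 : sum / labels.size();
    }

    public static void main(String[] args) {
        System.out.println("micro = " + microPrecision(LABELS, TRUTH, PREDICTED));
        System.out.println("macro = " + macroPrecision(LABELS, TRUTH, PREDICTED));
    }
}
```

On this toy data the micro average is 3/5 and the macro average is 7/24. The macro value is lower because the two classes that are never predicted weigh as much as the frequent ones, whereas the micro average is dominated by the classes that are predicted most often.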

Although the library is already functional, it has no graphical user interface (GUI). Using the library from the command line is not currently supported either. The Getting Started page and the complete Documentation section are under revision and will be available soon.

For now, if you want to use it, please contact me and ask for a copy (see the comment at the top of the page).

Any contribution is welcome! 

Citation: Jean Metz. Abordagens para aprendizado semissupervisionado multirrótulo e hierárquico [Approaches to multi-label and hierarchical semi-supervised learning]. Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, 2011. Doctoral thesis.

Attention! Original content hosted at: http://www.jean.metzz.org/harpia-project.