Taxonomy Extension – Project Proposal

Joomla! is a powerful Content Management System (CMS) and a popular publishing framework known for its aesthetics and its support for web application development. However, it is limited by its simple section/category/article hierarchy for organizing content.

This project intends to address a feature that Joomla! sorely lacks: a well-structured Component (API, frontend, and extensions/modules) that provides proper organization of content through Taxonomies and Tagging. Almost all content needs organization for easy access, and a well-designed Component dramatically increases the accessibility and usability of a site.

Personal Details

Name: Gartheeban Ganeshapillai
Email: garthee@theebgar.net
GTalk: garthee@gmail.com
Skype ID: garthee

Project Proposal

1. Introduction

A robust Taxonomy Component will eventually supersede similar but inferior implementations such as keywords, groups and other classifications, replacing them with greater functionality. Other foreseeable areas of application are

  1. Replacement of section / category with Taxonomy
  2. Use of Taxonomy for any such classification required by creating new Taxonomy trees for different applications
  3. Powerful search functionality, also known as faceted search
  4. Dynamic client side filtering, an innovative feature provided by Exhibit [1]

From a bird’s-eye view, a Taxonomy Component includes the following parts

  1. API (underlying layer for frontend modules and extensions, and backend management forms)
  2. Frontend modules
    a. Sidebars – for related content based on Terms attached, Tagcloud, Cumulus [2], etc
    b. Form integration for content creation (through extensions)
  3. Backend Management forms
  4. Support for other 3rd party extensions (such as Exhibit to provide dynamic faceted presentation, OpenCalais for auto tagging, etc.)

By design, the taxonomy system also includes

  1. Hierarchically organized terms also known as taxonomy tree
  2. Flexible free terms also known as tagging

I propose the development of a Taxonomy Component that is scalable and robust enough to be integrated into the core, and extensible enough to cater to future needs.

2. Based on the proposed idea:

Taxonomy Extension: “Create a 1.6 Taxonomy extension with the goal that this work could be integrated into the core Joomla CMS for future releases. The Taxonomy Extension will allow the ability to organise content for classification, improving on the current Section and Categories classification….” [3]

Expected Mentor : Allan Walker

3. Benefits to Joomla!:

A site’s popularity depends not only on its content but also on its accessibility and usability. People tend to spend very little time looking for an item on a site, even when they are confident that the item exists. A fine organization of content will enable users (when searching) to obtain the most relevant information by

  1. allowing content creators to organize content
  2. allowing admins to guide content creators in organizing content by defining protocols such as taxonomy trees, related items, etc
  3. allowing users to access content easily by providing interfaces that expose the classification system
  4. allowing the extensibility of the system
    • in creating terms, so that other plugins or 3rd party tools can assist (this is with a view to allowing features like auto tagging in the future)
    • in presenting the UI to the users, so that users can choose different formats to expose their classification system (tag clouds, taxonomy trees, cumulus tag clouds, exhibit integrations)
    • in managing the classification system, so that admin controls can be enhanced to address future management needs.

4. Project Details:

The project consists of the following phases.

1. API

This section deals with the database and provides methods to create taxonomy trees, manage them, etc. The intended structure allows greater flexibility and organization, so that the following functions are inherently available

  1. Multiple select
  2. Free tagging
  3. Hierarchical representation
  4. Relationship between terms (such as similar term)
  5. Separate trees for disparate classification

This can be better explained through the following table structures, which maintain 3NF (jos_taxonomy_relationship can be further normalized)

  • jos_taxonomy_tree : |id|name|description| other fields … |
  • jos_taxonomy_leave : |id|tree_id |name|description| other fields … |
  • jos_taxonomy_relationship : |id1| id2| type_of_relationship (can be parent, similar, etc)|
  • jos_taxonomy_mapping : |term_id|content_id| other fields .. |
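
The table structures above could be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than MySQL, the column types and keys are guesses (the proposal lists only column names), and it assumes that in jos_taxonomy_relationship id1 is the parent and id2 is the child when the type is 'parent'. The helper children_of is hypothetical.

```python
import sqlite3

# Assumed column types and keys; the proposal gives only the column names.
SCHEMA = """
CREATE TABLE jos_taxonomy_tree (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    description TEXT
);
CREATE TABLE jos_taxonomy_leave (
    id INTEGER PRIMARY KEY,
    tree_id INTEGER NOT NULL REFERENCES jos_taxonomy_tree(id),
    name TEXT NOT NULL,
    description TEXT
);
CREATE TABLE jos_taxonomy_relationship (
    id1 INTEGER NOT NULL REFERENCES jos_taxonomy_leave(id),
    id2 INTEGER NOT NULL REFERENCES jos_taxonomy_leave(id),
    type_of_relationship TEXT NOT NULL,
    PRIMARY KEY (id1, id2, type_of_relationship)
);
CREATE TABLE jos_taxonomy_mapping (
    term_id INTEGER NOT NULL REFERENCES jos_taxonomy_leave(id),
    content_id INTEGER NOT NULL,
    PRIMARY KEY (term_id, content_id)
);
"""


def create_schema(conn):
    """Create the four taxonomy tables on the given connection."""
    conn.executescript(SCHEMA)


def children_of(conn, term_id):
    """Return the names of the child terms of term_id, sorted by name.

    Assumes id1 is the parent and id2 the child for 'parent' relationships.
    """
    rows = conn.execute(
        "SELECT l.name FROM jos_taxonomy_relationship r "
        "JOIN jos_taxonomy_leave l ON l.id = r.id2 "
        "WHERE r.id1 = ? AND r.type_of_relationship = 'parent' "
        "ORDER BY l.name",
        (term_id,),
    ).fetchall()
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
create_schema(conn)
conn.execute("INSERT INTO jos_taxonomy_tree (id, name) VALUES (1, 'Topics')")
conn.executemany(
    "INSERT INTO jos_taxonomy_leave (id, tree_id, name) VALUES (?, 1, ?)",
    [(1, "Science"), (2, "Physics"), (3, "Biology")],
)
conn.executemany(
    "INSERT INTO jos_taxonomy_relationship VALUES (?, ?, 'parent')",
    [(1, 2), (1, 3)],
)
print(children_of(conn, 1))
```

Keeping the hierarchy in a separate relationship table, rather than in a parent column on the term itself, is what lets the same structure hold "similar" links and other relationship types alongside parent links.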

Further, with scalability in mind, the results of complex queries, such as those that build a whole taxonomy tree, will be kept in static variables inside the appropriate functions, so that repeated calls are answered without affecting performance. Such measures will be essential when Taxonomy is integrated into the core and offered as the answer to virtually every classification requirement.
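
The caching idea could look roughly like the sketch below. It is written in Python for illustration, with a module-level dictionary standing in for a static variable; load_tree_from_db, get_tree and QUERY_LOG are hypothetical names, and the loader merely simulates the expensive query.

```python
# Records every simulated database hit so the effect of the cache is visible.
QUERY_LOG = []


def load_tree_from_db(tree_id):
    """Hypothetical stand-in for the expensive query that builds a whole tree."""
    QUERY_LOG.append(tree_id)
    return {"id": tree_id, "terms": ["Science", "Physics", "Biology"]}


# Plays the role of a static variable: it outlives individual calls.
_TREE_CACHE = {}


def get_tree(tree_id, reset=False):
    """Return the tree, running the expensive query at most once per tree_id.

    Pass reset=True after the tree has been modified to force a reload.
    """
    if reset:
        _TREE_CACHE.pop(tree_id, None)
    if tree_id not in _TREE_CACHE:
        _TREE_CACHE[tree_id] = load_tree_from_db(tree_id)
    return _TREE_CACHE[tree_id]


first = get_tree(1)
second = get_tree(1)  # served from the cache, no second query
print(len(QUERY_LOG))
```

The reset flag is one possible way to handle invalidation: any function that changes a tree would call get_tree with reset=True so later callers do not see stale data.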

2. Front end modules

The first objective of this phase is to expose the taxonomy classification to users. For example, we could provide tag clouds (or fancy forms of them such as Cumulus), taxonomy browsing (as an alternative to menus) and other features. I would like to restrict it to Tag clouds, Related Content, and Taxonomy browsing (optional) for the scope of GSoC. However, with a flexible, extensible API, it would not be hard to extend it as we see fit in the future.
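
As an illustration of what a tag cloud module has to compute, the sketch below maps each term's usage count to a font size by linear scaling. This is one common approach, not a decided design; the function name, the size range and the sample counts are all assumptions.

```python
def tag_cloud_sizes(counts, min_size=10, max_size=24):
    """Map each term's usage count to a font size, scaled linearly.

    The least used term gets min_size and the most used gets max_size.
    """
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    if lo == hi:
        # All terms are equally used: give every term the midpoint size.
        mid = (min_size + max_size) // 2
        return {term: mid for term in counts}
    span = hi - lo
    return {
        term: min_size + round((n - lo) * (max_size - min_size) / span)
        for term, n in counts.items()
    }


# Hypothetical usage counts, as they might come from the mapping table.
sizes = tag_cloud_sizes({"joomla": 12, "drupal": 6, "taxonomy": 1})
print(sizes)
```

When a few terms dominate, scaling the logarithm of the count instead of the raw count keeps the rarely used terms readable.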

The second focus of this area is to let content creators create or apply taxonomy terms to their content (the mapping is performed at this stage). This can be extended to provide
• Autocomplete of terms as you type
• Suggestions of terms (based on the content body)
• Auto tagging (using third party tools)

I intend to provide the autocomplete feature within the scope of GSoC. However, I would love to see OpenCalais integration in the near future too. Auto tagging will be hugely helpful, even essential, on a community site where users cannot all be burdened with or entrusted to submit correct terms, if they submit any at all.
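
The autocomplete feature boils down to a prefix lookup over the existing terms. The sketch below shows one way to do it on the server side, using a binary search over a sorted term list; the function name, the limit of five suggestions and the sample terms are assumptions, and a real implementation would more likely issue a LIKE 'prefix%' query against the terms table.

```python
from bisect import bisect_left


def autocomplete(sorted_terms, prefix, limit=5):
    """Return up to `limit` terms that start with `prefix` (case-insensitive).

    `sorted_terms` must be lowercase and sorted, so that a binary search
    can jump straight to the first candidate.
    """
    prefix = prefix.lower()
    if not prefix:
        return []
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        # Matching terms are contiguous, so stop at the first non-match.
        if not term.startswith(prefix) or len(matches) == limit:
            break
        matches.append(term)
    return matches


# Hypothetical terms already present in the taxonomy.
TERMS = sorted(["tagging", "tag cloud", "taxonomy", "timeline", "tree"])
suggestions = autocomplete(TERMS, "Ta")
print(suggestions)
```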

3. Backend Management forms

In addition to general settings, the Taxonomy Component will require many administrative configurations, such as creating a new Taxonomy tree and determining who can add new terms while creating content, if terms can be added at all. Further, administrators might also want to perform mass actions such as tagging content in bulk, renaming or remapping tags, etc.

I would like to cover as much as possible within GSoC, while certainly completing the general management tasks. The exact scope of this task is to be defined in discussion with the mentor.
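
Remapping a tag in bulk can be sketched as a single transaction on the mapping table: copy every mapping of the old term to the new term, then delete the old ones. The example below assumes the jos_taxonomy_mapping layout with a (term_id, content_id) primary key and uses Python's sqlite3 module for illustration; remap_term is a hypothetical helper, and INSERT OR IGNORE is SQLite syntax (MySQL spells it INSERT IGNORE).

```python
import sqlite3


def remap_term(conn, old_term_id, new_term_id):
    """Move every mapping from old_term_id to new_term_id, skipping duplicates."""
    with conn:  # one transaction: either all rows move or none do
        conn.execute(
            "INSERT OR IGNORE INTO jos_taxonomy_mapping (term_id, content_id) "
            "SELECT ?, content_id FROM jos_taxonomy_mapping WHERE term_id = ?",
            (new_term_id, old_term_id),
        )
        conn.execute(
            "DELETE FROM jos_taxonomy_mapping WHERE term_id = ?",
            (old_term_id,),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jos_taxonomy_mapping ("
    "term_id INTEGER, content_id INTEGER, PRIMARY KEY (term_id, content_id))"
)
# Content item 101 carries both terms, so the remap must not duplicate it.
conn.executemany(
    "INSERT INTO jos_taxonomy_mapping VALUES (?, ?)",
    [(1, 100), (1, 101), (2, 101), (2, 102)],
)
remap_term(conn, old_term_id=2, new_term_id=1)
rows = conn.execute(
    "SELECT term_id, content_id FROM jos_taxonomy_mapping ORDER BY content_id"
).fetchall()
print(rows)
```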

4. Integration with 3rd party tools to enhance Taxonomy Component

First is the integration with Exhibit, a dynamic faceted browsing tool [4] from Haystack group, MIT [5] that uses many factors including taxonomy to filter and present content in real time. In addition, we could use Cumulus for tagcloud and Opencalais for autotagging.

While emphasizing that a well-structured design can be extended easily to accommodate many novel features and tools around the Taxonomy system, I would like to focus mainly on a well-designed backend within the scope of GSoC.

5. Background and foundation:

I understand that developing this particular module requires fluency in four basic areas: Joomla!, PHP/MySQL, Taxonomy systems, and the tools around Taxonomy systems. I am well experienced with the latter three, and I am confident that the mentors and community will compensate for my lack of experience with Joomla!. If the proposal is accepted, I intend to use the community bonding time to learn the design principles of Joomla! (mainly from books [6]) and to finalize the design framework with the help of the mentors and community.

This proposal is highly influenced by the experience I have gained in developing modules around Drupal’s taxonomy framework and by my work with the Exhibit tool to facilitate “Faceted Search” and “Faceted Browsing”. I invite you to look at the implementations on my sites (a few were customized by me to suit my requirements, though they were written for a different platform)

  1. Faceted Search [4] : http://theebgar.net/all/results
  2. Exhibit in action (with timeline) [5] : http://theebgar.com
  3. In addition, I have worked on a similar project enabling dynamic viewing and filtering, which can be found at http://old.theebgar.net/history, where extensive use of Ajax provides real-time filtering of content regardless of the amount of content available on the site.

6. Risks

The main risk I can see is any inherent limitation, arising from the structure of Joomla!, in achieving the aforementioned objectives. I am highly confident that the community members and mentors will be able to foresee such limitations more easily than I can.

Considering the tight deadlines and conditions of the client projects I have worked on in the past, and the amount of free time I am going to have between graduation and enrollment in grad school, I hardly see a possible conflict in time management.

Roadmap

This Component will address the aforementioned needs on its own, independently and modularly, as explained above.

1. Deliverables:

  1. API
  2. Frontend modules for sidebar and forms
  3. Backend Management Forms
  4. 3rd party integration – Exhibit, Cumulus and OpenCalais (if time permits)

2. Project Schedule :

I have already laid the foundation of the project and expect to start working immediately if the proposal is accepted.

• First week of May – Learn the internals of Joomla!, how the system works, coding conventions, and design patterns
• Mid of May – Complete the design of framework and API, after discussing with mentors
• End of May – Complete the basic development of the API
• Mid of June – Complete the testing and review on the API and Management Forms
• End of June – Complete the development of content creation form integrations
— Mid term evaluations —
• Mid of July – Complete the development of modules for sidebars
• Mid of July – Integration with Exhibit, Cumulus and Opencalais
— Final evaluation — Bug fixing and Documentation

Bio

Open Source Development Experience

I have been programming since my childhood, and in recent years I have been trying to get involved in major projects like Drupal, Audacity, WordPress and Joomla!

I took part in GSOC 2007 with Drupal on the ULINK project [7], developing the ULINK module to generalize filtering and UAUTO to autocomplete links through a popup with suggestions. The latest releases, with a demo, can be found at [8a], and I released a restructured version [8b] for Drupal 6 in February 2008.

I am a freelancer in open source web development and an Electronic and Telecommunication Engineer, and I have worked on several client projects [9] in this context.

Work/Internship Experience

I did my internship at Motorola [10] from September 2007 to April 2008, mainly working on power management units, where I focused on writing applications and driver components to automate testing. During the latter part of the internship, I researched the potential of improvising high contrast cameras from 2D barcode scanners to authenticate users based on palm print recognition. Further, I did a fair amount of research on my own and, after the internship, published a paper on using Principal Component Analysis (PCA) to classify multi-dimensional objects in an unsupervised manner.

Since my participation in GSOC 2007, I have been involved with many client projects on web development and have worked with Figment SRL, Italy and Research Applications and Financial Tracking (RAFT) INC, USA [11]. The projects I have handled and modules developed are listed at [9].

Academic Experience

I am currently in the final months of my first degree in Electronic and Telecommunication Engineering [12a] at the University of Moratuwa [12b], where I am ranked first in the faculty with a GPA of 4.15 on a 4.2 scale.

I have been admitted to the PhD program at the Massachusetts Institute of Technology (MIT) [13a] and will start working at the popular CSAIL [13b] (where Exhibit and other popular Web 2.0 tools were developed) from September 2009. My interest lies in the area of information retrieval and information management, and in this regard this particular project (the taxonomy system) hugely attracts me. I will be graduating in April 2009 and look forward to spending the gap time on Open Source Development.

Motivation

I have a strong background in PHP/MySQL, and I have been doing web development since 2005 and working with Drupal since GSOC 2007. However, now that I have become fully conversant with Drupal, I am looking forward to working with similar publishing platforms to expand my horizons, and in this regard I chose to work with Joomla!

Further, I understand that this is a great opportunity for me to start down a new road with Joomla! and to put into practice the knowledge and design skills I have gained over the past few years.

References

[1] http://simile.mit.edu/wiki/Exhibit
[2] http://wordpress.org/extend/plugins/wp-cumulus/
[3] http://docs.joomla.org/Summer_of_Code_2009_Project_Ideas
[4] http://en.wikipedia.org/wiki/Faceted_browser
[5] http://groups.csail.mit.edu/haystack
[6] http://www.amazon.com/s/ref=nb_ss_gw?url=search-alias%3Daps&field-keywords=joomla&x=0&y=0
[7] http://drupal.org/project/ulink
[8a] http://project.theebgar.net/drupal-modules/ulink/ulink-52
[8b] http://project.theebgar.net/drupal-modules/ulink/ulink-61
[9] http://project.theebgar.net
[10] http://theebgar.net/all/results/taxonomy:18
[11] http://openbioraft.com
[12a] http://ent.mrt.ac.lk
[12b] http://mrt.ac.lk
[13a] http://mit.edu
[13b] http://www.csail.mit.edu/


2 Responses

  1. Barnali Roy Choudhury says:

    Will you provide me some extention features, category navigation or tree style navigation or faceted navigation of joomla or Drupal cms.
