Fulltext search on courses databases

From Dokeos

Jump to: navigation, search

Contents

What is it?

The fulltext search feature was first seen in MySQL version 3.23.23. It is a functionality that allows the database user (the Dokeos developer in this case) to search the database for a specific word, and get a ranking on this word for every element it searches in. As one of the Dokeos clients requested it, I (Yannick) have implemented it on their system and it works nicely, but brings several dependencies which I will introduce in the next section.

The aim of all this is to authorise visitors on a e-learning portal to search for a course by topic. The topic word (minimum four letters) is entered in a search box and the system gives a list of courses, ordered by relevancy.

Sadly, drawbacks are numerous and if this feature is integrated into Dokeos, it should be optional and activated if required at installation.

A new opportunity appearing now is to use mnoGoSearch which seems to be made in a good way to support multiple databases as well as multiple languages. This is now on my roadmap but a lot of tuning must be done and it is not planned for next week...

Advantages

  • This feature, as described, enables a user to search for a course by topic. Not only on the title, but on each field that has been searched for. So for example, we could search in all courses descriptions and get the best results from there.
  • The feature is already developed into MySQL, so there is not big work for the programmer.

Drawbacks

  • It is higly database dependant. This means one more step in the -no database system change- direction
  • It is language based. Stopwords exist for the language the database was set up for. This means there needs to be one language for each database, depending on the language of the course. I am not sure about this, nor am I sure that it is possible for MySQL to have different languages for each database...
  • It needs to add an index on the database fields you want to... index...
  • It requires at least three possible fields, or it will not return any result (so if I want to search the descriptions of a course, I need at least three descriptions). What I do to search course_descriptions now is create a temporary table for each course_description table, with one or more fields being the originals and two "empty" fields being generated just for the search. This is of course a huuuuuge overhead for the search.
  • Searches in MySQL versions prior to 4.1 (check that) cannot search on terms smaller than 4 characters. This is particularly annoying for courses as you might, for example, search for PHP... Even versions after 4.1 expect you to change a setting in the MySQL config file...

So this fulltext search is definitely not optimized. But it is useful.

Code

The code is rather simple, except for the temporary table that needs to be created...

I added a homepage plugin functionality which enables to enter a text and be sent to the search results page.

Apart from that, the database needs to be checked for (real) courses and then all courses databases need to be searched with the special fulltext query:

$query = "SELECT *, MATCH(`description`) AGAINST ('$search_term') "
        ."FROM `$course_database`.`course_description` ORDER BY 2 DESC";

Then every result must be brought into a general resulting array, and this array must finally be sorted.

Conclusion

I think this plugin should be discussed further and only be part of the next Dokeos release as an optional feature. Also, this could be extended to documents indexing, but that's another step further...

See also

Personal tools