Little Island Dev

API

If you are thinking of creating your own Little Island Client or another system that uses the Little Island Service, or if you are interested in understanding the client-server relationship, then the API Description is the place to start.

Versioning

At present the version number for both the server and client systems is based on the date of the last change. It will usually take this form, truncated from the right for brevity: version.year.month.day.increment. The current alpha version is 0 so the full version number may look like this: '0.09.02.17.2' or more generally '0.09.02'.
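As a rough illustration of the scheme described above, the following Python sketch parses a version string into a tuple of integers so that full and truncated forms can be compared. The name parse_version is hypothetical and not part of the Little Island code, and Python is used only for illustration.

```python
from typing import Tuple

# Field order described in the text: version.year.month.day.increment
MAX_FIELDS = 5


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a possibly truncated version string into a tuple of integers."""
    parts = text.split(".")
    if len(parts) > MAX_FIELDS or not all(p.isdigit() for p in parts):
        raise ValueError(f"not a valid version string: {text!r}")
    return tuple(int(p) for p in parts)


full = parse_version("0.09.02.17.2")
short = parse_version("0.09.02")
print(full)   # (0, 9, 2, 17, 2)
print(short)  # (0, 9, 2)
# A truncated version is a prefix of every full version from that month.
print(full[:len(short)] == short)  # True
```

Because the fields run from most to least significant, ordinary tuple comparison orders versions by date.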

Visit the Source Code page for the latest version of the server scripts.

The latest version of the Little Island Greasemonkey Client is always available at this address: http://littleisland.ruinsofmorning.net/littleislandclient.user.js

Areas of Interest

Caching

The issue for caching is overhead. At the time of writing, the caching system is extremely rudimentary. Whenever a comment is posted, the caching routine checks for a sleeping cache operation. If there is none, it spawns a child process and registers a running operation (basically just a postdated time stamp on the node). The child process sleeps for a few seconds to collect any following changes to the node. Upon waking, it pulls in a full set of comment data for that node (literally comment.* with left joins on the user table, but excluding the content table that contains the titles and bodies) and begins rebuilding all of the cache pages, starting with page 1 and re-paginating all the way to the last page, even if there are twenty pages and the last one contains nothing but threads that have not been touched in years.

With any serious load this is going to become unworkable. Right now the only fix in place is a page limit on the cache, which means that comments that fall outside the cache range become generally unavailable.
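The "sleep and collect" behaviour described above can be sketched roughly as follows. This is an in-memory Python illustration only: the real system uses a child process and a time stamp on the node, whereas the class RebuildScheduler, its method names, and the three-second delay are all assumptions made for the example.

```python
from typing import Dict, List


class RebuildScheduler:
    """Coalesces comment posts on a node into a single delayed cache rebuild."""

    def __init__(self, delay_seconds: float = 3.0) -> None:
        self.delay_seconds = delay_seconds
        # node id -> time at which the sleeping rebuild wakes up
        self.wake_times: Dict[int, float] = {}

    def comment_posted(self, node_id: int, now: float) -> bool:
        """Return True if a rebuild was registered, False if one is already sleeping."""
        if node_id in self.wake_times:
            return False
        self.wake_times[node_id] = now + self.delay_seconds
        return True

    def due_rebuilds(self, now: float) -> List[int]:
        """Return and clear the nodes whose sleep period has elapsed."""
        ready = sorted(n for n, wake in self.wake_times.items() if wake <= now)
        for node_id in ready:
            del self.wake_times[node_id]
        return ready


scheduler = RebuildScheduler(delay_seconds=3.0)
print(scheduler.comment_posted(7, now=100.0))  # True: first post registers a rebuild
print(scheduler.comment_posted(7, now=101.0))  # False: absorbed by the sleeping rebuild
print(scheduler.due_rebuilds(now=102.0))       # []
print(scheduler.due_rebuilds(now=103.0))       # [7]
```

The clock is passed in by the caller so the behaviour can be checked without actually sleeping.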

What is needed is a caching system that does the minimum necessary to provide these results:

  • Most recently active threads in the first page.
  • All threads available at all times.
  • Limits re-pagination to those pages affected by updates (bumping an old thread should only affect the first page and the page that the thread was bumped from).
  • Maintains or returns a reasonable number of comments for each cache request where the content is available.
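One way to picture the third requirement is the Python sketch below, which moves a bumped thread to the front and reports which cache pages need rebuilding. It assumes a fixed page size and an in-memory list of thread ids; paginate and bump_thread are hypothetical names, not part of the existing server. Note that with a strictly fixed page size the pages between the first page and the source page also shift by one thread, so the sketch marks pages 1 through the source page as dirty and leaves every later page untouched. Reaching the two-page ideal would probably need pages of variable size.

```python
from typing import List, Tuple


def paginate(order: List[int], page_size: int) -> List[List[int]]:
    """Split the thread order into consecutive pages of page_size threads."""
    return [order[i:i + page_size] for i in range(0, len(order), page_size)]


def bump_thread(order: List[int], thread_id: int,
                page_size: int) -> Tuple[List[int], List[int]]:
    """Move thread_id to the front; return (new_order, dirty_pages), pages from 1."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    if thread_id in order:
        old_index = order.index(thread_id)
        new_order = [thread_id] + order[:old_index] + order[old_index + 1:]
        # Pages after the one the thread came from keep the same contents.
        last_dirty = old_index // page_size + 1
    else:
        # A brand new thread pushes every existing thread down by one slot.
        new_order = [thread_id] + order
        last_dirty = (len(new_order) - 1) // page_size + 1
    return new_order, list(range(1, last_dirty + 1))


order = list(range(1, 11))           # ten threads, most recently active first
new_order, dirty = bump_thread(order, thread_id=5, page_size=3)
print(dirty)                         # [1, 2]
print(paginate(order, 3)[2:] == paginate(new_order, 3)[2:])  # True
```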

At the time of writing, the caching system maintains a 'cachemap' table that records where each thread sits in the cache, how many comments it contains, and the UCID of its last comment. At present this data is not used, and it may even be too verbose to provide a basis for developing performance caching.

Node Proximity

One of the core issues for the Little Island system is identifying pages in a useful manner. On the one hand users should have as much freedom as possible to comment on web content. On the other it can often be difficult to correctly identify the content they are commenting on. For instance, a user may comment on a YouTube video at the url:

http://www.youtube.com/watch?v=TZ860P4iTaM

But the same video may also be accessed using a longer query string:

http://www.youtube.com/watch?v=TZ860P4iTaM&feature=related

In these cases the matter of identifying a node in proximity might be as simple as finding Query A as a substring in Query B or the reverse. The current url parsing system eliminates fragments and query key=value pairs with null values, but it cannot judge whether or not changing the url changes the content of the page. There's also the possibility of server session IDs and other junk data being stored in the url, or mod_rewrite being used to allow query data to look like a path. The end result is an obvious potential for commenters on the same content to 'miss' each other by using slightly different urls. Additionally, the server will end up collecting a library of single-use and duplicate nodes, each linking a single comment that will never be read because the chances of someone else getting the same set of query values in their request are staggeringly small.
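The clean-up step mentioned above might look something like the following Python sketch. It is not the server's actual parser: normalize_url is a hypothetical name, the url parsing is done with Python's standard urllib.parse module, and two readings of the text are assumed, namely that the part of the url removed is the fragment after '#', and that a "null value" is a key with an empty value.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Drop the fragment and any query pairs whose value is empty."""
    parts = urlsplit(url)
    # keep_blank_values=True lets us see the empty pairs so we can filter them.
    pairs = [(key, value)
             for key, value in parse_qsl(parts.query, keep_blank_values=True)
             if value != ""]
    # An empty string in the fragment slot removes the '#...' part.
    return urlunsplit((parts.scheme, parts.netloc.lower(), parts.path,
                       urlencode(pairs), ""))


print(normalize_url("http://www.youtube.com/watch?v=TZ860P4iTaM&feature=#top"))
# http://www.youtube.com/watch?v=TZ860P4iTaM
```

As the text notes, this kind of clean-up cannot tell whether the remaining pairs actually change the page content, so the second example url in this section would still produce a separate node.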

On the plus side, the use of unique comment identifiers means that, when the need arises, nodes can be merged with a very simple database update and without the need to worry about collisions.
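A merge of this kind could be as small as a single UPDATE statement. The Python sketch below uses an in-memory SQLite database purely for illustration; the column names ucid and node_id, the function merge_nodes, and the sample rows are all hypothetical and do not reflect the real Little Island schema.

```python
import sqlite3


def merge_nodes(conn: sqlite3.Connection, source_node: int, target_node: int) -> int:
    """Re-point every comment on source_node at target_node; return rows moved."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "UPDATE comment SET node_id = ? WHERE node_id = ?",
            (target_node, source_node),
        )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
# Hypothetical minimal schema: one unique comment id and the node it belongs to.
conn.execute("CREATE TABLE comment (ucid TEXT PRIMARY KEY, node_id INTEGER NOT NULL)")
conn.executemany("INSERT INTO comment VALUES (?, ?)",
                 [("a1", 1), ("b2", 2), ("c3", 2)])
conn.commit()
print(merge_nodes(conn, source_node=2, target_node=1))  # 2
```

Because each comment keeps its own unique identifier, moving it to another node cannot collide with a comment already there.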

In many cases there will be no simple way to identify matching content with different urls. However, there should be some way to indicate to a commenter that there may be other nodes with other commenters nearby where they are more likely to find an audience.

Simply put, there are trillions of possible urls, each of them a potential node. Users are going to need tools to help them navigate to the "hot spot" nodes where commenting is taking place. The most important tool would be a client-accessed system that does two things:

  • Makes suggestions for alternate addresses based on the suitability of the current url.
  • Offers alternate nodes in proximity that may contain the same content but already have commenters.
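The second item could be approached along the lines of the Python sketch below, which lists nodes near the current url and ranks them by how many comments they already hold. It swaps the substring test mentioned earlier for a comparison of key=value pairs, so that the order of the pairs does not matter and 'v=1' does not match 'v=12'. The names in_proximity and nearby_nodes, the comment counts, and the third url in the example are invented for illustration.

```python
from typing import Dict, List, Tuple
from urllib.parse import parse_qsl, urlsplit


def in_proximity(url_a: str, url_b: str) -> bool:
    """True if both urls share host and path and one query is a subset of the other."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    if (a.netloc.lower(), a.path) != (b.netloc.lower(), b.path):
        return False
    pairs_a, pairs_b = set(parse_qsl(a.query)), set(parse_qsl(b.query))
    return pairs_a <= pairs_b or pairs_b <= pairs_a


def nearby_nodes(current_url: str,
                 comment_counts: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return other commented nodes in proximity, busiest first."""
    candidates = [(url, count) for url, count in comment_counts.items()
                  if url != current_url and count > 0
                  and in_proximity(current_url, url)]
    return sorted(candidates, key=lambda item: (-item[1], item[0]))


# Invented comment counts for three nodes.
counts = {
    "http://www.youtube.com/watch?v=TZ860P4iTaM": 12,
    "http://www.youtube.com/watch?v=TZ860P4iTaM&feature=related": 1,
    "http://www.youtube.com/watch?v=other": 40,
}
print(nearby_nodes("http://www.youtube.com/watch?v=TZ860P4iTaM&feature=related", counts))
# [('http://www.youtube.com/watch?v=TZ860P4iTaM', 12)]
```

A client could show the top entry as a hint that a busier node exists for what is probably the same content.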

Firefox Client Add-on

The original intention was to create the Little Island Client as an add-on extension for the Firefox browser. The primary reason for developing the client as a Greasemonkey user script instead was speed of development. But an Add-on client is still the primary goal.

Ideally, the client should integrate as much as possible with the target object (usually a web page), while at the same time providing simple, consistent and comfortable access for reading and posting comments. It also needs to be available for non-html objects such as individual images (something the current client cannot do).

The ability to cache pages of comments locally would both speed up user access and reduce load on the server.

The Greasemonkey user script for the client is almost 100K as of version 0.09.02 and is most likely pushing the limits of usability on some older machines. Hopefully, building the client as an extension will offer opportunities for providing a much more efficient and portable system.

This project welcomes alternative, independently developed clients and other systems that access the comment data in a constructive manner. If you are interested in developing a client then the place to start is the API Description.

License

The Little Island server and client are released under the GNU General Public License version 3.

Additionally, the client uses a modified version of Paul Johnston's JavaScript implementation of the RSA Data Security, Inc. MD5 Message Digest Algorithm, which is released under a BSD license. The source code for the original script is available via the project home page and a copy is made available here. Paul Johnston in no way endorses Project Little Island, nor is he personally affiliated with the project in any capacity.

At some point an independent implementation of the MD5 Algorithm may be developed to leave the project 100% GPL.