All posts by doffm

Android resources crib notes

I don’t want to complain about Android development too much. It has its quirks, but generally it’s satisfactory. One area, however, that was totally opaque and frustrating to me is Android resources. I never saw them explained very well in tutorials. The documentation, when read through thoroughly, is very good. It was probably just too much for me at first.

So for everyone’s enjoyment I present my Android resources crib notes. Writing these down helped me get to grips with that pesky res directory.

Android Resources

Android resources are used to separate out non-programmatic elements of your application such as images, user-facing strings, colours and application layouts. This helps when writing applications for multiple devices and languages by separating out the presentation of the application.

Resource types

The res directory of an Android project holds all the application resources. Android is very restrictive about where resources can be placed within this directory, even to the point of refusing to build if there are any resources placed in the root res directory.

Android groups certain types of resources together and forces you to place these types in specific directories. This directory layout is very important. A table of the directories and the general types of resources allowed is found at the Providing Resources page of the Android documentation.

A directory doesn’t necessarily hold one single type of resource. The drawable directory for example can contain bitmap files, JPEG files and XML files. However, the structure of the XML files must be such that they describe a type of object that can be drawn to the screen. Every resource in the drawable directory must be usable to instantiate a Drawable object.

The values directory

One directory that may confuse is the values directory. This is because it is a grab-bag of different types of resources including colours, scalar values, strings and ‘styles’. The only thing these types have in common is that they are described via XML files. Documentation on the different types and how they are described in XML is found at Style Resources, String Resources and More Resource Types.

Accessing Resources

Within the program, resources are accessed via the R class. From there, access follows the directory structure. R.drawable contains all the resources in the drawable directory, R.layout contains all resources in the layout directory. The name of each resource is generally the same as the filename without the file extension. The resource res/drawable/blue_circle.xml would be accessed as R.drawable.blue_circle.

The values directory does not follow the normal scheme. There is no R.values package; instead, the many different types of resource that can be placed in the values directory each have their own package. Some examples include R.dimen for dimensions, R.string for string values and R.color for colour values. The naming of the value resources also differs from the rest of the resources. Instead of being named after the filename, these resources are named directly in the XML. This allows many resources, even of different types, to be placed in one XML file, the name of which has no effect. There are however strong conventions about the file names: all strings are usually placed in res/values/strings.xml and all colours in res/values/colors.xml.

Suppose you had two string tags in your strings.xml with attributes name="app_name" and name="app_description". These two resources would then be accessed as R.string.app_name and R.string.app_description.
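As a sketch, a hypothetical res/values/strings.xml holding those two resources might look like the following. The string values themselves are made up for illustration; only the name attributes matter for access:

```xml
<?xml version="1.0" encoding="utf-8"?>
<resources>
    <!-- The resource names come from the name attributes,
         not from the filename. -->
    <string name="app_name">My First App</string>
    <string name="app_description">A demonstration application</string>
</resources>
```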

One thing to keep in mind when using R.string.app_name, or anything else inside the R class, is that these are only references to the real values. R.drawable.blue_circle doesn’t contain a Drawable object but a reference that can be used in different locations within the Android API. R.drawable.blue_circle for example could be passed to getDrawable to instantiate a Drawable object.

Accessing Resources from XML

Very often you will need to reference a resource not within code but within XML files. These XML files are generally other resources, but also include the AndroidManifest.xml file. The naming scheme for the XML reference follows the R class. A resource that is accessed as R.type.name in code is accessed as @type/name within XML.
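For example, a hypothetical layout file could reference the blue_circle and app_name resources mentioned earlier by name. The widget and the other attribute values here are illustrative only:

```xml
<TextView
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:text="@string/app_name"
    android:background="@drawable/blue_circle" />
```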

Resource Namespaces

Resources are placed in Java packages. The canonical way to access a resource is via the full package name. So assuming that the base package for your application is com.example.firstapp, resources could be accessed as com.example.firstapp.R.string.app_name, or @com.example.firstapp:string/app_name in XML. The package name is in this case superfluous. One instance where you will often need to use the package name is when you are accessing resources from the base Android platform. For this you would use android.R.type.name in code, or @android:type/name in XML. I have most commonly used this for accessing Android styles and themes such as @android:style/TextAppearance.Large.
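A sketch of both namespaced forms in use, in a hypothetical layout (the widget and its sizing attributes are assumptions for illustration):

```xml
<!-- A platform style referenced with the android namespace, and an
     application string referenced with its full package name. -->
<TextView
    android:layout_width="wrap_content"
    android:layout_height="wrap_content"
    android:textAppearance="@android:style/TextAppearance.Large"
    android:text="@com.example.firstapp:string/app_name" />
```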

Alternative Resources

Android provides a very restricted workflow for creating applications that work with different screen sizes, resolution densities and languages. This is based on resources, and is again dependent on the naming of directories. While the drawable, values and layout directories could be considered the base resources, you are allowed to create directories that contain resources specific to particular device configurations. The configuration is determined using ‘qualifiers’. These are specific strings, placed after the base directory name, and separated by dashes.

For example, to create drawables that will be used only on high dpi screens a directory called drawable-hdpi should be created and populated. For French translation strings a directory called values-fr should be used. For drawables that should only be used in landscape mode on low resolution displays, a directory called drawable-land-ldpi will be useful. Any number of qualifiers can be used after the base name, but the rules for how the best-matching resource is then chosen are somewhat complicated. See Providing Resources for details.
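Putting those examples together, a hypothetical res directory using qualifiers might be laid out like this:

```
res/
    drawable/            base drawables
    drawable-hdpi/       drawables for high-density screens
    drawable-land-ldpi/  drawables for landscape, low-density screens
    layout/              base layouts
    values/              base strings, colours and dimensions
    values-fr/           French translations of the strings
```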


Android resources are a varied bunch, from simple bitmap files to XML files with complicated schemas such as layouts. Generally the XML schemas for each type of resource are well documented, but there are lots of them. Just keep in mind that each resource directory generally holds a completely different type of object, and remember to look up the rules for the one that applies. The Providing Resources documentation is usually the best place to start.

The use of resources is as varied as the types. Methods taking resource references are to be found everywhere within the Android API, but a few of them (layouts, strings and drawables) are core. The use of these is included in every example Android application and tutorial.

Drag and Drop in Android 2.0

I’ve been working on Android projects for the past few weeks. For one of them I wanted some drag-and-drop functionality that is sadly missing from versions of Android prior to 3.0. I had some time on my hands so decided to create a library that mimics the Honeycomb drag-and-drop API.

As I couldn’t retrofit the drag-and-drop functionality into the core View class, I created a DragArea class to handle visualisation and event dispatch for the drag operation. All Views wishing to receive drag events, or start a drag, must be children of a DragArea. The DragArea class itself is based on a FrameLayout so is very easy to add into the widget hierarchy.

Starting a drag is very similar to the Honeycomb API. A View wishing to start a drag operation must have a reference to the DragArea and provide some clip data to transfer along with an object used to draw the drag visualisation.

Bundle data = new Bundle();
data.putCharSequence("cliptext", "Some clip data goes in this bundle");
dragArea.startDrag(data, new ViewDragShadowBuilder(this));

As methods for receiving drag events are not built in to the View API, views wishing to receive drag events must first register for them. The DragEvent class itself is very similar to Honeycomb.

dragArea.addDragListener(this, new OnDragListener() {
    public void onDrag(View view, DragEvent dragEvent) {
        switch (dragEvent.getAction()) {
        case DragEvent.ACTION_DRAG_STARTED: break;
        case DragEvent.ACTION_DRAG_ENTERED: break;
        case DragEvent.ACTION_DRAG_EXITED: break;
        case DragEvent.ACTION_DROP:
            Bundle data = dragEvent.getBundle();
            CharSequence dropText = data.getCharSequence("cliptext");
            break;
        case DragEvent.ACTION_DRAG_ENDED:
        default: break;
        }
    }
});
The code, as an Android library project, is available on github as well as the documentation. A full example application that makes use of the library is also available.

I know that this work becomes pointless when most Android phones are updated to Ice Cream, but until then it might be useful to someone who needs drag-and-drop on Android 2.0. The similarity to the official API might make it easier to port code in the future.

Mixed impressions of Google Appengine

I recently attempted to use Google Appengine to host my very simple personal website. This was something of an experiment to learn about cloud hosting. These are my initial thoughts on Google Appengine. They should be taken with a pinch of salt as I haven’t yet hosted a high bandwidth, or high availability, site or application.

Google Appengine is a software platform for hosting web based applications developed in either Java or Python.

The Good

  1. The SDK is super simple. It installed cleanly and easily, and contains a development web server, database and sandboxed Python environment. It’s genuinely a pleasure.

  2. Setting up is easy. Signing up for Google Appengine and creating a first app is straightforward. One doesn’t have to do anything dull like set up Apache, MySQL, memcached or Ubuntu.

  3. The Python runtime provides flexibility. Google Appengine uses a fairly standard Python 2.5 runtime with most of the batteries included. This made me feel fairly at home and means that you can use a number of Python web development libraries.

  4. The Appengine is moderately Django friendly. Version 0.96 is available by default but more recent versions can be used instead. The documentation is explicit about how to do this. There is an official Google project for making it easy to port Django applications and an alternative based on Django non-rel.

  5. The Appengine specific APIs are well documented and thought out. There are a few proprietary APIs for task management, datastore access and mailing. The ones I tried worked very well.

  6. The Admin console is awesome. It is easy to search and add data to the datastore. There are good views of site traffic, and error logs. It’s just generally useful. All without any setup.

The Bad

  1. Appengine isn’t Django. It’s not even close. The important part is that Google Appengine doesn’t provide any SQL database. This makes Django models incompatible with Appengine. Any existing Django application will require a decent amount of ported code to work properly. Djangoappengine from allbuttonspressed is supposed to get around this by providing a Django Model implementation on top of Google Bigtable. For me it wasn’t without its flaws and probably isn’t production ready.

  2. Domain setup is ugly. To set up domain hosting for Appengine you need to use a Google Apps account. This is confusing enough, but while the Appengine Admin site is a joy, the Apps console is clunky, confusing and poor. It sullies an otherwise great experience. Besides, these are two fairly separate products; one is called Google Appengine and the other Google Apps. What went wrong in the naming department?

  3. Google Appengine has no support for naked URLs, i.e. domains without a subdomain prefix such as ‘www’. This is related to integration with Google Apps. Apps are deployed as an ‘App’ on your Google Apps domain. This might be a dealbreaker for some. For me it was just a great annoyance.

The Horrible

  1. Google Appengine does not provide a SQL server and its Bigtable datastore is a proprietary Google interface. This means that once your site is written and deployed you may be locked in to Google’s servers for a very long time.


I have mixed feelings about Google Appengine. On the one hand it has served to convince me that cloud hosting is probably the best bet for future application deployment. On the other hand I don’t believe I would choose Google to do so.

It probably isn’t suited for people who want to base their business around the success of a large web application. More flexible hosting such as that provided by Rackspace and Amazon is a better bet. Its lack of an SQL server also makes it fairly useless for existing Django or Java application deployment.

It’s closely linked with Google Apps, probably the Apps Marketplace, and has APIs for using Google Accounts for authentication. If you want to hitch your business wagon eternally to Google it might be a good idea.

Perhaps the best use of Google Appengine would be for development of new business-centric, or intranet, applications. There is some sort of by-user pricing available that might be great. The ease of setup and development would be a help and the proprietary nature of the thing might not be such a party killer.

Waiting for my FacePhone

This week Techcrunch caused a stir by reporting that Facebook are building a phone. This has been flatly denied by Facebook, so we’ll assume for the moment that the reporting was erroneous and no such phone is on the way. I’m desperately disappointed. To repeat a joke first made by Nat Friedman: I think I’m going to mail $1000 to Palo Alto with a note, and sit waiting for my FacePhone. Here is why Facebook should be building a phone.

A phone is social

Nokia has apparently used its “Connecting People” advertising slogan since 1992, early recognition that phones are essentially social tools. There has been a huge amount bolted on to phones since that time. They have become ‘convergent’ devices, an amalgam of your laptop, camera, walkman, GPS and Gameboy. There are some very obvious synergies between these functional modes and the core social networking purpose of a phone, but I see them as essentially separate. The camera is used in a pinch, the internet browser for info on-the-run, and gaming for entertainment on-the-bog. Phones are all about connecting people and they always will be. Unfortunately this is the aspect I believe has seen the smallest real innovation so far in the smartphone era.

Despite the rise and rise of Facebook, the 160 char text message is still the most widely used social application, with 4.1 billion text messages sent daily. I don’t wish to belittle Facebook’s achievements here; 1 billion daily messages is truly a phenomenon.

Integration is poor

The feature most likely to impress people when shown my Nexus One is that their Facebook profile photo shows up in the contact list. In fact, the Facebook integration is probably one of the best features of Android. However the integration doesn’t run very deep, and in many ways is clearly broken. Merging of contacts is somewhat hit and miss; I’ve yet to work out when a contact gets merged and when it is duplicated. Facebook chat doesn’t work, or at least not in the default settings. While I’m able to keep in touch with a few of my friends through Google chat, this isn’t really where my canonical contact list lives. Part of the contact list problem is that so few people leave their phone numbers with Facebook, forcing me to keep two lists. A Facebook phone would surely go some way to alleviate this issue by encouraging users to tie their Facebook and phone accounts together.

All of the above issues could, I’m sure, be solved through a better, more integrated Facebook app experience, but they will constantly be fighting against the vested interests closer to the consumer. Google especially has good reason to keep the Facebook experience poor and push its GMail / Google Talk experience as default on Android. The carriers would prefer consumers to continue using text messages. Although Apple doesn’t have any serious competitive areas with Facebook, I doubt whether the real ownership of user experience required for deep integration is something they would countenance.

There is a golden user experience waiting to be built here. One where ‘Blocking’ someone on Facebook also causes their calls to be screened. Profile pictures of long-lost friends will be shown with every phone call. Making friends could be as simple as bumping phones together, giving you access to numbers, email and Facebook feeds. None of the existing phone OSes have gone far enough, and the number of players involved makes me feel it could be a long slog.

Location means revenue

I have no idea how Facebook are doing in building their revenue stream, but ownership of the mobile experience is surely one way to give it a massive boost. Facebook are the kings of social, and their recent addition of ‘Places’ shows that they are serious about competing with Foursquare and Yelp when it comes to location. Currently the Foursquare experience is as integrated into smartphones as Places, probably more so. A better integrated experience means more check-ins.

Google might have created a huge advantage here. If they can find a way to leverage Android to get more information about user location they will have the double whammy over Facebook in terms of user data. Google already have more ‘purchasing intent’ when people search on their web-site. With knowledge of where the user is they will likely have an insurmountable lead in delivering targeted advertising to consumers. Facebook is desperate to get more knowledge about their users; surely location is the next big driver for that success?

Consumers will win

For the consumer a FacePhone is win-win; even a commercial failure would force other innovators such as Google and Apple to move at a faster pace. For Facebook the risks are great, but so are the rewards. Please Facebook, step up, and show us what real innovation means in mobile + social.

A Language for describing D-Bus interfaces

A long time ago, while working on D-Bus accessibility, I became frustrated with the existing methods for specifying the interfaces. D-Bus interfaces are normally specified using an XML format. Although this format describes the D-Bus protocol perfectly well, it is lacking lots of information that would be useful for D-Bus bindings and for documentation. To deal with this the D-Bus wizards decided that the XML could be littered with ‘annotations’ to extend the format, and a number of standard annotations sprang up around specific D-Bus bindings: EggDBus, Telepathy and QtDBus.

These annotation formats are difficult to read and edit. XML is moderately acceptable, but some restrictions of D-Bus XML make keeping the document consistent very hard. The fact that they are limited to specific D-Bus libraries is also not ideal. I started work on a language for describing D-Bus interfaces that would address these issues. My idea was to have a readable syntax, enough features to clearly document a D-Bus interface, and tools to generate XML or code for the different D-Bus libraries.

After a long hiatus dbuf is finally in a state where the language parser is complete, and generation of D-Bus XML is supported. There is a decent tutorial located in the doc folder of the source, but a taste of what the language is like follows. The code is part of a real example; the AT-SPI D-Bus interface translated into dbuf.

using Attributes = org.freestandards.atspi.Attributes;
using Reference = org.freestandards.atspi.Reference;

/*
  The base interface which is implemented by all
  accessible objects.
*/
interface org.freestandards.atspi.Accessible {

  enum Role {
    ROLE_INVALID = 0
  }

  /*
    Represents a bit-field of currently held states.
    TODO Could just be a uint64?
  */
  typedef uint32[] State;

  /* A short string representing the object's name. */
  read property string Name;

  /* The accessible object which is this object's containing parent. */
  read property Reference Parent;

  /* Access this object's non-hierarchical relationships to other accessible objects. */
  method GetRelationSet reply { RelationSet relations; }

  /* Get the Role indicating the type of UI role played by this object. */
  method GetRole reply { Role role; }

  /* Access the states currently held by this object. */
  method GetState reply { State state; }

  /*
    Get the properties applied to this object as a whole, as a set of
    name-value pairs. As such these attributes may be considered
    weakly-typed properties or annotations, as distinct from the
    strongly-typed interface instance data.
  */
  method GetAttributes reply { Attributes attributes; }
}

There is lots more work to do with dbuf, but even in its current state I think it is a useful tool for describing complex D-Bus interfaces.

A useful language with Antlr and Python

In a previous post I showed a small example of how to create a grammar and a parser using Antlr and Python. I made the point that simply parsing a file, and checking it conforms to the language grammar, isn’t usually all we want to do. As well as creating parsers for a language, Antlr also makes it easy to create more useful programs through actions.

To follow these examples you should take a look at my previous post where I used a very simple CSS like language grammar, which is shown below.

grammar simplecss;

options {
    language = Python;
}

/* Parser rules */

cssdocument : cssrule+ ;

cssrule : IDENTIFIER '{' declarationlist '}' ;

declarationlist : declaration+ ;

declaration : IDENTIFIER ':' IDENTIFIER ';' ;

/* Lexer rules */

fragment LETTER : 'A'..'Z'
                | 'a'..'z'
                | '_'
                ;

fragment DIGIT : '0'..'9' ;

IDENTIFIER : LETTER (LETTER | DIGIT)* ;

COMMENT : '/*' .* '*/' {$channel = HIDDEN} ;

WHITESPACE : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+
             {$channel = HIDDEN} ;

CSS is built up of selectors and a list of properties for selectors. It seems that a useful data-type in Python would be a dictionary of dictionaries, the outer set of keys being the selectors, and the inner set being the property names. The css banana {color:yellow; length:10cm; shape:curved;} would be translated to the Python data type: {'banana': {'color': 'yellow', 'length': '10cm', 'shape': 'curved'}}.
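To make the target structure concrete, here is a minimal hand-rolled sketch in plain Python, entirely independent of Antlr. The to_dict helper is a throwaway assumption of mine, written only to show the shape of the data that the Antlr actions will be made to build:

```python
# The CSS fragment: banana {color:yellow; length:10cm; shape:curved;}
# should map to a dictionary of dictionaries, keyed first by selector,
# then by property name.
css_source = "banana {color:yellow; length:10cm; shape:curved;}"

def to_dict(css):
    # Split a single rule into its selector and its declaration body.
    selector, _, body = css.partition('{')
    props = {}
    for decl in body.rstrip('}').split(';'):
        if decl.strip():
            name, _, value = decl.partition(':')
            props[name.strip()] = value.strip()
    return {selector.strip(): props}

print(to_dict(css_source))
# → {'banana': {'color': 'yellow', 'length': '10cm', 'shape': 'curved'}}
```

This naive splitter only handles a single rule with no comments; the point of the Antlr grammar is to produce the same dictionary robustly.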

A simple way of doing this would be to add actions to the Antlr grammar. An action appears as a block of code in curly-braces. The position of the action roughly equates to where in the parsing algorithm it will occur. For example we could print out the name of each css selector as we parse the document.

cssrule : i=IDENTIFIER {print $i.text}
          '{' declarationlist '}' ;

Notice that the $i symbol refers to a variable outside the code block. This is substituted for the real IDENTIFIER variable when Antlr generates the parser. In this sense the code block can be thought of as a template for the real code generated by Antlr. Detailed information about what symbols are allowed in actions can be found in the Antlr documentation. Details about what attributes can be used on the parser and lexer rules are found separately in the Attributes section. The code block will be executed after parsing the identifier, but before the declarationlist.

At this point we have to give a little thought to the code generated by Antlr from our grammar description. Each ‘rule’ within the grammar file is translated into a function on the parser class that takes the stream of symbols as the input. For example, when parsing a whole document, the top-level rule function is used: res = parser.cssdocument (). The cssdocument function corresponds to the grammar rule of the same name. These rule functions have return values. It is possible to add members to the return type within the grammar file.

cssdocument returns [selectordict] : cssrule+ ;

To access this return value within an action, the attribute selectordict is used as follows.

cssdocument returns [selectordict] :
    {$cssdocument.selectordict = {}} cssrule+ ;

The problem with this is that the selectordict variable is only available within the cssdocument rule. To make it available to sub-rules, where it is needed, it would have to be passed as an argument. This is cumbersome. The solution is Antlr scopes. A scope is a stack-type variable that is available to all sub-rules. The stack type is useful for language features such as namespaces and variable scopes, hence the name. Scopes are declared using the scope keyword.

cssdocument
scope {
    selectorscope;
} : cssrule+ ;

Access for scopes is different to that of rule attributes and uses a syntax like C++ namespaces: $cssdocument::selectorscope.

Using actions, return-values and scopes it is now possible to create a parser that generates a Python dictionary from a source file.

grammar simplecssactions;

options {
    language = Python;
}

/* Parser rules */

cssdocument returns [ruledict]
scope {
    ruledictscope;
    rulename;
}
@init {
    $cssdocument::ruledictscope = {}
    $cssdocument::rulename = ''
} : cssrule+ ;

cssrule : IDENTIFIER
{
    $cssdocument::rulename = $IDENTIFIER.text
    $cssdocument::ruledictscope[$IDENTIFIER.text] = {}
}
'{' declarationlist '}' ;

declarationlist : declaration+ ;

declaration : p=IDENTIFIER ':' v=IDENTIFIER ';'
{$cssdocument::ruledictscope[$cssdocument::rulename][$p.text] = $v.text}
;

/* Lexer rules */

fragment LETTER : 'A'..'Z'
                | 'a'..'z'
                | '_'
                ;

fragment DIGIT : '0'..'9' ;

IDENTIFIER : LETTER (LETTER | DIGIT)* ;

COMMENT : '/*' .* '*/' {$channel = HIDDEN} ;

WHITESPACE : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+
             {$channel = HIDDEN} ;

The new syntax in this example is the @init block of the cssdocument rule. This is called before anything else in the rule and is generally used for initialising local variables.

Using instructions given in my first blog post, it is now possible to compile this grammar into a parser using Antlr. Once the parser has been generated, a compiler can be written that outputs a Python dictionary.

import sys
import antlr3

from simplecssactionsLexer import simplecssactionsLexer
from simplecssactionsParser import simplecssactionsParser

def main (argv):
    filename = argv[1]
    input = antlr3.FileStream (filename)
    lexer = simplecssactionsLexer (input)
    tokens = antlr3.CommonTokenStream (lexer)
    parser = simplecssactionsParser (tokens)
    res = parser.cssdocument ()
    print res

if __name__ == "__main__":
    sys.exit (main (sys.argv))

As well as actions, Antlr also has the ability to generate ASTs (Abstract Syntax Trees) from a conforming document. This is a tree data type that somewhat matches the syntax of the language. It is very useful when making multiple passes over the document, as it saves parsing it twice. Antlr has special grammar file syntax for easily generating the desired AST from any language.


I have had fun with Antlr. Despite only having rudimentary knowledge of parsing and compiling I created a decent compiler for a language of my own design. Perhaps the most impressive part of Antlr’s parser generation is the error reporting. Error reporting is extremely important not only for eventual users of any compiler but also during development. Antlr allowed me to create a professional feel to the program when I would not otherwise have had the time.

There are some issues. Antlr is big, and tries to remain consistent between its different modes with only some success. The different data types of Token, Rule, Tree and Template often caused me confusion. They are all handled in similar ways in the grammar files, but very differently in the eventual code. The code generation is also sometimes difficult. There does not seem to be a well worked template system for this, possibly because the code used isn’t restricted to any one particular language. The best example is the common use of the % symbol in Python, which is also used in Antlr and causes issues.

If you are thinking about playing about with your own language grammar you should probably try Antlr first. If you are lucky enough to be working in a language with a well worked parser combinator library then that may be more suitable. I doubt many are as complete or actively maintained as Antlr.

Recognising a language with Antlr and Python

A while ago I started writing an IDL for D-Bus, which was envisioned as a fairly complex language. A large number of libraries and programs were available to help but I settled on Antlr. I had tried flex / bison before and never had a good experience. Besides ruling out the ‘C’ version, this also discouraged me from using language implementations such as ply.

Antlr is a parser generator, similar to flex / bison, and written in Java. It has the advantage of having many ‘target’ languages. This means it is possible to create parsers in many languages including Python, C, and Java. As a parser generator Antlr has its own language for describing languages. This is supposed to make the grammar of the language clear, and act as a specification for the language as well as generating an implementation of a parser for it.

In Antlr the lexer and parser are described in the same file. The difference between a lexer and parser is technical, but Antlr makes this distinction, parsing a language in two separate stages. The lexer is simpler and serves to split a sequence of characters into a sequence of words or tokens. The parser can be more complicated and is used to ‘recognise’ a sequence of symbols that conform to the grammar of the language.

Descriptions of languages are made up of a sequence of ‘rules’; each of these rules describes what is permissible within the language. Lexer rules start with an upper-case character, and parser rules start with a lower-case character. Below is a grammar file that parses a very simple CSS-like language.

grammar simplecss;

options {
    language = Python;
}

/* Parser rules */

cssdocument : cssrule+ ;

cssrule : IDENTIFIER '{' declarationlist '}' ;

declarationlist : declaration+ ;

declaration : IDENTIFIER ':' IDENTIFIER ';' ;

/* Lexer rules */

fragment LETTER : 'A'..'Z'
                | 'a'..'z'
                | '_'
                ;

fragment DIGIT : '0'..'9' ;

IDENTIFIER : LETTER (LETTER | DIGIT)* ;

COMMENT : '/*' .* '*/' {$channel = HIDDEN} ;

WHITESPACE : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+
             {$channel = HIDDEN} ;

Starting with the big picture, the first rule in this file is cssdocument. This is the top-level rule and matches a sequence of one or more css rules. Antlr syntax is somewhat consistent with regexp and feels moderately natural. The symbol indicating one-or-more is +.

cssrule : IDENTIFIER '{' declarationlist '}' ;

The css rule itself is an IDENTIFIER token followed by a declarationlist inside braces. The '{' is a literal token within a parser rule.

In the parser rules it is acceptable to use sub-rules, such as when declaring a declarationlist as one or more of the rule declaration. This is not true for lexer rules. The lexer is used to process a stream of characters into a stream of tokens. Each lexer rule becomes a token. To use a lexer rule within others you must declare it as a fragment. This means that it will not become a token and is used just for composing other lexer rules.

fragment DIGIT : '0'..'9' ;
IDENTIFIER : LETTER (LETTER | DIGIT)* ;

From the above rules it can be seen that an IDENTIFIER token is very simple, just a word composed of letters and digits. As the IDENTIFIER token is used in place of the selector, this language only allows very basic selectors.

COMMENT : '/*' .* '*/' {$channel = HIDDEN} ;
WHITESPACE : ( '\t' | ' ' | '\r' | '\n' | '\u000C' )+
             {$channel = HIDDEN} ;

The above rules are taken from a page describing Antlr grammars. They serve to remove comments and extraneous whitespace from the token stream. This is very useful as comments and whitespace can appear almost anywhere within the language. If these were processed into tokens the parser rules would become excessively complicated.

The documentation for Antlr grammars is excellent. As well as more involved tutorials available on the Antlr website, there is also a book available that does a decent job in this area. The list of operators used to compose parser and lexer rules is found on the Antlr cheat sheet.

Generating Python code

Once the grammar of the language has been declared, Antlr is used to generate programs that recognise this language. To indicate that a program should be created in the Python language, language = Python; is placed in the grammar options. The default language is Java.

Antlr version 3 is available in the Ubuntu repositories in the antlr3 package. I did not use this as it is a substantially older version. The latest can be downloaded from the project website. Some Python dependencies will also be required: the Antlr runtime module and the Stringtemplate module. The Antlr runtime library needs to be exactly the same version as the Antlr tool. This may mean that it is not possible to use the very latest version of Antlr with Python. I used version 3.1.2 of the Antlr tool and the Python runtime library, and version 3.2.1 of the Stringtemplate module.

Once you have downloaded the Antlr tool as a jar you can run it directly.

> java -classpath antlr-3.1.2.jar org.antlr.Tool simplecss.g

This will generate two Python files, simplecssLexer.py and simplecssParser.py. Make sure that the Python dependencies are in the PYTHONPATH. It is now possible to write a script that checks whether a file conforms to the language grammar.

import sys

import antlr3

from simplecssLexer import simplecssLexer
from simplecssParser import simplecssParser

def main (argv):
    filename = argv[1]
    input = antlr3.FileStream (filename)
    lexer = simplecssLexer (input)
    tokens = antlr3.CommonTokenStream (lexer)
    parser = simplecssParser (tokens)
    res = parser.cssdocument ()

if __name__ == "__main__":
    sys.exit (main (sys.argv))

This tool is not very useful: all it can do is find out if a document is correct, and report errors if it is not. Antlr also has support for attaching actions to parser rules and translating a sequence of tokens into an abstract syntax tree, a data structure that is easy to manipulate within Python. In a future post I will show how to use actions and abstract syntax trees to create something a little more exciting.
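As a taste of why such a tree is attractive, here is a hypothetical shape it might reduce to: just nested Python data, which is trivial to query. This is my own sketch of a possible structure, not anything Antlr produces by itself.

```python
# A possible AST for 'body { color : red ; }' built from plain
# Python structures: (selector, declarations) pairs, with the
# declarations held in a dict.  Parser actions would build this.
ast = [('body', {'color': 'red'})]

# Such a structure is easy to manipulate, e.g. collecting every
# selector in the document:
selectors = [selector for selector, declarations in ast]
print(selectors)  # ['body']
```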

Swanky Codethink offices

More than a year ago now Codethink moved to a new location in central Manchester. It’s become a comfy home, but I don’t think we have ever introduced people. So without further ado, here are some pictures of the ‘new’ Codethink offices.

I’ll hand out a prize of 20p and a Curly Wurly to the first person to guess what Rob thinks is the most embarrassing item on our terribly disorganized bookshelf. I won’t lie to you either: the picture of a grey Manchester through a rainy window is a common occurrence. Still, the hacking is awesome; who needs Mediterranean weather patterns?

Tracker RDF database performance

I have spent time on-and-off the past week looking at the performance of the Tracker RDF database. Tracker, I believe, started out as a desktop search tool for Gnome. I never used it in this incarnation, and it has only come to my attention since version 0.7, when the developers implemented a general purpose RDF storage engine at its core. I wanted to know how this newly implemented RDF database compared to a widely used RDF database in terms of query performance. In essence I was interested in whether the Tracker project had spawned something that could compete with Virtuoso and 4Store.

You can find my complete results and analysis at the tracker mailing list archive, but the headline statement is that Tracker had roughly 9 times the query performance of Virtuoso. The graph here shows the breakdown by query.

This is a drastic difference in performance that greatly favours the home-grown database utilized by Tracker. However, this stellar performance comes at the cost of flexibility. It’s obvious that the database has been tailored very much to the needs of Tracker itself. Unlike Virtuoso it is not ‘schema-free’: a description of the data (in the form of something called an RDF ontology) is required for storage. In addition to this, the data formats are more restrictive, and some common elements of RDF are missing.

My general impression was that Tracker has great query performance, especially considering its tiny memory footprint. Unfortunately it is not suited to storage of pre-existing RDF data sets, such as those generated for semantic-web applications. This could well change in the future. Tracker, and its RDF database, are in heavy development. They already have speed, and seemingly stability, in the code-base. It might soon be time to add the new features that make it more generally applicable.

I should add that when I started this work I was heavily sceptical. Codethink have been highly involved in RDF, but I have not joined in. I have learned a lot in the past few weeks, and this has made me more positive. I still believe that RDF might be too flexible for its own good, and I’ve found that the ontologies are onerous, complicated, and not very well specified. I did however come across a great post which explains some of the advantages of RDF over other data models, and SPARQL is far more intuitive than its SQL cousin. If used to its potential, with highly interlinked data, I think it may be possible for the benefits of RDF to outweigh the tough learning curve.

Funding Gnome a11y

As many of you may have heard, from blogs by Eitan, Mike and Joanie, as well as an e-mail to the gnome-foundation-lists by Fernando, the Gnome a11y community is having a tough time.

I have been interacting with the a11y community for over two years now, and in that time the funding situation has never looked good. I do not wish to insult or demean companies that are no longer involved in funding Gnome a11y. Companies and individuals have their own priorities that they must follow. Work they have done in the past on Gnome is very much appreciated by me, even if they cannot continue that work into the future.

That said, I believe that in the past two and a half years Gnome a11y has lost a huge amount of funding. First from IBM, which, to many people’s dismay, pulled out of a11y funding before I started work on AT-SPI. I was glad to hear that Mozilla is providing $10,000 to the Gnome foundation for a11y work. I’m extremely grateful for that, but I do not believe that Mozilla are providing the level of funding that they have done in the past. Our work on AT-SPI D-Bus has been funded jointly by Codethink, Sun, and another un-named benefactor. None of this funding is likely to continue past the end of February. All of this would seem slight were it not for the news that Oracle have let go of important Gnome a11y community members working for the Sun Accessibility Project Office. Sun have been the major contributor to Gnome a11y, and this is a worrying signal that Oracle do not intend to continue the current level of contribution.

Assuming that Oracle do not wish to involve themselves in Gnome a11y, my back-of-the-envelope calculations indicate that we may have lost greater than $200,000 in annual funding over the last three years.

Although huge amounts of Gnome development take place unfunded, by hackers, volunteers, users and hobbyists, you would probably be surprised how much is done by folks working a 9-5. I don’t expect the figures to be the same, but as an example, 75% of kernel developers are paid by corporations for their work. The loss of the Sun Accessibility Project Office and other sources of funding will be felt very heavily by the Gnome a11y community.

Accessibility is incredibly important to the Gnome project, and not only to its users. Gnome has a fantastic, credible, accessibility story. This, to me, marks Gnome out as a class ‘A’ open-source project. Were we to lose this, it would be a turning point. In my eyes Gnome would then be a project in decline.

What can we do?

Firstly we need to go on a cohesive search for funding. The Linux Foundation has an accessibility group that I have been involved in for a long time. This seems to me the best place to combine our efforts in the great funding drive. Funding channeled through the Linux Foundation would not be Gnome specific, but cross desktop a11y technology is what we have long been striving for.

Ideally enough funding would be found to hire someone to work full time on Linux Desktop accessibility.

Outside of the search for cash, all Gnome developers need to spend more time on accessibility. It’s not always easy to make one’s application accessible, and I’m sure it can seem daunting. There are still a11y community members ready to help out though. All is not lost. :) I’m damn near certain that we are going to pull together. Gnome 3.0 will have the same great accessibility that has made me proud of past Gnome releases.