International Open Standards Hackathon

An international open standard for legislative documents will be an important next step in making legislative information available to the people. An open standard will promote the creation of tools and services worldwide that will enable citizen participation in the legislative process and make the laws that governments produce more transparent.

Today there is a great deal of inconsistency in how open participation and transparency are achieved around the world. Putting cultural and political differences aside, part of the reason for the inconsistency is the tremendous effort and cost involved in building the infrastructure to support these new objectives. An open standard will start to solve this problem by promoting the establishment of a real legal informatics industry of interoperable tools and services that can easily be put together to address whatever commitment a government has made to open and transparent government.

Akoma Ntoso is an emerging standard that promises to do just this. It is an XML schema developed at the University of Bologna in Italy for Africa i-Parliaments, a project sponsored by the United Nations Department of Economic and Social Affairs. In the coming weeks, an OASIS technical committee will begin the process of turning it into an international standard. I am a participant on that TC.

To further promote and publicize Akoma Ntoso, I am working with Ari Hershowitz (@arihersh) to stage an international hackathon within the next few months. The idea is to provide an event that will demystify XML and Akoma Ntoso (yes, it is hard to say) by giving anyone a really easy way to create a document using the proposed standard. Our goal will be to collect a world’s worth of legislative samples. This could be an important step towards building a library that stitches together all the world’s laws and regulations in an open and transparent way.
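
For anyone curious what such a document looks like, here is a minimal sketch of an Akoma Ntoso act. The identifiers and text are purely illustrative, the required metadata is elided, and the namespace shown is from the current 2.0 draft schema, which may change as the standard evolves:

    <akomaNtoso xmlns="http://www.akomantoso.org/2.0">
      <act>
        <meta>
          <!-- FRBR identification metadata elided for brevity -->
        </meta>
        <body>
          <section id="sec1">
            <num>1.</num>
            <heading>Short title</heading>
            <content>
              <p>This Act may be cited as the Example Act.</p>
            </content>
          </section>
        </body>
      </act>
    </akomaNtoso>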

We’re currently seeking sponsors, participants, and venues for this hackathon. The interest we have found so far has been quite amazing. If you are interested in helping us make this event a success, please let either Ari or me know.

International Meeting on Transparency and the use of Open Document Standards

This past week was a very interesting week for me. I attended the meeting “Achieving Greater Transparency in Legislatures through the Use of Open Document Standards” at the U.S. House of Representatives in Washington DC. The meeting was sponsored by the United Nations, the Inter-Parliamentary Union, and the U.S. House of Representatives.

Meeting Participants

For me it was a valuable opportunity to meet with colleagues I had already met in my travels in recent years, to meet with people I knew of but had never met, and to meet people I had corresponded with only through email. Particularly special for me was to finally meet Tim Arnold-Moore. It was by reading his thesis on “Information Systems for Legislation” back in 2001 that I became aware of the field of legal informatics.

It was quite fascinating to see the different ways in which countries were approaching transparency. As one would expect, the American approach is a bit heavy-handed, with a focus on providing access to existing documentation. It seems that the real innovation is coming from smaller or younger countries that are less encumbered by the top-down bureaucracy that tends to squash more cost-effective innovation.

Two systems caught my attention in particular:

  • The first was the system put in place by the Brazilian Chamber of Deputies. More than merely providing citizens with visibility into the workings of their government, this system focused on ensuring two-way interaction between citizens and their elected representatives – even going so far as to allow citizens to express their viewpoints by way of YouTube-style video clips.
  • The other system to get my attention, of course, was Bungeni. This is an open-source legislative information system developed in Nairobi, Kenya for use in the parliaments of Africa (and elsewhere in the future). It is based on the OpenOffice word processor, but does the most credible job of turning a word processor into an XML editor that I have seen so far. Of course, it works with Akoma Ntoso, which was developed alongside it.

Speaking of Akoma Ntoso, it came up plenty of times during the conference. Monica Palmirani and Fabio Vitali from the University of Bologna in Italy presented various aspects of the XML Schema. On Tuesday the OASIS LegalDocumentML TC was opened to drive it towards being an international standard. You can read about it here.

On the Thursday after the meeting wrapped up, Knowledge as Power held a class on Legislative XML at the National Democratic Institute. I presented the work I had done on Legix.info including the transform of the U.S. Code into Akoma Ntoso.

All in all, the four days I spent in Washington D.C. were very worthwhile. Hopefully the outcome of that meeting will be better cooperation in the field of legal informatics. Clearly, after a dozen or so years of experience with XML, the time has come to move beyond the tentative first steps that have defined this field and towards open standards and the benefits that come when everyone works towards common goals.

And now for something completely different… Chinese!

Last week we saw how Akoma Ntoso can be applied to a very large consolidated code – the United States Code. This week we take the challenge in a different direction – applying Akoma Ntoso to a bilingual implementation involving a totally different writing system. Our test document this week is the Hong Kong Basic Law. This document serves as the constitutional document of the Hong Kong Special Administrative Region of the People’s Republic of China. It was adopted on April 4, 1990 and went into effect on July 1, 1997, when the United Kingdom handed the region over to the People’s Republic of China.

The Hong Kong Basic Law is available in English, Traditional Chinese, and Simplified Chinese. For our exercise, we are demonstrating the document in English and in Traditional Chinese. (Thank you to Patrick for doing the conversion for me.) Fortunately, using modern technologies, supporting Chinese characters alongside Latin characters is quite straightforward. Unicode incorporates the Hong Kong Supplementary Character Set to handle characters unique to Hong Kong. The biggest challenge is ensuring that the Unicode declarations throughout the various XML and HTML files that the information must flow through are all set correctly. Between the accented names we find in California and the rigorous nature of California’s publishing rules, getting Unicode right is something we have grown accustomed to.
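
For the curious, the declarations in question are nothing exotic. A minimal sketch (the key is simply that the same encoding is declared consistently at every step in the pipeline):

    <?xml version="1.0" encoding="UTF-8"?>
    <!-- in each XML file -->

    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    <!-- in the head of each generated HTML page -->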

While I hadn’t expected there to be any problems with Unicode, I was pleasantly surprised to find that the fonts used in Legix simply worked with the Traditional Chinese characters as well, without issue. (Well, at least as far as I can tell without the ability to actually read Chinese.)

The only issue we encountered was Internet Explorer’s support for CSS3. Apparently, IE still does not recognize “list-style-type” with a value of “cjk-ideographic”. So instead of getting Traditional Chinese numerals, we get Arabic numerals. The other browsers handled this much better.
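
For reference, the relevant CSS looks something like this. The selector is a hypothetical example; only the property and value are the point:

    /* Number list items with Traditional Chinese numerals (一, 二, 三, ...).
       Browsers that do not recognize the value, such as IE, fall back to
       their default decimal (Arabic) numbering. */
    ol.articles {
      list-style-type: cjk-ideographic;
    }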

So what other considerations were there? A big consideration was the referencing mechanism. To me, modeling how you refer to something in an information model can be more important than the information model itself. The referencing mechanism defines how the information is organized and allows you to address a specific piece of information in a very precise and accurate way. Done right, any piece of information can be accessed very quickly and easily. Done wrong and you get chaos.

Our referencing mechanism relies on the Functional Requirements for Bibliographic Records (FRBR). This mechanism is used by both SLIM and Akoma Ntoso. Another interesting FRBR proposal for legislation can be found here.

FRBR defines an information model based on a hierarchical scheme of Work-Expression-Manifestation-Item. Think of the work as the overall document being addressed, the expression as the version desired, the manifestation as the format you want the information presented in, and finally the item as a means for addressing a specific instance of the information. Typically we’re only concerned with Work-Expression-Manifestation.

For a bilingual or multilingual system, the “expression” part of the reference is used to specify which language you wish the document to be returned in. If you check out the references at Legix.info, you will see the two references to the Hong Kong Basic Law.

The expressions are called out as “doc;en-uk” for the English version and “doc;zh-yue” for the Chinese version. Relatively straightforward. The manifestations are not shown, so the result is the default manifestation, which is HTML.
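
Pieced together, a reference decomposes along FRBR lines something like the sketch below. The paths are illustrative rather than the exact Legix.info URLs:

    .../hk/basicLaw                     the work: the Hong Kong Basic Law
    .../hk/basicLaw/doc;en-uk           an expression: the English version
    .../hk/basicLaw/doc;zh-yue          an expression: the Traditional Chinese version
    .../hk/basicLaw/doc;en-uk/html      a manifestation: explicitly requesting HTML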

Check the samples out and let me know what you think.

Applying Akoma Ntoso to the United States Code

A few weeks ago the U.S. House of Representatives’ Committee on House Administration held a one-day Legislative Data and Transparency Conference. While I was not able to attend in person, I did listen in to the presentations via the live stream that was provided.

Of all the things I learned that day, one specific detail intrigued me the most – that there is an XML representation of the United States Code that has been made available. This XML is available at http://uscode.house.gov/xml. While the data is a little stale and some titles are mysteriously absent (Title 14, the repealed Title 34, and Title 51), it is a great source to begin experimenting with the United States Code.

One question, asked by Sarah Schacht of Knowledge As Power, was why there wasn’t very much interest in Akoma Ntoso at the federal level. For me, the answer wasn’t altogether satisfying, but it did give me an idea! How about I try to transform the available XML files into Akoma Ntoso as best I know how? That way, I could learn for myself how well Akoma Ntoso adapts to the needs of the US federal government. Admittedly the US Code is only one aspect of the overall issue, but it is a reasonable place to start.

The effort took me just a few days, and now I have (almost) the full United States Code available in Akoma Ntoso. You can find it on my Legix.info site under United States Laws. Click on the “AKN” link in the upper right of each file to see the Akoma Ntoso rendition. As a bonus, I also updated the United States Constitution that we had already prototyped to use the latest transforms for federal data. With thanks to Monica Palmirani at the University of Bologna and Flavio Zeni at UNDESA for their help, I was able to get what I think is a fairly reasonable rendition of the United States Code in Akoma Ntoso. Additionally, I have transforms into the SLIM formats and into an HTML presentation format.
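
The transforms themselves are ordinary XSLT. Below is a minimal sketch of the kind of mapping involved; the input element names (uscSection, num, heading) are hypothetical stand-ins rather than the House schema’s actual vocabulary:

    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns="http://www.akomantoso.org/2.0">
      <!-- Map a (hypothetical) U.S. Code section element onto an
           Akoma Ntoso section, carrying over its identifier,
           number, and heading. -->
      <xsl:template match="uscSection">
        <section id="{@identifier}">
          <num><xsl:value-of select="num"/></num>
          <heading><xsl:value-of select="heading"/></heading>
          <xsl:apply-templates/>
        </section>
      </xsl:template>
    </xsl:stylesheet>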

So what have I learned? First of all, Akoma Ntoso adapted quite easily to the hierarchical model of the United States Code. That isn’t too surprising, as the U.S. Code’s hierarchy isn’t unusual and Akoma Ntoso is quite flexible in this regard. However, I do have an issue with managing a document as large as the United States Code. From what I can tell, the component mechanism within Akoma Ntoso simply doesn’t adapt to modeling a very large code. I need some sort of composition or inclusion mechanism that would allow the single US Code to be modeled as a composite document made up of many files, preferably in some sort of hierarchical arrangement. Currently I have modeled the US Code as 48 different “Acts” corresponding to the available titles within the US Code, but this is far from ideal. Modeling the individual titles as acts is not accurate and still does not resolve the scalability issues, but it is the best I could figure out at this time. In the past I have found Monica and her team to be quite responsive to issues such as this, so hopefully we will have a quick resolution. Then again, maybe I simply don’t know enough about Akoma Ntoso to model a large document adequately.
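
To be clear about the sort of mechanism I am after, generic XML facilities such as W3C XInclude hint at its shape, even though Akoma Ntoso itself defines nothing like this today. A sketch, with a hypothetical wrapper element and file names:

    <!-- A composite U.S. Code stitched together from one file per title.
         The uscode wrapper is hypothetical, not an Akoma Ntoso element. -->
    <uscode xmlns:xi="http://www.w3.org/2001/XInclude">
      <xi:include href="title01.xml"/>
      <xi:include href="title02.xml"/>
      <!-- ... one include per title ... -->
      <xi:include href="title50.xml"/>
    </uscode>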

My effort is just a start. I still have lots to learn about how best to apply Akoma Ntoso in various contexts. I will be refining my transforms as time allows in the weeks to come. Take a look at what I have done and let me know what you think. I welcome all feedback, both constructive and otherwise. My intent in publishing all the experiments and research that we do at Xcential is to share what we know with the legal informatics community in the hopes of fostering a more collaborative spirit amongst us all. So please send me your comments!

Legislative Information Modeling

Last week I brought up the subject of semantic webs for legal documents. This week I want to expand on the subject by discussing the technologies I have encountered recently that point the way to a semantic web. Of course, there are the usual general-purpose semantic web technologies like RDF, OWL, and SPARQL. Try as I might, I have been unable to drum up much practical interest in these technologies. Part of the reason is that the abstraction they demand is just beyond most people’s grasp at this point in time. In academic circles it is easy to discuss these topics, but step into the “real world” and interest evaporates immediately.

Rather than pressing ahead with those technologies, I have chosen in recent years to step back and focus on less abstract and more direct information modeling approaches. As I mentioned last week, I see two key areas of information modeling – the documents and the relationships between them. In some respects, there are three areas – distinguishing the metadata about the documents from the documents themselves. Typically I lump the documents together with their metadata, because so much of the metadata gets included with the document text, blurring the distinction and calling for a more uniform, integrated model.

The projects I have worked on over the past decade have resulted in several legislative information models. With each project I have learned and evolved, resulting in the SLIM model found today at the Legix.info demonstration website. Over time, a few key aspects have emerged as the most important:

  • First and foremost has been the need for simplicity. It is very easy to get caught up in the information model, discovering all the variations out there and finding clever solutions to each and every situation. However, it is just as easy to end up with a large and complex information model that you cannot teach to anyone who does not share your passion and experience in information modeling. Efforts to satisfy everyone produce a model that satisfies no one, due to the complexity of trying to please too many masters.
  • Secondly, you need to build familiarity into your information model. While many terms are used consistently in legislation, traditions around the world do vary, and sometimes very similar words have quite different meanings to different organizations. Trying to change long-standing traditions to arrive at more consistent or abstract terminology always seems to be an uphill battle.
  • Thirdly, you have to consider the usage model. Is the model intended for downstream reporting and analysis, or does it need to work in an editing environment? An editing model can be quite different from a model intended only for downstream processing, because the manner in which the model interacts with the editor must be taken into account. Two aspects matter most. First, the model must be robust yet flexible enough to handle all the intermediate states a document passes through while being edited. Second, change tracking is a very important consideration during the amendment process, and how that function will be implemented in the document editor must be considered.

While I have developed SLIM and its associated reference scheme over the past few years, in the last year I have started experimenting with a few alternate models in the hopes of finding a more perfect model for legislative information. Most recently I have started experimenting with Akoma Ntoso, developed by Fabio Vitali and Monica Palmirani at the University of Bologna and supported by Africa i-Parliaments, a project sponsored by the United Nations Department of Economic and Social Affairs. I very much like this model, as it follows many of the same ideals of good information modeling that I try to conform to. In fact, it is quite similar to SLIM in many respects. The Legix.info site has many examples of Akoma Ntoso documents, created by translating SLIM into Akoma Ntoso via an XSL transform.

While I very much like Akoma Ntoso, I have yet to master it. It is a far more ambitious effort than SLIM, has many more tags, and covers a broader range of document types. Like SLIM, it covers both the metadata and the document text in a uniform model. I have yet to convince myself as to its viability as an editing schema. Adapting it to work with the editors I have worked with in the past is a project I just haven’t had the time for yet.

The other important aspect of a semantic web, as I wrote about last week, is the referencing scheme. Akoma Ntoso uses a notation based on coded URLs to implement referencing. It is partly based on the conceptually similar URN:LEX model, built around URNs, developed by Enrico Francesconi and Pierluigi Spinosa at ITTIG/CNR in Florence, Italy. Both schemes build upon the Functional Requirements for Bibliographic Records (FRBR) model. I have tried adopting both, but have run into snags: the models either do not cover enough types of relationships, scare people away with too many special characters carrying encoded meaning, or result in too complex a resolution model for my needs. At this point I have cherry-picked the best features of both to arrive at a compromise that works for my cases. Hopefully I will be able to evolve towards a more consistent implementation as those efforts mature.
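
To give a flavor of the two notations, here are illustrative (not quoted) examples of how an Italian act of 14 May 2006, number 22, might be referenced under each scheme:

    urn:lex:it:stato:legge:2006-05-14;22        URN:LEX style
    /it/act/2006-05-14/22/ita@/main.xml         Akoma Ntoso URL style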

My next effort is to take a closer look at MetaLex, an open XML-based interchange format for legislation. It was developed in Europe and defines a set of conventions for metadata, naming, cross references, and compound documents. Many projects in Europe, including Akoma Ntoso, comply with the MetaLex framework. It will be interesting to see how easily I can adapt SLIM to MetaLex. Hopefully the changes required will amount mostly to deriving from the MetaLex schema and adapting to its attribute names. We shall see…
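
As a sketch of what I am hoping for, XML Schema’s substitution group mechanism would let a SLIM element stand in for a generic MetaLex container. The element names and namespace below are illustrative guesses, not the actual MetaLex vocabulary:

    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
               xmlns:ml="http://www.metalex.eu/metalex/2008-05-02">
      <xs:import namespace="http://www.metalex.eu/metalex/2008-05-02"
                 schemaLocation="metalex.xsd"/>
      <!-- Declare a SLIM section as substitutable for a (hypothetical)
           generic MetaLex hierarchical container element. -->
      <xs:element name="section" substitutionGroup="ml:hcontainer"/>
    </xs:schema>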
