
LegisPro Sunrise!!!

LegisPro Sunrise is almost done! It has taken longer than we had hoped, but we are finally getting ready to begin limited distribution of LegisPro Sunrise, our productised implementation of our LegisPro drafting and amending tools for legislation and regulations. If you are interested in participating in our early release program, please contact us at info@xcential.com. If you already signed up, we will be contacting you shortly.

LegisPro Sunrise is a desktop implementation of our web-based drafting and amending products. It uses Electron from GitHub, built from Google’s Chromium project, to bundle all the features we offer, both the client and server sides, into a single easy-to-manage desktop application with an installer and auto-update facilities. Right now, the Windows platform is supported, but MacOS and Linux support will be added if the demand is there. You may have already used other Electron applications – Slack, Microsoft’s Visual Studio Code, WordPress, some editions of Skype, and hundreds of other applications now use this innovative new application framework.
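
For those unfamiliar with Electron, its essence is a Node.js-powered shell hosting a Chromium window. As a rough sketch (the file name and window size are illustrative – this is not LegisPro Sunrise’s actual bootstrap code), an Electron application’s main process amounts to little more than:

    const { app, BrowserWindow } = require('electron');

    // Once Electron is ready, open a desktop window and load the web app into it.
    app.on('ready', () => {
      const win = new BrowserWindow({ width: 1200, height: 800 });
      win.loadURL(`file://${__dirname}/index.html`);
    });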

Other versions

In addition to the Sunrise edition, we offer LegisPro as customised FastTrack implementations or as fully bespoke Enterprise implementations where the individual components can be mixed and matched in many different ways.

Akoma Ntoso model

LegisPro Sunrise comes with a default Akoma Ntoso-based document model that implements the basic constructs seen in many parliaments and legislatures around the world that derive from the Westminster parliamentary traditions.

Document models implementing other parliamentary or regulatory traditions such as those found in many of the U.S. states, in Europe, and in other parts of the world can also be developed using Akoma Ntoso, USLM, or any other well-designed XML legislative schema.

Drafting & Amending

Our focus is on the drafting and amending aspects of the parliamentary process. By taking a digital-first approach to the process, we are able to offer many innovative features that improve and automate the process. Included out of the box is what we call amendments in context, where amendment documents are extracted from changes recorded in a target document. Other features can be added through an extensive plugin mechanism.

Basic features

Ease-of-use

While offering sophisticated drafting capabilities for legislative and regulatory documents, LegisPro Sunrise is designed to provide the familiarity and ease-of-use of a word processor. Where it differs is in what happens under the covers. Rather than drafting using a general purpose document model and using styles and formatting to try to capture the semantics, we directly capture the semantic structure of the document in the XML. But don’t worry, as a drafter, you don’t need to know about the underlying XML – that is something for the software developers to worry about.

Templates

Templates allow the boilerplate structure of a document to be instantiated when creating a new document. Out of the box, we provide generic templates for bills, acts, amendments, amendment lists, and a few other document types.

In addition to document templates, component templates can be specified, or synthesised when necessary, to be used as parts when constructing a document.

For both document and component templates, placeholders are used to highlight areas where text needs to be provided.

Upload/Download

As a result of our digital-first focus, we manage legislation as information rather than as paper. This distinction is important – the information is held in XML repositories (a form of database) where we can query, extract, and update provisions at any level of granularity, not just at the document level. However, to allow for the migration from a paper-oriented to a digital-first world, we do provide upload and download facilities.

Undo/Redo

As with any good document editor, unlimited undo and redo is supported – going back to the start of the editing session.

Auto-Recovery

Should something go wrong during an editing session and the editor close unexpectedly, an auto-recovery feature restores your document to the state it was in, or close to it, when the editor closed.

Contextual Insert Lists

We provide a directed or “correct-by-construction” approach to drafting. What this means is that the edit commands are driven by an underlying document model that is defined to enforce the drafting conventions. Wherever the cursor is in the document, or whatever is selected, the editor knows what can be done and offers lists of available document components that can either be inserted at the cursor or around the current selection.
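
To illustrate the idea (with a toy content model – not our actual document model), computing an insert list amounts to asking the model what is allowed at the current location:

    // A toy content model: which components each element may contain.
    const contentModel = {
      section: ['num', 'heading', 'subsection', 'content'],
      subsection: ['num', 'content']
    };

    // Offer only what the model allows inside the current element, and only
    // once for components that may not repeat (num, in this sketch).
    function insertList(parentName, existingChildren) {
      return (contentModel[parentName] || [])
        .filter(name => name !== 'num' || existingChildren.indexOf('num') < 0);
    }

    insertList('section', ['num', 'heading']); // -> ['heading', 'subsection', 'content']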

Hierarchy

Document hierarchies form an important part of any legislative or regulatory document. Sometimes the hierarchy is rigid and sometimes it can be quite flexible, but either way, we can support it. The Sunrise edition supports the hierarchy Title > Part > Chapter > Article > Section > Subsection > Paragraph > Subparagraph out of the box, where any level is optional. In addition, we provide support for cross headings which act as dividers rather than hierarchy in the document. Customised versions of LegisPro can support whatever hierarchy you need — to any degree of enforcement. A configurable promote/demote mechanism allows any level to be morphed into other levels up and down the hierarchy.
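
As a simple sketch of the promote/demote idea (the level names are the Sunrise defaults; the code itself is purely illustrative):

    const levels = ['title', 'part', 'chapter', 'article', 'section',
                    'subsection', 'paragraph', 'subparagraph'];

    // Promote moves a provision one level up the hierarchy; demote moves it down.
    const promote = level => levels[Math.max(levels.indexOf(level) - 1, 0)];
    const demote = level => levels[Math.min(levels.indexOf(level) + 1, levels.length - 1)];

    promote('section'); // -> 'article'
    demote('chapter');  // -> 'article'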

Large document support

Rule-making documents can be very large, particularly when we are talking about codes. LegisPro supports large documents in a number of ways. First, the architecture is designed to take advantage of the inherent scalability of modern web browsing technology. Second, we support the portion mechanism of Akoma Ntoso to allow portions of documents, at any provision level, to be edited alone. A hierarchical locking mechanism allows different portions to be edited by different people simultaneously.

Spelling Checker

Checking spelling is an important part of any document editor, and we have tightly integrated a third-party service to provide a rich and comprehensive solution. Familiar red underline markers show potential misspellings. A context menu provides alternative spellings, or you can add the word to a custom dictionary.

Tagging support

Beyond basic drafting, tagging of people, places, or things referred to in the document is something for which we have found a surprising amount of interest. Akoma Ntoso provides rich support for ontologies and we build upon this to allow numerous items to be tagged. In our FastTrack and Enterprise solutions we also offer auto-tagging technologies to go with the manual tagging capabilities of LegisPro Sunrise.

Document Bar

The document bar at the top of the application provides access to a number of facilities of the editor including undo/redo, selectable breadcrumbs showing your location in the document, and various mode indicators which reflect the current editing state of the editor.

Command Ribbons and Context Menus

Command ribbons and context menus are how you access the various commands available in the editor. Some of the ribbons and menus are dynamic, changing to reflect the current location of the cursor or selection in the editor. These dynamic elements show the insert lists and any editable attributes. Of course, there is also an extensive set of keyboard shortcuts for many commonly used commands. It has been our goal to ensure that the majority of common drafting tasks can be accomplished from the keyboard alone.

Sidebar

A sidebar along the left side of the application provides access to the major components which make up LegisPro Sunrise. It is here that you can switch among documents, access onboard services such as the resolver and amendment generator and outboard services such as the document repository, and manage the primary settings.

Side Panels

Also on the left side are configurable side panels which provide additional views needed for drafting. The Resources view is where you look up documents, work with the hierarchy of the document being edited, and view provisions of other documents. The Change Control view allows the change sets defined by the advanced change control capabilities (described below) to be configured. Other panels can be added as needed.

Advanced features

In addition to the rich capabilities offered for basic document editing, we provide a number of advanced features as well.

Document Management

Document management allows documents to be stored in an XML document repository. The advantage of storing documents in an XML repository rather than in a simple file share or traditional content management system is that it allows us to granularise the provisions within the documents and use them as true reference-able information – this is a key part of moving away from paper document-centric thinking to a modern digital-first mindset. An import/export mechanism is provided to add external documents to the repository or get copies out. For LegisPro Sunrise we use the eXist-db XML database, but we can also provide customised implementations using other repositories.
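
To give a feel for what provision-level access means in practice, here is how one might pull a single section out of an eXist-db repository over its REST interface (the host, collection, and identifiers are all illustrative):

    // Ask the repository for one provision rather than the whole document.
    const query = encodeURIComponent(
      'collection("/db/acts")//*:section[@eId="sec_12"]');

    fetch('http://localhost:8080/exist/rest/db?_query=' + query)
      .then(response => response.text())
      .then(xml => console.log(xml)); // just the provision, at whatever granularity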

Resolver

Our document management solution is built on the FRBR-based metadata defined by Akoma Ntoso and uses a configurable URI-based resolver technology to map human-readable, permanent URIs to actual URLs pointing to locations within the XML repository or even to other data sources available on the Web.
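
A much simplified sketch of the resolver idea (the URI pattern and repository layout shown are illustrative, not our actual scheme): a permanent, human-readable URI is matched against configurable rules and rewritten to the URL where the content actually lives.

    const rules = [
      { pattern: /^\/akn\/us\/act\/(\d{4})\/(\d+)$/,
        toUrl: (year, num) =>
          `http://repo.example.com/exist/rest/db/acts/${year}/${num}.xml` }
    ];

    // Resolve a permanent URI to the URL where the content actually lives.
    function resolve(uri) {
      for (const rule of rules) {
        const match = uri.match(rule.pattern);
        if (match) return rule.toUrl(...match.slice(1));
      }
      return null; // unknown URI
    }

    resolve('/akn/us/act/2017/12');
    // -> 'http://repo.example.com/exist/rest/db/acts/2017/12.xml'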

Page & Line numbering

There are two ways to record where amendments are to be applied – either logically, by identifying the provision, or physically, by page and line numbers. Most jurisdictions use one or the other, and sometimes even both. The tricky part has always been the page and line numbers. While modern word processors usually offer page and line numbers, they are dynamic and change as the document is edited. That makes the feature of limited use in an amending system. What is preferred is static page and line numbers that reflect the document at the last point it was published for use in a committee or chamber. We accomplish this using a back-annotation technology within the publishing service. LegisPro Sunrise also offers a page and line numbering feature that can be run without the publishing service. Page and line numbers can be displayed in the left or right margin, or inline, depending on preference.

Amendment Generation

One of the real benefits of a digital first solution is the many tasks that can be automated – not by simply computerising the way things have always been done, but by rethinking the approach altogether. Amending is one such area. LegisPro Sunrise incorporates an onboard service to automatically generate amendment documents from changes recorded in the target document. Using tracked changes, the document hierarchy, and annotated page and line numbers, we are able to very precisely record proposed changes as amendments. Of course, the amendment generator works with the change sets to allow different amendment sets to be generated by specifying the named set of changes.
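
A toy sketch of the generation step (the phrasing and the change record are illustrative – the real generator works from the tracked-change markup itself):

    // Turn one recorded change into amendment language using the provision
    // ID and the annotated page and line numbers.
    function amendmentText(change) {
      const where = `On page ${change.page}, line ${change.line} (${change.provision})`;
      switch (change.type) {
        case 'delete': return `${where}, strike "${change.oldText}".`;
        case 'insert': return `${where}, insert "${change.newText}".`;
        case 'replace':
          return `${where}, strike "${change.oldText}" and insert "${change.newText}".`;
      }
    }

    amendmentText({ type: 'replace', provision: 'Sec. 3(a)', page: 4, line: 17,
                    oldText: '30 days', newText: '60 days' });
    // -> 'On page 4, line 17 (Sec. 3(a)), strike "30 days" and insert "60 days".'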

Plugin Support

LegisPro Sunrise is not the first incarnation of our LegisPro offering. We’ve been using the underlying technologies, and precursors to those technologies, for years with many different customers. One thing we have learned is that there is a vast variation in needs from one customer to another. In fact, even individual customers sometimes require very different variations of the same basic system to automate different tasks within their organisation. To that end, we’ve developed a powerful plugin approach which allows capabilities to be added as necessary without burdening the core editor with a huge range of features of limited applicability. The plugin architecture allows onboard and outboard services, individual commands, menus, menu items, side panels, mode indicators, JavaScript libraries, and text string libraries to be added. In the long term, we’re planning to foster a plugin development community.
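
To give a feel for the shape of this (the names are illustrative, not our published plugin API), a plugin registry simply collects the contributions each plugin declares:

    class PluginRegistry {
      constructor() { this.plugins = []; }
      register(plugin) { this.plugins.push(plugin); }
      // Collect the commands, side panels, etc. contributed by all plugins.
      contributions(kind) {
        return this.plugins.reduce(
          (all, plugin) => all.concat(plugin[kind] || []), []);
      }
    }

    const registry = new PluginRegistry();
    registry.register({
      name: 'wordCount',
      commands: [{ label: 'Count words',
                   run: doc => doc.textContent.split(/\s+/).length }]
    });
    registry.contributions('commands'); // -> the word-count command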

Proprietary or Open Source?

There are two questions that always come up relating to our position on standards and open source software:

  1. Is it based on standards? Yes, absolutely – almost to a fault. We adhere to standards whenever and however we can. The model built into LegisPro Sunrise is based on the Akoma Ntoso standard that has been developed over the past few years by the OASIS LegalDocML technical committee. I have been a continual part of that effort since the very beginning. But beyond that, we always choose standards-based technologies for inclusion in our technology stack. This includes XML, XSLT, XQuery, CSS, HTML5, ECMAScript 2015, among others.
  2. Is it open source?
    • If you mean, is it free, then the answer is only yes for evaluation, educational, and non-production uses. That’s what the Sunrise edition is all about. However, we must fund the operation of our company somehow and as we don’t sell advertising or customer profiles to anyone, we do charge for production use of our software. Please contact us at info@xcential.com or visit our website at xcential.com for further information on the products and services we offer.
    • If you mean, is the source code available, then the answer is also yes – but only to paying customers under a maintenance contract. We provide unfettered access to our GitHub repository to all our customers.
    • Finally, if you’re asking about the software we built upon, the answer is again yes, with a few exceptions where we chose a best-of-breed commercial alternative over any open source option we had. The core LegisPro Sunrise application is entirely built upon open source technologies – it is only in external services where we sometimes rely upon commercial third-party applications.

What does it cost?

As I already alluded to, we are making LegisPro Sunrise available to potential customers and partners, academic institutions, and other select individuals or organisations for free – so long as it’s not used in production — including drafting, amending, or compiling legislation, regulations, or other forms of rule-making. If you would like a production system, either a FastTrack or Enterprise edition, please contact us at info@xcential.com.

Coming Soon

Book

I will soon also be providing a pocket handbook on Akoma Ntoso. As a member of the OASIS LegalDocML Technical Committee (TC) that has standardised Akoma Ntoso, it has been important to get the handbook reviewed for accuracy by the other TC members. We are almost done with that process. Once the final edits are made, I will provide information on how you can obtain your own copy.


The many lives of JavaScript

I recently worked out that I’ve learned, on average, a new programming language every two to three years. These many languages have been part of my toolbox for somewhere between four to six years before falling away to make room for new technologies. However, there is one programming language that has been a major part of my programming repertoire for almost 22 years now – and that is JavaScript.

My JavaScript programming skills have recently undergone a major renaissance as I’ve adopted JavaScript 6 (aka ECMAScript 2015) for most of my coding. The way I write code today is nothing like the code I wrote just one year ago – and I’ve gone back and largely modernised all active code to be consistent. Today’s programming style uses modern frameworks and is far more object oriented and asynchronous. There are many new features which have totally updated how I write code. Proper (while still limited) classes with mixins have replaced the ugly prototype mechanism I used to use for object orientation. Let and const declarations have caught latent bugs that were hidden in my code. Arrow functions (aka lambda expressions) and promises have streamlined code that once was quite clunky. The list goes on…
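
For readers who haven’t yet made the jump, here is a small sketch of the features mentioned above (the names are contrived for illustration):

    class LegalDocument { constructor(title) { this.title = title; } }

    // A mixin: a function taking a base class and returning an extended class.
    const Serializable = Base => class extends Base {
      serialize() { return JSON.stringify(this); }
    };

    class Bill extends Serializable(LegalDocument) {}

    // const declarations, an arrow function, and a promise in place of
    // the callback clunkiness of old.
    const load = title =>
      new Promise(resolve => setTimeout(() => resolve(new Bill(title)), 100));

    load('An act relating to…').then(bill => console.log(bill.serialize()));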

Even my tools have changed. Microsoft’s surprisingly excellent Visual Studio Code has replaced the hodgepodge of tools I once used. We’re integrating Jasmine and Karma for testing. JavaScript Semistandard Style (no, I still like semicolons) has ensured a very clean code base – as well as catching a multitude of errors and sins.

All this change got me thinking about the four lives of JavaScript that I have worked through. Way back when, JavaScript had an awkward birth at the hands of Netscape as the lesser stepchild of the new Java programming language from Sun that was taking away all the attention. JavaScript was just a way to glue Java applets together in the browser. The problem is, Java applets really sucked.

Microsoft quickly saw the value of JavaScript though, and launched their own effort to steal Netscape’s baby. And so, JavaScript was stolen, renamed JScript, and made to be the adopted sibling of Microsoft’s other scripting language, VBScript. One good bit of progress that Microsoft made was to sponsor the standardisation of the language, although the resulting name of ECMAScript was another in a long string of unfortunate names the language has had to endure.

As JScript, JavaScript was to become an integral part of Microsoft’s entire ActiveX strategy. A lot of really cool technologies (yes, really) came of this allowing JScript to go beyond the browser. As an application extension language, it found its way into the XMetaL XML editor as the customisation technology. We used it and many of the ActiveX technologies to great effect when we implemented California’s bill drafting system. However, it didn’t just end there. We were able to use it on the server-side through Classic ASP and as a shell scripting language through the Windows Script Host. For a Microsoft-centric programmer, this era of JavaScript was a glorious one.

However, ActiveX was seriously flawed. It was entirely proprietary and riddled with problems. Microsoft abandoned it almost as quickly as they had adopted it – moving on to .Net where JScript.net was a non-starter. As Microsoft’s interest in ActiveX and even Internet Explorer waned in the early 2000s, life as a JavaScript programmer became ever gloomier. While the capabilities were awesome, there was obviously no future.

At this point, we made the somewhat painful decision to move away from Microsoft’s outmoded view of the Internet and go back to the basics. While it meant giving up a lot of capability, in the end it was an excellent decision for it pointed to the future. One tiny aspect of Microsoft’s ActiveX vision, the XMLHttpRequest object, escaped from Microsoft and gave rise to a whole different way of programming – Asynchronous JavaScript and XML (AJAX). This development and the emergence of new browsers, first Firefox and then Google’s Chrome with its V8 JavaScript engine, breathed new life into JavaScript.

Freed from Microsoft’s grip, JavaScript has flourished. The past decade has seen a plethora of new technologies. Isomorphic JavaScript (or Universal JavaScript) blurs the distinction between coding for the server and the client. In fact, technologies like Electron turn web-based application development back to the desktop where you can get the best of both worlds.

When I look back on the code I wrote during the ActiveX era (yes, we still support it), it looks prehistoric. Modern JavaScript is so much more capable and flexible than the clunky rendition we had back when COM-based ActiveX was supposed to change the world. As I mentioned earlier, how I program now is completely different – asynchronous programming is a difficult but very worthwhile skill to acquire.

Looking to the future, I see three paths. On one side is a mature but polarising platform that is dominated by Oracle. Oracle’s dominance ensures stability but also deters innovation. Looking to the other side, one finds another mature but polarising platform, dominated by Microsoft. Here too, Microsoft’s dominance ensures stability but also deters innovation. The result is that both paths now seem to have had their heyday. You don’t hear very much aspirational news from either technology path anymore — it must be what it felt like to program a mainframe in COBOL at the height of the C/C++ era.

The third path seems to be the path of the future – staking out a middle ground that neither technology giant can stomp on. Sure, Google is a technology giant that plays a strong role, but they’re still reasonably well regarded by the development community at large (for now). It is this middle ground that has been the most fertile for new technologies – and JavaScript is right in the thick of it. There are so many new technologies it’s hard to keep track of them all — AngularJS, Node.js, React, Express.js, to name but a few. While this third path can play well with both of the other two, for me it is the path that truly points to the future.

This brings us to the fourth life for JavaScript – building on the momentum of the past decade to mount a credible challenge for enterprise apps. While I initially dismissed many of the new features of the language as mere syntactic sugar, my experience with it has shown it to be more. I now write much better code. I believe we’re on the verge of an explosion in JavaScript-enabled applications that will blur the distinction between the platforms, between the desktop and the browser, and between the server and the client. This is truly an exciting time, once more, to be developing in JavaScript.

It goes without saying, but stay tuned for more…


Connected Information

As a proponent of XML for legislation, I’m often asked why an XML approach is better than a more traditional approach using a word processor. The answer is simple – it’s all about connected information.

The digital end point in a legislative system can no longer be publication of PDFs. PDFs are nothing but a kludgy way to digitize paper — a way to preserve the old traditions and avoid the future. Try reading a PDF on a cell phone and you see the problem. Try clicking on a citation in a PDF and you see the problem. Try and scrape the information out of a PDF to make it computer readable and you see the problem. The only useful function that PDFs serve is as a bridge to the past.

The future is all about connected information — breaking the physical bounds of what we think of as a document and allowing the nuggets of information found within them to be connected, interrelated, and acted upon. This is the real reason why the future lies with XML and its related technologies.

In my blog last week I provided a brief glimpse into how our future amending tools will work. I explored how legislation could be managed similar to how software is managed with GitHub. This is an example of how useful connected information becomes. Rather than producing bills and amendments as paper documents, the information is stored in a way that it can be efficiently and accurately automated — and made available to the public in a computer readable way.

At Xcential, we’re building our new web-based authoring system — LegisPro. If you take a close look at it, you’ll see that it has two main components. Of course, there is a robust XML editor. However, at the system’s very heart is a linking system — something we call a resolver. It’s this resolver where the true power lies. It’s an HTTP-based system for managing all the linkages that exist in the system. It connects XML repositories, external data sources, and even SQL databases together to form a seamless universe of connected information.

We’re working hard to transform how legislation, and indeed, all government information is viewed. It’s not just about connecting laws and legislation together through simple web links. We’re talking about providing rich connections between all government information — tying financial data to laws and legislation, connecting regulatory information together, associating people, places, and things to government data, and on and on. We have barely started to scratch the surface, but it’s clear that the future lies with connected information.

While today we position LegisPro as a bill authoring system, it’s much more than that. It provides some of the fundamental underpinnings necessary to transform the government documents of today into the connected information of tomorrow.


Coming soon!!! A new web-based editor for Akoma Ntoso

I’ve been working hard for a long time — building an all new web-based editor for Akoma Ntoso. We will be showing it for the first time at the upcoming Akoma Ntoso LEX Summer School in Washington D.C.

Unlike our earlier AKN/Editor, this editor is a pure XML editor designed from the ground up using the XML capabilities that modern browsers possess. This editor is much more robust, more precise, and very scalable.

Basic Features

  1. Configurable XML models — including Akoma Ntoso and USLM
  2. Edit full documents or portions of large documents
  3. Flexible selection and editing regardless of XML structure
  4. Built-in redlining (change tracking) supporting textual AND structural changes
  5. Browse document sources with drag-and-drop
  6. Full undo & redo
  7. Customizable attribute editor
  8. Search and replace
  9. Modular architecture to allow for extensive customization

Underlying Technology

  1. XML-based editing component
    • DOM 4 support
    • XPath Support
    • CSS Styling
    • Sophisticated event model
  2. HTTP-based resolver architecture for retrieving documents
    • Interpret citations
    • Dereference URLs
    • WebDAV adaptors to document repositories
    • Query repositories with XQuery or databases with SQL
  3. AngularJS-based User Interface using HTML5
    • Component modules for easy customization
  4. XML repository for storing documents
    • Integrate any XML repository
    • Built-in support for eXist-db
  5. Validation & Publishing
    • XML Schema validator
    • XSL-FO publishing

We’ll reveal a lot more at the LEX Summer School later this month! If you’re interested in our QuickStart beta program, drop me a note at grant.vergottini@xcential.com.


Achieving Five Star Open Data

A couple weeks ago, I was in Ravenna, Italy at the LEX Summer School and follow-on Developer’s Workshop. There, the topic of a semantic web came up a lot. Despite cooling in the popular press in recent years, I’m still a big believer in the idea. The problem with the semantic web is that few people actually get it. At this point, it’s such an abstract idea that people invariably jump to the closest analog available today and mistake it for that.

Tim Berners-Lee (@timberners_lee), the inventor of the web and a big proponent of linked data, has suggested a five star deployment scheme for achieving open data — and what ultimately will be a semantic web. His chart can be thought of as a roadmap for how to get there.

Take a look at today’s Data.gov website. Everybody knows the problem with it — it’s a pretty wrapper around a dumping ground of open data. There are thousands and thousands of data sets available on a wide range of interesting topics. But, there is no unifying data model behind all these data dumps. Sometimes you’re directed to another pretty website that, while well-intentioned, hides the real information behind the decorations. Sometimes you can get a simple text file. If you’re lucky, you might even find the information in some more structured format such as a spreadsheet or XML file. Without any unifying model and with much of the data intended as downloads rather than as an information service, this is really still Tim’s first star of open data — even though some of the data is provided as spreadsheets or open data formats. It’s a good start, but there’s an awful long way to go.

So let’s imagine that a better solution is desired, providing information services, but keeping it all modest by using off-the-shelf technology that everyone is familiar with. Imagine that someone with the authority to do so takes the initiative to mandate that henceforth, all government data will be produced as Excel spreadsheets. Every memo, report, regulation, piece of legislation, form that citizens fill out, and even the U.S. Code will be kept in Excel spreadsheets. Yes, you need to suspend disbelief to imagine this — the complications that would result would be incredibly tough to solve. But, imagine that all those hurdles were magically overcome.

What would it mean if all government information was stored as spreadsheets? What would be possible if all that information was available throughout the government in predictable and permanent locations? Let’s call the system that would result the Government Information Storehouse – a giant information repository for information regularized as Excel spreadsheets. (BTW, this would be the future of government publishing once paper and PDFs have become relics of the past.)

How would this information look? Think about a piece of legislation, for instance. Each section of the bill might be modeled as a single row in the spreadsheet. Every provision in that section would be its own spreadsheet cell (ignoring hierarchical considerations, etc.). Citations would turn into cell references or cell range references. Amending formulas, such as “Section 1234 of Title 10 is amended by…” could be expressed as a literal formula — a spreadsheet formula. It would refer to the specific cell in the appropriate U.S. Code Title and contain programmatic instructions for how to perform the amendment. In short, lots of once complex operations could be automated very efficiently and very precisely. Having the power to turn all government information into a giant spreadsheet has a certain appeal — even if it requires quite a stretch of the imagination.

Now imagine what it would mean if selected parts of this information were available to the public as these spreadsheets – in a regularized and permanent way — say Data.gov 2.0 or perhaps, more accurately, as Info.gov. Think of all the spreadsheet applications that would be built to tease out knowledge from the information that the government is providing through their information portal. Having the ability to programmatically monitor the government without having to resort to complex measures to extract the information would truly enable transparency.

At this point, while the linkages and information services give us some of the attributes of Tim’s four and five star open data solutions, our focus on spreadsheet technology has left us with a less than desirable two star system. Besides, we all know that having the government publish everything as Excel spreadsheets is absurd. Not everything fits conveniently into a spreadsheet table, to say nothing of the scalability problems that would result. I wouldn’t even want to try putting Title 42 of the U.S. Code into an Excel spreadsheet. So how do we really go about achieving this sort of open data and the efficiencies it enables — both inside and outside of government?

In order to realize true four and five star solutions, we need to quickly move on to fulfilling all the parts of Tim’s five star chart. In his chart, a three star solution replaces Excel spreadsheets with an open data format such as a comma separated file. I don’t actually care for this ordering because it sacrifices much to achieve the goal of having neutral file formats — so let’s move on to full four and five star solutions. To get there, we need to become proficient in the open standards that exist and we must strive to create ones where they’re missing. That’s why we work so hard on the OASIS efforts to develop Akoma Ntoso and citations into standards for legal documents. And when we start producing real information services, we must ensure that the linkages in the information (those links and formulas I wrote about earlier) exist to the best extent possible. It shouldn’t be up to the consumer to figure out how a provision in a bill relates to a line item in some budget somewhere else — that linkage should be established from the get-go.

We’re working on a number of core pieces of technology to enable this vision and get to full five star open data. We’re integrating XML repositories and SQL databases into our architectures to give us the information storehouse I mentioned earlier. We’re building resolver technology that allows us to create and manage permanent linkages. These linkages can be as simple as citation references or as complex as instructions to extract from or make modifications to other information sources. Think of our resolver technology as akin to the engine in Excel that handles cell or range references, arithmetic formulas, and database lookups. And finally, we’re building editors that will resemble word processors in usage, but will allow complex sets of information to be authored and later modified. These editors will have many of the sophisticated capabilities such as track changes that you might see in a modern word processor, but underneath you will find a complex structured model rather than the ad hoc data structures of a word processor.

Building truly open data is going to be a challenging but exciting journey. The solutions that are in place today are a very primitive first step. Many new standards and technologies still need to be developed. But, we’re well on our way.


Building a browser-based XML Editor

Don’t forget the 2014 U.S. House Legislative Data and Transparency Conference this week.

I’m now hard at work on our second generation web-based XML editor. In my blog last week, I talked about the need for and complexities of change tracking in a legislative editor. In this blog, I want to describe more of the overall motivation.

A couple years ago, we built an HTML5-based legislative editor for Akoma Ntoso. We learned a lot from the effort and had some success with a couple customers whose needs matched the capabilities of the editor. The editor was built to use and exploit, to the fullest extent, many of the new APIs added to modern browsers to support HTML5. We found that, by focusing on HTML5, a lot of the complexities of dealing with browser quirks and incompatibilities were a thing of the past – allowing us to focus on building the editing functions.

The editor worked by transforming the XML document into a close representation of the XML, expressed as HTML5 tags. Using HTML5 features such as the @contenteditable attribute along with modern CSS, the browser DOM, selection ranges, drag and drop, and a WebDAV repository API, we were able to implement a fairly sophisticated web-based legislative editor.
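
Conceptually, the transformation looked something like this sketch (greatly simplified, with attribute handling omitted):

    // Map each XML element to an equivalent HTML5 element, carrying the
    // semantics along in the class name.
    function toHtml(xmlElement) {
      const div = document.createElement('div');
      div.className = xmlElement.localName;        // <section> -> <div class="section">
      div.setAttribute('contenteditable', 'true'); // the HTML5 editing surface
      for (const child of xmlElement.childNodes) {
        div.appendChild(child.nodeType === Node.ELEMENT_NODE
          ? toHtml(child)
          : document.createTextNode(child.textContent));
      }
      return div;
    }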

But, not everything went smoothly. The first problem involved the complexity of mapping all the intricacies of XML into an HTML5 representation, and then maintaining that representation in the browser. Part of the difficulty stems from the fact that HTML5 is not specifically an XML dialect – and browsers tend to do HTML5 things that aren’t always XML friendly. The HTML5 DOM is deliberately rather loose and forgiving (it’s a big part of why HTML was successful in the first place) while XML demands a very precise and rigid DOM.

The second problem we faced was scalability. While the HTML5 representation wasn’t all that heavyweight, the bigger problem was the transformation cost going back and forth between HTML5 and XML. We sometimes deal with very large legislation and laws. In our bigger cases, the cost of transformation was simply unreasonable.

So what is the solution? Well, early last year we started experimenting with using a browser to render XML documents directly with CSS – without any transform into HTML. Most modern browsers now do this very well. For the most part, we were able to achieve an acceptable rendition in the browser without any transformation.
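
The browser’s XML support goes beyond rendering. As a quick illustration of what the native XML DOM gives you (the fragment is contrived):

    const xml = new DOMParser().parseFromString(
      '<section id="s1"><num>§1.</num><heading>Commencement</heading></section>',
      'application/xml');

    // CSS selectors, XPath, and CSS styling all work directly on the XML DOM.
    const heading = xml.querySelector('heading');
    console.log(heading.textContent); // -> 'Commencement'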

There were a few drawbacks to this approach. For one, links were dead – they didn’t inherently do anything. Likewise, implementing something like the HTML @style attribute didn’t just naturally work. Before we could entertain the notion of a pure XML-based editor built within the XML infrastructure in the browser, we had to find a solution that would allow us to enrich the XML sufficiently to allow it to behave like an HTML page.

Another problem arose in that our prior web-based editor relied upon the @contenteditable feature of HTML. That is an HTML feature rather than a browser feature. Using XML as our base environment, we no longer had access to this facility. This wasn’t a total loss, as our need for a rich change tracking environment required us to find a better approach than @contenteditable offered anyway.

With solutions to the major problems behind us, we started to take a look at the other goals for the editor:

  • Track Changes – This was the subject of my blog last week. For us, track changes is crucial in any editor targeted at legislation – and it must work at both the structural and textual level equally well. We use the feature for two things – redlining changes as is common in the U.S. and the automatic generation of amendment documents (amendments in context). Differencing can get you part way there – but it excludes the ability to adequately craft the changes in a way that deal with political sensitivities. Track Changes is a very complex feature which must be built into the very core of the editor – tacking it on later will be very difficult, if not impossible.
  • Scalability – Scalability is very important to our applications. We need to support very large documents. Even when we deal with document fragments, we need to allow those fragments to be very large. Our approach is to create editing islands within a large document loaded into the browser. This amounts to only building the editing superstructure around the parts of the document being edited rather than the whole document. It’s like building the scaffolding around only the floors being worked on in a skyscraper rather than trying to envelope the entire building in scaffolding.
  • Modularity – We’re building a number of very different applications currently – all of which require XML editing. To allow this variability, our new XML editor is written as a web-based component rather than a full-fledged application. Despite its complexity, on the surface it’s deceptively simple. It has no user interface at all aside from the editing canvas. It’s completely driven by a well thought out JavaScript API. Adding the editor to a document is very simple. A single link, added to the bottom of the XML document, adds the editor to the document. With this component, we’re able to include it within all of the applications we are building.
  • Configurability – We need to support a number of different models – not just Akoma Ntoso. To achieve this, an XML-based configuration file is used to define the behaviors for any XML model. Elements can be defined as read-only, templates can be defined (or derived), and even the track changes behavior can be configured for individual elements. The sophistication being defined within the configuration files is to allow us to model all the variants of legislative models we have encountered without the need for extensive programming-level customization.
  • Browser Support – We’re pushing the envelope when it comes to browser support. Our current focus is on Google’s Chrome browser. Support for all the browsers aside from Internet Explorer should be relatively easy. Our experience has shown that the browsers are now quite similar. Internet Explorer is the one exception in this particular area. Years ago, IE was the best browser when it came to XML support. While IE had many other compatibility issues, particularly with CSS, it led the way in supporting XML. However, while Microsoft has made tremendous strides moving forward to match the other browsers and modern standards, they’ve neglected XML. Their circa 1999 legacy capabilities for XML do not match modern standards and are quite deficient. Hopefully, this is something that will soon be rectified.

It’s not all smooth sailing. I have been finding a number of surprising issues with Google Chrome. For instance, whitespace management is a bit fudged at times. Chrome thinks nothing of adding the occasional non-breaking space to maintain whitespace when editing the DOM. What’s worse – it will inexplicably convert this into a text node that reads “&nbsp;” after a while. This is a character entity that is not defined in XML. I have to work hard to constantly reverse this odd behavior.

All in all, I’m excited by this new approach to building a web-based XML editor. It’s a substantial increase in sophistication over our prior web-based XML editor. This editor will be far more robust, scalable, and configurable in comparison to our prior editor and other editors we have worked on. While we still have a way to go in our development, we’ve found solutions to all the risky issues. It’s a future-looking approach – support can only get better. It doesn’t rely on compatibility modes or any other remnants of prior eras in web technology. This approach is really working out quite nicely for us.


Legal Citations and XML Editing for Legislation

It’s been quite some time since my last blog post – almost six months. The reason is that I’ve been very busy. We are doing a lot of exciting development within Xcential. We are developing a number of quite challenging projects around the globe.

If you’ve been following my blog, you may remember that I was working on an HTML5-based XML editor. That development was two years ago now. We’ve come a long way since then. The basic editor has been stripped down, componentized, and is being rebuilt to be a far more robust, scalable, and adaptable solution. There are more details below, which I will expand upon as the editor rolls out over the next year.

Legal Citations

It has been almost a year since the last Legislative Data and Transparency Conference in Washington D.C. (the next one is coming up). At that time, I spoke about the need for improved citation management in published XML documents. Well, we’ve come a long way since then. Earlier this year a Technical Committee was formed within OASIS to begin developing some standards. The Legal Citation Markup Technical Committee is now hard at work defining markup models for legal citations. I am a member of that TC.

The reference management part of our HTML5-based editor has been separated out as a separate project – as a citation interpreter and reference resolver. In our development tests, it’s integrated with eXist as a local repository. We also source documents from external sources such as LII.

We now have a few citation management projects underway, using our resolver technology. These are exciting projects which will be a huge step forward in improving how citations are managed. It’s premature to talk about this in any detail, so I’ll just leave this as a teaser of stuff to come.

XML Editing for Legislation

The OASIS Legal Document ML Technical Committee is getting ready to make a large announcement. While this progress is being made, at Xcential we’ve been hard at work refining the state-of-the-art in XML editing.

If you recall the HTML5-based editor for Akoma Ntoso from a couple of years back, you may remember that it was based around all the new HTML5 technologies that have recently been incorporated into web browsers. We learned a lot from that effort – both good and bad. While we were able to get a reasonable tagging editor, using facilities that made editing far easier, we still faced difficulties when it came to basic XML editing and scalability.

So, we’ve taken a more ambitious approach to produce a very generalized XML editing platform. Using what we learned as the basis, our new editor is far more capable. Rather than relying on the mapping of XML into an equivalent HTML5 structure, we now directly use the XML facilities that are built into the browser. This approach is both far more robust and far more scalable. But the most exciting aspect is change tracking. We’re building change tracking directly into the basic editing engine – from the outset. This means that we can track all changes – whether the changes are in the text or in the structure. With all browsers now correctly implementing the standardized DOM Range model, our change tracking model has to be very sophisticated. While it’s hellishly complex, my experience in implementing change tracking technologies over many years is really coming in handy.
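
For a sense of why the standardized DOM Range model matters here, consider what a selection in the browser actually reports:

    const selection = window.getSelection();
    if (selection.rangeCount > 0) {
      const range = selection.getRangeAt(0);
      // The two endpoints may sit in entirely different elements; a tracked
      // deletion must record both, plus every node the range crosses.
      console.log(range.startContainer, range.startOffset);
      console.log(range.endContainer, range.endOffset);
      const removed = range.cloneContents(); // what a tracked change must preserve
    }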

If you’ve used change tracking in XMetaL, you know the limitations of their technology. XMetaL’s range selection constrains how you can select, which limits the flexibility of deletion. This simplifies the problem for the XMetaL customizer, but at a serious usability price. It’s one of the biggest limiting factors of XMetaL. We’re dealing with this problem once and for all with our new approach – providing a great way to implement legislative redlining.

Take a look at the totally contrived example on the left. It’s admittedly not a real example; it comes from my stress testing of the change tracking facilities. But look at what it does. The red text is a complex deletion that spans elements with little regard to the structure. In our editor, this was done with a single “delete” operation. Try to do this with XMetaL – it takes many operations and is a real struggle – even with change tracking turned off. In fact, even Microsoft Word’s handling of this is less than satisfactory, especially in more recent versions. Behind the scenes, the editor is using the model, derived from the schema, to control this deletion process to ensure that a valid document is the result.

If you’re particularly familiar with XMetaL, you will notice something else too. That deletion cuts through the structure of a table!!!! XMetaL can only track changes within the text of table cells, not the structure. We’re making great strides towards proper legislative redlining technologies, and we are excited to work with our partners and clients to put them into practice.


Web-Based XML Legislative Editor Update

It’s been quite a while since I gave an update on our web-based XML legislative editor – LegisProweb. But that doesn’t mean that nothing has been going on. Quite the contrary, this has been a busy year for the editor project.

Let me first recap what the editor is. It’s an XML editor, written entirely around HTML5 technologies. It was first developed last year as the centerpiece to a Hackathon that Ari Hershowitz and I staged in San Francisco and around the world. While it is designed as a general purpose XML editor and can be configured to model any XML schema, it’s primarily configured to support Akoma Ntoso.

Since then, there has been a lot of continuing interest in the editor. If you attended the 2013 Legislative Data and Transparency Conference this past May in Washington DC, you may have noticed Jim Harper of the Cato Institute demonstrating their “Deepbills” project. The editor you saw is a heavily customized early version of LegisProweb, reconfigured to handle the XML format that the US Congress publishes legislation in.

And that’s not the only place where LegisProweb has been adopted. We’re in the finishing stages of a somewhat larger implementation we did for Chile. This is an Akoma Ntoso implementation – focused on debates and debate reports rather than on legislation. One interesting point worth noting – this implementation is done in Spanish. LegisProweb is quite easily localized.

The common thread between these two implementations is the use case – they’re both implementations focused on tagging metadata within pre-existing documents rather than on creating new documents from scratch. This was the focus of the Hackathon we staged back in 2012 – little did we know how much of a market would exist for an editor focused on annotation rather than document creation. And there’s more still to come – we’ve been quite surprised at the level of interest in this particular use case.

Of course, we’re not satisfied with an editor that can only annotate existing documents. We’ve been hard at work turning the editor into a full-featured legislative editor that works equally well at creating new documents as it does at annotating existing documents. In addition, we’ve made the editor very customizable, as well as adding capabilities to manage the comments and discussions that might revolve around a document as it is being created and annotated.

Most recently, the editor has been upgraded to the latest version of Akoma Ntoso coming out of the OASIS legal document ML technical committee where I am an active member. Along with that effort, the validator has been separated to run as a standalone Akoma Ntoso validator. I talked about that in my blog last week. I’m busy using the validator as I work frantically to complete an Akoma Ntoso project I am working on this week. I’ll talk some more about this project next week.

So where do we go from here? Well, the first big effort is to modularize the technologies found within the editor. We now have a diverse set of customers and they can all benefit from the various bits and pieces that make up LegisProweb. By modularizing the pieces, we’ll be able to pick and choose which parts we use when and how. Separating out the validator was the first step. We’ll also be pulling out the reference resolver, attaching it to a native XML database, and partitioning out the client-side to allow the editing component to be used without the full editing environment offered by LegisProweb.

One challenge that remains is handling redlining – managing insertions and deletions. This is a very difficult subject – and one I tackled in the work I did implementing the XML editor used by the California legislature. I took a very different approach in trying to solve the problem with LegisProweb, but I’m not happy with the result. So, I’ll be returning to the proven approach we used way back when we built the original LegisPro editor on XMetaL.

As you can tell, we’ve got our work for the next year cut out for us.


Free Akoma Ntoso Validator

How are people doing with the Library of Congress’ Akoma Ntoso Challenge? Hopefully, you’re making good progress, having fun doing it, and in so doing, learning a valuable new skill with this important emerging technology.

I decided to make it easy for someone without an XML Editor to validate their Akoma Ntoso documents for free. We all know how expensive XML Editors tend to be. If you’re like me, you’ve used up all the free trials you could get. I’ve separated the validation part of our LegisProweb editor from the editing base to allow it to be used as a standalone validator. Now, all you need to do is either provide a URL to your document or, even easier, drop the text into the text area provided and then click on the “Validate” button. You don’t even need to go find a copy of the Akoma Ntoso schema or figure out how to hook it up to your document – I do all that for you.

To use the validator, simply draft your Akoma Ntoso XML document, specifying the appropriate namespace using the @xmlns namespace declaration, and then paste a copy into the validator. I’ll go and find the schema and then validate your document for you. The validation results will be shown to you conveniently inline within your XML source to help you in making fixes. Don’t worry, we don’t record anything when you use the validator – it’s completely anonymous and we keep no record of your document.
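
For the curious, picking the schema from the namespace is conceptually as simple as this sketch (the namespace-to-schema mapping is illustrative):

    // Look up the schema from the root element's namespace declaration.
    const schemaByNamespace = {
      'http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03': 'akomantoso30.xsd',
      'http://www.akomantoso.org/2.0': 'akomantoso20.xsd'
    };

    const text = '<akomaNtoso xmlns="http://www.akomantoso.org/2.0">...</akomaNtoso>';
    const doc = new DOMParser().parseFromString(text, 'application/xml');
    const schema = schemaByNamespace[doc.documentElement.namespaceURI];
    // -> 'akomantoso20.xsd'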

You can validate either the 2.0 version of Akoma Ntoso or the latest 3.0 version which reflects the work of the OASIS LegalDocumentML committee. Actually, there are quite a few other formats that the validator will also work with innately and, by using xsi:schemaLocation, you can point to any XML schema you wish.

Give the free Akoma Ntoso XML Validator a try. You can access it here. Please send me any feedback you might have.


XML, HTML, JSON – Choosing the Right Format for Legislative Text

I find I’m often talking about an information model and XML as if they’re the same thing. However, there is no reason to tie these two things together as one. Instead, we should look at the information model in terms of the information it represents and let the manner in which we express that information be a separate concern. In the last few weeks I have found myself discussing alternative forms of representing legislative information with three people – chatting with Eric Mill at the Sunlight Foundation about HTML microformats (look for a blog from him on this topic soon), Daniel Bennett regarding microdata, and Ari Hershowitz regarding JSON.

I thought I would try and open up a discussion on this topic by shedding some light on it. If we can strip away the discussion of the information model and instead focus on the representation, perhaps we can agree on which formats are better for which applications. Is a format a good storage format, a good transport format, a good analysis/programming format, or a good all-around format?

1) XML:

I’ll start with a simple example of a bill section using Akoma Ntoso:

<section xmlns="http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03" 
       id="{GUID}" evolvingId="s1">
    <num>§1.</num>
    <heading>Commencement</heading>
    <content> <p>This act will go into effect on 
       <date name="effectiveDate" date="2013-01-01">January 1, 2013</date>. 
    </p> </content>
</section> 

Of course, I am partial to XML. It’s a good all-around format. It’s clear, concise, and well supported. It works well as a good storage format, a good transport format, as well as being a good format of analysis and other uses. But it does bring with it a lot of complexity that is quite unnecessary for many uses.

2) HTML as Plain Text

For developers looking to parse out legislative text, plain text embedded in HTML using a <pre> element has long been the most useful format.

   <pre>
   §1. Commencement
   This act will go into effect on January 1, 2013.
   </pre>

It is a simple and flexible representation. Even when an HTML representation is provided that is more highly decorated, I have invariably removed the decorations to leave behind this format.

However, in recent years, as governments open up their internal XML formats as part of their transparency initiatives, it’s becoming less necessary to write your own parsers. Still, raw text is a very useful base format.

3) HTML/HTML5 using microformats:

<div class="section" id="{GUID}" data-evolvingId="s1">
   <div>
      <span class="num">§1.</span> 
      <span class="heading">Commencement</span>
   </div>
   <div class="content"><p>This act will go into effect on 
   <time name="effectiveDate" datetime="2013-01-01">January 1, 2013</time>. 
   </p></div>
</div>

As you can see, using HTML with microformats is a simple way of mapping XML into HTML. Currently, many legislative data sources that offer HTML content either offer bill text as plain text as I showed in the previous example or they decorate it in a way that masks much of the semantic meaning. This is largely because web developers are building the output to an appearance specification rather than to an information specification. The result is class names that better describe the appearance of the text than the underlying semantics. Using microformats preserves much of the semantic meaning through the use of the class attribute and other key attributes.
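
The payoff is that the semantics can be queried right back out of the HTML. Assuming the fragment above is loaded in a page:

    const section = document.querySelector('div.section');
    const num = section.querySelector('.num').textContent;         // '§1.'
    const heading = section.querySelector('.heading').textContent; // 'Commencement'
    const effective = section.querySelector('time').getAttribute('datetime');
    // -> '2013-01-01'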

I personally think that using HTML with microformats is a good way to transport legislative data to consumers that don’t need the full capabilities of the XML representation and are more interested in presenting the data rather than analyzing or processing it. A simple transform could be used to take the stored XML and to then translate it into this form for delivery to a requestor seeking an easy-to-consume solution.

[Note: HTML5 now offers a <section> element as well as an <article> element. However, they’re not a perfect match to the legislative semantics of a section and an article so I prefer not to use them.]

4) HTML5 Microdata:

<div itemscope 
      itemtype="http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#section" 
      itemid="urn:xcential:guid:{GUID}">
   <data itemprop="evolvingId" value="s1"/>
   <div>
      <span itemprop="num">§1.</span>
      <span itemprop="heading">Commencement</span>
   </div>
   <div itemprop="content"> <p>This act will go into effect on 
      <time itemprop="effectiveDate" datetime="2013-01-01">January 1, 2013</time>.
   </p> </div>
</div>

Using microdata, we see more formalization of the annotation convention than microformats offers – which brings along additional complexity and requires some sort of naming authority, something I can’t say I either really understand or see how it will come about. But it’s a more formalized approach and is part of the HTML5 umbrella. I doubt that microdata is a good way to represent a full document. Rather, I see microdata better fitting into the role of annotating specific parts of a document with metadata. Much like microformats, microdata is a good solution as a transport format to a consumer not interested in dealing with the full XML representation. The result is a format that is rich in semantic information and is also easily rendered to the user. However, it strikes me that the effort to more robustly handle namespaces only reinvents one of XML’s more confusing aspects, namely namespaces, in just a different way.

5) JSON

{
   "type": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#section",
   "id": "{GUID}",
   "evolvingId": "s1",
   "num": {
      "type": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#num",
      "text": "§1."
   },
   "heading": {
      "type": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#heading",
      "text": "Commencement"
   },
   "content": {
      "type": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#content",
      "text1": "This act will go into effect on ",
      "date": {
         "type": "http://docs.oasis-open.org/legaldocml/ns/akn/3.0/CSD03#date",
         "date": "2013-01-01",
         "text": "January 1, 2013"
      },
      "text2": "."
   }
}

Quite obviously, JSON is great if you’re looking to easily load the information into your programmatic data structures and aren’t looking to present the information as-is to the user. This is primarily a programmatic format. Representing the full document in JSON might be overkill. Perhaps the role of JSON is for key parts of extracted metadata rather than the full document.

There are still other formats I could have brought up, like RDFa, but I think my point has been made. There are many different ways of representing the same legislative model – each with its own strengths and weaknesses. Different consumers have different needs. While XML is a good all-around format, it also brings with it some degree of sophistication and complexity that many information consumers simply don’t need to tackle. It should be possible, as a consumer, to specify the form of the information that most closely fits my needs and have the legislative data source deliver it to me in that format.

[Note: In Akoma Ntoso, the format is called the “manifestation” and is specified as part of the referencing specification.]

What do you think?
