
Lawsuit Update and a Tale about Bicycles

In the past couple of weeks, the first two court rulings have come out concerning our battle with the Akin Gump law firm. Both rulings have been in our favor. The first ruling denied Akin Gump’s motion to dismiss, allowing four of the five claims in our countersuit to proceed – with the judge calling them “plausible”. The second ruling denied Akin Gump’s attempt at a preliminary injunction to stop us from responding to actions from the U.S. Patent Office. The judge found that “there is not a substantial likelihood that (Akin Gump) will prevail.” She then added, “(T)o preclude Xcential from moving forward…would discourage invention, it would discourage innovation, it would discourage companies from investing their own resources to try to come up with workable solutions to commonly identified problems.”

After reading the transcript, I have an interesting analogy to make, based on one that the Akin Gump attorney himself chose to use – comparing our dispute to the invention of a bicycle. While not perfect, it is still a very good analogy.

Imagine you have an idea for a two-wheeled mode of transportation that will help you do your job more effectively. You discover that this idea is called a bicycle and that there are several bicycle manufacturers already – so you approach one to see if their product can do the job. While this company did not invent the bicycle, they specialize in making them and have been doing so for many years. This company only makes bicycles. They are not the manufacturer of the commodity materials that make up a bicycle, such as the metal tubing, or even of the tools that bend the tubes to make handlebars. While not a household name, they are very well known among bicycle enthusiasts around the world.

However, you discover that there is a problem. In the highly regulated world of bicycles (maybe automobiles would have been a better analogy), this bicycle maker doesn’t have a bicycle that conforms to your local regulatory market. You have a quick 30-minute call with the bicycle maker, and they say they’re familiar with the regulations in your market, but that making the changes your local regulations require and getting those changes certified is a costly business and is only done for customers willing to help foot the bill. You indicate that you’re still interested and accept their suggestion that they build a prototype, at no charge, of the bicycle that is suitable for your regulatory market. You send them a hand drawn map of the routes you want to ride the bicycle on so that they can understand the regulatory concerns that might apply.

With the suggestion of funding, the bicycle maker goes off and starts building a demonstrable prototype of a modified bicycle model that conforms to your local regulatory requirements. As a potential customer for this localized version of the bicycle, you get limited updates from the salesperson you were working with to ensure that you’re still interested and to indicate that the prototype hasn’t been forgotten. He makes a point of buttering you up, as salespeople are wont to do. You never interact with the engineers building the prototype at all.

It turns out that the regulations require mudguards over the wheels and reflectors on the front and rear. In the process of attaching these parts, the engineers at the bicycle company come up with some nifty brackets for attaching the mudguards and reflectors to the bicycle frame in accordance with the regulatory requirements. While the mudguards and reflectors are commodities, the brackets used to attach them are novel, so the engineers apply for patent protection for these brackets. You play no role in the design or manufacture of these brackets.

When the bicycle maker brings their modified bicycle to your office to show you what can be done, you’re wowed by the result, but don’t really have the budget to help cover the cost of getting the changes certified. The bicycle company shelves the project without a paying customer.

Later on, while pondering whether to patent your idea for a bicycle, you come across the bicycle maker’s patent application. Without much understanding of how bicycles are made, you conflate the general idea for a bicycle (it was invented decades earlier and is prior art at this point), a bicycle that is adapted to your regulatory market (not something that is patentable), and the nifty brackets that hold the reflectors and mudguards on the frame and are necessary to achieve regulatory compliance in your local market (which you played no part in designing or manufacturing – but which are patentable).


Xcential in Litigation with Akin Gump Law Firm

As many of you know by now, Xcential and I have been placed in an unfortunate position of having to deal with litigation with a large Washington D.C. lobbying firm, Akin Gump. We have been getting a lot of press, and I feel it is my duty to explain the situation as best I can.

Back in late 2018, Xcential was approached by an attorney at Akin Gump interested in applying our bill drafting and amending application, LegisPro, to improve the process of drafting and amending federal legislation. Initially, he was interested in using LegisPro to generate a bill amendment. This eventually evolved into an investigation into whether LegisPro could generate a federal amending bill from an in-context marked up copy of the law itself. The Akin Gump attorney expressed frustration at having to type, in narrative format, the proposed changes to a federal bill, and sought a simpler solution.

Amending law in context was a use case for LegisPro’s amendment generating capabilities that we had long anticipated. I had even written a blog exploring the idea of amendments-in-context in 2013, which got a lot of coverage on Chris Dorobek’s podcast and at GovLoop. I have long made a habit of reporting on developments in the legal informatics industry and the work I do at my Legix.info blog, including this blog post from April 2018, which describes LegisPro’s feature set at that time.

First, a little explanation about how LegisPro is configured to work. As there is no universal way to draft and amend legislation, we must build a custom document model that configures LegisPro for each jurisdiction we work with. Legislation, particularly at the federal level, is complex, so this is a substantial task. The customer usually pays for this effort. For the federal government, we did have document models for parts of LegisPro, but they were specific to different use cases and belonged to the federal government, not Xcential. We were not entitled to use these configurations, or any part of them, outside the federal government. This means that, out-of-the-box, LegisPro was not tailored for federal legislation in a way that we could share with Akin Gump. We would have to build a new custom document model to configure LegisPro to draft federal legislation for them.

Once the attorney had had an opportunity to try out a trial version of LegisPro using an account we had provided, we had a meeting during May of 2019. To provide clarity to the conversation, as our terminology and his were different, I introduced the terms amending-in-full, cut-and-bite amendments, and amendments-in-context to the Akin Gump attorney. This is the terminology we use in the legal informatics industry to describe these concepts. Seeing the attorney’s enthusiasm for addressing the problem and being convinced this was a true sales opportunity, I said I would find the time to build a small proof of concept to show how LegisPro’s existing cut-and-bite amendment generator would be configured to generate federal-style amendments. I had previously arranged to have a partial conversion of a part of the U.S. Code done by a contractor to support Akin Gump’s trial usage of LegisPro. In the months that followed, and using this U.S. Code data set, I set about configuring LegisPro to the task, much of it on my own personal time. Akin Gump did not cover the cost of any of this, including my time or the contractor’s fees.
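For readers unfamiliar with the terminology, here is a minimal sketch of the distinction, purely illustrative and not a description of how LegisPro actually works; the function and variable names are my own inventions for this post. A cut-and-bite amendment expresses discrete strike-and-insert instructions in narrative form, and those instructions can in principle be derived from an in-context edit of the provision itself:

```python
# Illustrative sketch only -- not LegisPro's implementation.
# A "cut-and-bite" amendment states discrete edits in narrative form;
# an "amendment-in-context" starts from a marked-up copy of the law itself.

import difflib

def cut_and_bite(citation: str, old_text: str, new_text: str) -> list[str]:
    """Derive narrative strike/insert instructions from an edited provision."""
    old_words, new_words = old_text.split(), new_text.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words)
    instructions = []
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        struck = " ".join(old_words[a1:a2])
        inserted = " ".join(new_words[b1:b2])
        if op == "replace":
            instructions.append(f'In {citation}, strike "{struck}" and insert "{inserted}".')
        elif op == "delete":
            instructions.append(f'In {citation}, strike "{struck}".')
        elif op == "insert":
            instructions.append(f'In {citation}, insert "{inserted}".')
    return instructions

print(cut_and_bite("section 101(a)",
                   "the Secretary may issue a permit",
                   "the Secretary shall issue a permit"))
# ['In section 101(a), strike "may" and insert "shall".']
```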

I flew to Washington D.C. for an August 29th, 2019 meeting in the attorney’s office to deliver the demonstration of a working application in person. He was duly impressed and kept exclaiming “Holy S###” over and over. We explained that this was just a proof of concept and that we could build out a complete system using a custom document model for federal legislation. He had already explained that the cost for a custom document model was probably out of reach for Akin Gump, so we should consider the implementation as an Xcential product rather than a custom application for Akin Gump. We considered this approach, and our offer to implement a solution was a very small percentage of what the real cost would be. We would need to find many more customers on K Street to cover the development cost of a non-government federal document model. He had explained that this approach would earn Xcential a “K Street parade,” a term we used to describe the potential project of building a federal document model to be sold on K Street.

As it turns out, even our modest offer of about $1,000 to $2,000 per seat (depending on various choices) and a range of $50,000 to $175,000 for a custom document model and other services was too much for Akin Gump, and they walked away. Our 2019 pricing sheet clearly mentioned that the per seat prices did not include the cost for document conversion or configuration/customization and that a custom document model might be required for an additional fee. Furthermore, we had discussed the need for this custom work on several occasions. Our offer was more than fair as building these systems typically runs into the millions of dollars. Despite considerable costs to us in terms of my time, the use of a contractor, and travel to Washington D.C., we were never under any obligation to deliver anything to Akin Gump, and Akin Gump never paid us anything.

In the process of configuring LegisPro to generate federal amending bills, I came up with some implementation changes to the core product which we felt were novel. They built on a mechanism we had already built for a different project and for which we had separately applied for a patent a year earlier. We went ahead and filed for a patent for those changes, describing the overall processing model using a term I coined called “bill synthesis,” echoing my experience with logic synthesis earlier in my career from which I had drawn inspiration.

Two and a half years later, we learned that Akin Gump had filed a complaint to assume ownership of our patent application. This made no sense at all as our patent application is very implementation specific to LegisPro’s inner workings. The Akin Gump attorney involved had played no part in the design or coding of these details. What he had done was to describe his frustrations with the federal bill drafting use case to us – something we had long been aware of. He was simply asking us for a solution to a problem that was widely known in the industry and previously known to us.

When I got to see the assertions that Akin Gump makes in its court filings, I was astonished by their breadth. Rather than technical details, the assertions are all high-level ideas, insisting that an idea for an innovation the attorney had conceived of in the summer of 2018 is the “proverbial ‘holy grail’” to the legislative drafting industry. However, this innovative idea, as described in the complaint, is an application that closely resembles LegisPro as he would have experienced it during his trial usage in early 2019, including descriptions of key services and user interface features that have long existed.

What does not get any mention in Akin Gump’s filings are the details of our implementation on which we have based our patent claims — around document assembly and change sets. This is not surprising. The attorney is not a software developer, and Akin Gump is not a software firm; it is a law firm. Neither he nor the firm has any qualifications in the realm of complex software development.

I can appreciate the attorney’s enthusiasm. When I came across the subject back in 2001, I was so enthusiastic that I started a company around it.

But for me, this is very hurtful. I worked hard, much of it on my own time over the summer, to build the proof of concept for Akin Gump. Yet I am portrayed in a most unflattering way. While I recall a cordial working arrangement throughout the effort, that is not how Akin Gump’s complaint reads. There is no appreciation for the complexity of writing, or even configuring, software to draft and amend legislation. The attorney forgets to mention that I succeeded in demonstrating a working application, in the form of a proof of concept, on August 29th in his office. There is no understanding of the deep expertise that I brought to the table. The value of the time that I had already spent on this project was likely worth more than the amount we asked from Akin Gump to deliver a solution.

I wake up many nights angry at what this has come to. We work hard to make our customers happy and to do so at a very fair price. We are a small company, based in San Diego, and having to defend ourselves against the accusations of a wealthy law firm is a costly and frustrating undertaking that distracts from our mission.

What is ironic is that, by taking a litigation route to claim a patent, Akin Gump has all but closed off any likelihood of ever having this capability. There are few software firms in the world capable of creating such a system. In the U.S., I know of only a couple, none of which has the experience and products that Xcential can bring to bear. For this solution to ever see the light of day at the federal level, it is going to take a substantial effort, with many trusted parties working collaboratively.

I once had a boss who would often say, “when all you have is a hammer, everything looks like a nail.” Just because you have a tool in your toolbox doesn’t mean using it is the right course of action; you should choose wisely. For a law firm, litigation is an easy tool to reach for. But is it the right tool?


Changing the way the world is governed. Together.

I’ve recently been marveling at how software development has changed in recent years. Our development processes are increasingly integrated with both our government customers and our commercial partners — using modern Agile methodologies. This largely fulfills a grand vision I was a part of very early in my career.

I started my career at the Boeing Company working on Internal Methods and Processes Development (IMPD). Very soon, a vision came about: Concurrent Engineering, where all aspects of the product development cycle, including all disciplines, all partners, and all customers, were tightly integrated in a harmonious flow of information. Of course, making the vision a reality at Boeing’s scale has taken some time. Early on, Boeing had great success on the B777 programme, where the slogan was “Working Together“. A bit later, with the B787 programme, where they went a few (or perhaps many) steps too far, they stumbled for a while. This was all Agile thinking — before there was anything called Agile.

Boeing’s concurrent engineering efforts quickly inspired one of Boeing’s primary CAD suppliers, Mentor Graphics. Mentor was hard at work on their second-generation platform of software tools for designing electronic systems, and Concurrent Engineering was a great customer-focused story to wrap around those efforts. Mentor’s perhaps arrogant tagline was “Changing the way the world designs. Together.” Inspired, I quickly joined Mentor Graphics as the product manager for data management. Soon I was to find that the magnitude of the development effort had actually turned the company sharply inward; it had become anything but Agile. Mentor’s struggle to build a product line that marketed Concurrent Engineering became the very antithesis of the concept it touted. I eventually left Mentor Graphics in frustration and drifted away from process automation.

Now, two decades later, a remarkable thing has happened. All those concepts we struggled with way back when have finally come of age. It has become the way we naturally work — and it is something called Agile. Our development processes are increasingly integrated with both our customers and our partners around the world. Time zones, while still a nuisance, have become far less of a barrier than they once were. Our rapid development cycles are quite transparent, with our customers and partners having almost complete visibility into our repositories and databases. Tools and services like GitHub, AWS, Slack, JIRA, and Trello allow us to coordinate the development of products shared among our customers with bespoke layers built on top by ourselves and our partners.


It’s always fashionable for political rhetoric to bash the inefficiencies of big government, but down in the trenches where real work gets done, it’s quite amazing to see how modern Agile techniques are being adopted by governments and the benefits that are being reaped.

As we at Xcential strive to become great, it’s important for us to look to the future with open eyes such that we can understand how to excel. In the new world, as walls have crumbled, global integration of people and processes has become the norm. To stay relevant, we must continue to adapt to these rapidly evolving methodologies.

Our vision is to change the way the world is governed — through the application of modern automation technology. One thing is very clear: we’re going to do it together with our customers and our partners all around the world. This is how you work in the modern era.

In my next blog post, I will delve a little more into how we have been applying Agile/Scrum and the lessons we have learned.


Connected Information

As a proponent of XML for legislation, I’m often asked why an XML approach is better than a more traditional approach using a word processor. The answer is simple – it’s all about connected information.

The digital end point in a legislative system can no longer be publication of PDFs. PDFs are nothing but a kludgy way to digitize paper — a way to preserve the old traditions and avoid the future. Try reading a PDF on a cell phone and you see the problem. Try clicking on a citation in a PDF and you see the problem. Try and scrape the information out of a PDF to make it computer readable and you see the problem. The only useful function that PDFs serve is as a bridge to the past.

The future is all about connected information — breaking the physical bounds of what we think of as a document and allowing the nuggets of information found within them to be connected, interrelated, and acted upon. This is the real reason why the future lies with XML and its related technologies.

In my blog last week I provided a brief glimpse into how our future amending tools will work. I explored how legislation could be managed similar to how software is managed with GitHub. This is an example of how useful connected information becomes. Rather than producing bills and amendments as paper documents, the information is stored in a way that it can be efficiently and accurately automated — and made available to the public in a computer readable way.

At Xcential, we’re building our new web-based authoring system — LegisPro. If you take a close look at it, you’ll see that it has two main components. Of course, there is a robust XML editor. However, at the system’s very heart is a linking system — something we call a resolver. It’s this resolver where the true power lies. It’s an HTTP-based system for managing all the linkages that exist in the system. It connects XML repositories, external data sources, and even SQL databases together to form a seamless universe of connected information.
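To give a rough sense of how such a resolver works, here is a minimal sketch. The URL pattern, the mapping, and the function names are my own illustrative assumptions for this post, not our resolver’s actual design or API:

```python
# Minimal sketch of a citation resolver -- illustrative only, not Xcential's
# actual resolver. The URL pattern and repository locations are hypothetical.

import re

# Logical citations map to physical storage, wherever that happens to live today.
REPOSITORY = {
    ("us", "usc", "t1", "s1"): "xmldb://uscode/title1.xml#s1",
}

def resolve(url: str) -> str:
    """Resolve a logical citation URL such as /us/usc/t1/s1 to a storage location."""
    match = re.fullmatch(r"/(\w+)/(\w+)/(\w+)/(\w+)", url)
    if not match:
        raise ValueError(f"unrecognized citation: {url}")
    key = match.groups()
    if key not in REPOSITORY:
        raise LookupError(f"no resource registered for {url}")
    return REPOSITORY[key]

print(resolve("/us/usc/t1/s1"))   # -> xmldb://uscode/title1.xml#s1
```

The point of the indirection is that documents can move between repositories and databases without breaking the references embedded in everything that cites them.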

We’re working hard to transform how legislation, and indeed all government information, is viewed. It’s not just about connecting laws and legislation together through simple web links. We’re talking about providing rich connections between all government information — tying financial data to laws and legislation, connecting regulatory information together, associating people, places, and things to government data, and on and on. We have barely started to scratch the surface, but it’s clear that the future lies with connected information.

While today we position LegisPro as a bill authoring system, it’s much more than that. It provides some of the fundamental underpinnings necessary to transform the government documents of today into the connected information of tomorrow.


Data Transparency Breakfast, LEX US Summer School 2015, First International Akoma Ntoso Conference, and LegisPro Edit reveal.

Last week was a very good week for my company, Xcential.

We started the week hosting a breakfast put on by the Data Transparency Coalition at the Booz Allen Hamilton facility in Washington D.C. The topic was Transforming Law and Regulation. Unfortunately, an issue at home kept me away, but I was able to make a brief pre-recorded presentation and my moderating role was played by Mark Stodder, our company President. Thank you, Mark!

Next up was the first U.S. edition of the LEX Summer School from Italy. I have attended this summer school in Italy every year since 2010, and it’s great to see the same opportunity for an open dialog amongst the legal informatics community finally come to the U.S. Monica Palmirani (@MonicaPalmirani), Fabio Vitali, and Luca Cervone (@lucacervone) put on the event from the University of Bologna. The teachers also included Jim Mangiafico (@mangiafico) (the LoC data challenge winner), Veronique Parisse (@VeroParisse) from the European Union, Andrew Weber (@atweber) from the Library of Congress, Kirsten Gullickson (@GullicksonK) from the Office of the Clerk at the U.S. House of Representatives, and myself from Xcential. I flew in for an abbreviated visit covering the last two days of the Summer School, where I showed how the U.S. Code is modeled in Akoma Ntoso and gave the students an opportunity to try out our new bill drafting editor — LegisPro Edit.

After the Summer School concluded, it was followed by the first International Akoma Ntoso Conference on Saturday, where I spoke about the architecture of our new editor as well as how the USLM schema is a derivative of the Akoma Ntoso schema. We had good turnout, from around the world, and a number of interesting speakers.

This week is NCSL in Seattle where we will be discussing our new editor with potential customers and partners. Mark Stodder from Xcential will be in attendance.

In a month, I’ll be in Ravenna once more for the European LEX Summer School — where I’ll be able to show even more progress towards the goal of a full product line of Akoma Ntoso tools. It’s interesting times for me.

The editor is coming along nicely and we’re beginning to firm up our QuickStarter beta plans. I’ve already received a number of requests and will be getting in touch with everyone as soon as we’re ready to roll out the program. If you would like to participate as a beta tester — or if you would just like more information, please contact us at info@xcential.com.

I’m really excited about how far we’ve come. Akoma Ntoso is on the verge of being certified as an official OASIS standard, our Akoma Ntoso products are coming into place, and interest around the world is growing. I can’t wait to see where we will be this time next year.


Achieving Five Star Open Data

A couple of weeks ago, I was in Ravenna, Italy at the LEX Summer School and the follow-on Developer’s Workshop. There, the topic of the semantic web came up a lot. Despite cooling coverage in the popular press in recent years, I’m still a big believer in the idea. The problem with the semantic web is that few people actually get it. At this point, it’s such an abstract idea that people invariably jump to the closest analog available today and mistake it for that.

Tim Berners-Lee (@timberners_lee), the inventor of the web and a big proponent of linked data, has suggested a five star deployment scheme for achieving open data — and what ultimately will be a semantic web. His chart can be thought of as a roadmap for how to get there.

Take a look at today’s Data.gov website. Everybody knows the problem with it — it’s a pretty wrapper around a dumping ground of open data. There are thousands and thousands of data sets available on a wide range of interesting topics. But, there is no unifying data model behind all these data dumps. Sometimes you’re directed to another pretty website that, while well-intentioned, hides the real information behind the decorations. Sometimes you can get a simple text file. If you’re lucky, you might even find the information in some more structured format such as a spreadsheet or XML file. Without any unifying model and with much of the data intended as downloads rather than as an information service, this is really still Tim’s first star of open data — even though some of the data is provided as spreadsheets or open data formats. It’s a good start, but there’s an awful long way to go.

So let’s imagine that a better solution is desired, providing information services, but keeping it all modest by using off-the-shelf technology that everyone is familiar with. Imagine that someone with the authority to do so, takes the initiative to mandate that henceforth, all government data will be produced as Excel spreadsheets. Every memo, report, regulation, piece of legislation, form that citizens fill out, and even the U.S. Code will be kept in Excel spreadsheets. Yes, you need to suspend disbelief to imagine this — the complications that would result would be incredibly tough to solve. But, imagine that all those hurdles were magically overcome.

What would it mean if all government information was stored as spreadsheets? What would be possible if all that information was available throughout the government in predictable and permanent locations? Let’s call the system that would result the Government Information Storehouse – a giant information repository for information regularized as Excel spreadsheets. (BTW, this would be the future of government publishing once paper and PDFs have become relics of the past.)

How would this information look? Think about a piece of legislation, for instance. Each section of the bill might be modeled as a single row in the spreadsheet. Every provision in that section would be its own spreadsheet cell (ignoring hierarchical considerations, etc.). Citations would turn into cell references or cell range references. Amending formulas, such as “Section 1234 of Title 10 is amended by…”, could be expressed as a literal formula — a spreadsheet formula. It would refer to the specific cell in the appropriate U.S. Code Title and contain programmatic instructions for how to perform the amendment. In short, lots of once complex operations could be automated very efficiently and very precisely. Having the power to turn all government information into a giant spreadsheet has a certain appeal — even if it requires quite a stretch of the imagination.
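To push the thought experiment one step further, here is a purely hypothetical sketch of what such an amending “formula” could look like. The workbook layout and operation names are invented for this post and aren’t drawn from any real system:

```python
# Thought experiment only: legislation as a spreadsheet, amendments as formulas.
# The "workbook" layout and the operation names are invented for illustration.

# Each title is a sheet; each provision occupies its own cell.
us_code = {
    ("Title 10", "Sec. 1234(a)"): "The Secretary may prescribe regulations.",
}

def amend(workbook, sheet, cell, operation, strike=None, insert=None):
    """Apply an amending 'formula', such as strike-and-insert, to a target cell."""
    text = workbook[(sheet, cell)]
    if operation == "strike_and_insert":
        text = text.replace(strike, insert, 1)
    elif operation == "strike":
        text = text.replace(strike, "", 1)
    workbook[(sheet, cell)] = text
    return text

# "Section 1234(a) of Title 10 is amended by striking 'may' and inserting 'shall'."
print(amend(us_code, "Title 10", "Sec. 1234(a)",
            "strike_and_insert", strike="may", insert="shall"))
# -> The Secretary shall prescribe regulations.
```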

Now imagine what it would mean if selected parts of this information were available to the public as these spreadsheets – in a regularized and permanent way — say Data.gov 2.0 or perhaps, more accurately, as Info.gov. Think of all the spreadsheet applications that would be built to tease out knowledge from the information that the government is providing through their information portal. Having the ability to programmatically monitor the government without having to resort to complex measures to extract the information would truly enable transparency.

At this point, the linkages and information services give us some of the attributes of Tim’s four and five star open data solutions, but our focus on spreadsheet technology has left us with a less than desirable two star system. Besides, we all know that having the government publish everything as Excel spreadsheets is absurd. Not everything fits conveniently into a spreadsheet table, to say nothing of the scalability problems that would result. I wouldn’t even want to try putting Title 42 of the U.S. Code into an Excel spreadsheet. So how do we really go about achieving this sort of open data and the efficiencies it enables — both inside and outside of government?

In order to realize true four and five star solutions, we need to quickly move on to fulfilling all the parts of Tim’s five star chart. In his chart, a three star solution replaces Excel spreadsheets with an open data format such as a comma separated file. I don’t actually care for this ordering because it sacrifices much to achieve the goal of having neutral file formats — so let’s move on to full four and five star solutions. To get there, we need to become proficient in the open standards that exist, and we must strive to create ones where they’re missing. That’s why we work so hard on the OASIS efforts to develop Akoma Ntoso and citations into standards for legal documents. And when we start producing real information services, we must ensure that the linkages in the information (those links and formulas I wrote about earlier) exist to the best extent possible. It shouldn’t be up to the consumer to figure out how a provision in a bill relates to a line item in some budget somewhere else — that linkage should be established from the get-go.

We’re working on a number of core pieces of technology to enable this vision and get to full five star open data. We’re integrating XML repositories and SQL databases into our architectures to give us the information storehouse I mentioned earlier. We’re building resolver technology that allows us to create and manage permanent linkages. These linkages can be as simple as citation references or as complex as instructions to extract from or make modifications to other information sources. Think of our resolver technology as akin to the engine in Excel that handles cell or range references, arithmetic formulas, and database lookups. And finally, we’re building editors that will resemble word processors in usage, but will allow complex sets of information to be authored and later modified. These editors will have many of the sophisticated capabilities, such as track changes, that you might see in a modern word processor, but underneath you will find a complex structured model rather than the ad hoc data structures of a word processor.
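As a small sketch of that last point (the element names and the accept/reject logic below are assumptions I’m making for illustration, not our editor’s actual markup), a tracked change stored in the structured model itself might look like this:

```python
# Illustrative sketch only: a tracked change stored as structure in the XML model.
# The <ins>/<del> element names here are assumptions, not LegisPro's actual markup.

import xml.etree.ElementTree as ET

marked_up = '<text>The Secretary <del>may</del><ins>shall</ins> prescribe regulations.</text>'

def flatten(elem, accept=True):
    """Linearize mixed content, either accepting or rejecting the tracked changes."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag in ("ins", "del"):
            keep = (child.tag == "ins") if accept else (child.tag == "del")
            if keep:
                parts.append(flatten(child, accept))
        else:
            parts.append(flatten(child, accept))
        parts.append(child.tail or "")   # text after the element is always kept
    return "".join(parts)

provision = ET.fromstring(marked_up)
print(flatten(provision, accept=True))    # The Secretary shall prescribe regulations.
print(flatten(provision, accept=False))   # The Secretary may prescribe regulations.
```

The appeal of this approach is that the change is ordinary structure: the same document can be searched, compared, and linked to whether or not the change has been accepted.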

Building truly open data is going to be a challenging but exciting journey. The solutions that are in place today are a very primitive first step. Many new standards and technologies still need to be developed. But, we’re well on our way.


Look how far legal informatics has come – in just a few years

Back in 2001 when I started in the legal informatics field, it seemed we were all alone. Certainly, we weren’t – there were many similar efforts underway around the country and around the world. But, we felt alone. All the efforts were working in isolation – making similar decisions and learning similar lessons. This was the state of the field, for the most part, for the next 6 to 8 years. Lots of isolated progress, but few opportunities to share what we had learned and build on what others knew.

In 2010, I visited the LEX Summer School, put on by the University of Bologna in Ravenna, Italy. What became apparent to me was just how isolated the various pockets of innovation had become around the world. There was lots of progress, especially in Europe, but legal informatics, as an industry, was still in a fledgling state – it was more of an academic field than a commercial industry. In fact, outside of academic circles, the term legal informatics was all but meaningless. When I wrote my first blog in 2011, I looked forward to the day when there might be a true Legal Informatics industry.

Now, just a few years later, it’s stunning how far we have come. Certainly, we still have far to travel, but now we’re all working together towards common goals rather than working alone towards the same, but isolated, goals. I thought I would spend this week’s blog to review just how far we have come.

  1. Working together
    We have come together in a number of important dimensions:

    • First of all, consider geography. This is a small field, but around the world we’re now all very much connected together. We routinely meet, share ideas, share lessons, and share expertise – no matter which continent we work and reside on.
    • Secondly, consider our viewpoints. There was once a real tension between the transparency camp, government, external industry, and academia. If you participated at the 2014 Legislative Data and Transparency conference a few weeks ago in Washington D.C., one of the striking things was how little tension remains between these various viewpoints. We’re all now working towards a common set of goals.
  2. Technology
    • I remember when we used to question whether XML was the right technology. The alternatives were to use Open Office or Microsoft Office, basing the legislative framework around office productivity tools. Others even proposed using relational database technology along with a forms-based interface. Those ideas have now generally faded away – XML is the clear choice. And faking XML by relying on the fact that the Open Document Format (ODF) or Office Open XML formats are based on XML, just isn’t credible anymore. XML means more than just relying on an internal file format that your tools happen to use – it means designing information models specifically to solve the challenges of legal informatics.
    • I remember when we used to debate how references should be managed. Should we use file paths? Should we use URNs? Should we use URLs? Today the answer is clear – we’re all settling around logical URLs with resolvers, sometimes federated, to stitch together a web of interconnected references. Along with this decision has been the basic assumption that web-based solutions are the future – desktop applications no longer have a place in a modern solution.
    • Consider database technology. We used to have three choices – use the file system, try and adapt mature but ill-fitting relational databases, or take a risk with emerging XML databases. Clearly XML databases were the future – but was it too early? Not anymore! XML database technology, along with XQuery, has come a long way in the past few years.
  3. Standards
    Standards are what will create an industry. Without them, there is little opportunity for re-use – a necessary part of allowing cost-effective products to be built. After a few false starts over the years, we’re now on the cusp of having industry standards to work with. The OASIS LegalDocML (Akoma Ntoso) and LegalCiteM technical committees are hard at work on developing those standards. Certainly, it will be a number of years before we see all the benefits of these standards, but as they come to fruition, a real industry can emerge.
  4. Driving Forces
    Ten years ago, the motivation for using XML was to replace outdated drafting systems, often cobbled together on obsolete mainframes, that had long outlived their usefulness. The needs were all internal. Now, that has all changed. The end result is no longer a paper document which can be ordered from the “Bill Room” found in the basement of the Capitol building. It’s often not even a PDF rendition of that document. The new end result is information which needs to be shared in a timely and open way in order to achieve the modern transparency objectives, like the DATA Act, that have been mandated. This change in expectations is going to revolutionize how the public works with their representatives to ensure fair and open government.
In the past dozen years, things sure have changed. Credit must be given to Monica Palmirani (@MonicaPalmirani) and Fabio Vitali at the University of Bologna – an awful lot of the progress pivots around their initiatives. However, we’ve all played a part in creating an open, creative, and cooperative environment for Legal Informatics to thrive as more than just an academic curiosity – as a true industry with many participants working collaboratively and competitively to innovate and solve the challenges ahead.


Imagining Government Data in the 21st Century

After the 2014 Legislative Data and Transparency conference, I came away both encouraged and a little worried. I’m encouraged by the vast amount of progress we have seen in the past year, but at the same time a little concerned by how disjointed some of the initiatives seem to be. I would rather see new mandates force existing systems to be rethought than cause additional systems to be created – which can get very costly over time. But, it’s all still the Wild Wild West of computing.

What I want to do with my blog this week is try to define what I believe transparency is all about:

  1. The data must be available. First and foremost, the most important thing is that the data be provided at the very least – somehow, anyhow.
  2. The data must be provided in such a way that it is accessible and understandable by the widest possible audience. This means providing data formats that can be read by ubiquitous tools and ensuring the coding necessary to support all types of readers, including those with disabilities.
  3. The data must be provided in such a way that it is easy for a computer to digest and analyze. This means using data formats that are easily parsed by a computer (not PDF, please!!!) and using data models that are comprehensible to the widest possible audience of data analysts. Data formats that are difficult to parse or complex to understand should be discouraged. A transparent data format should not limit the visibility of the data to only those with very specialized tools or expertise.
  4. The data provided must be useful. This means that the most important characteristics of the data must be described in ways that allow it to be interpreted by a computer without too much work. For instance, important entities described by the data should be marked in ways that are easily found and characterized – preferably using broadly accepted open standards.
  5. The data must be easy to find. This means that the location at which data resides should be predictable, understandable, permanent, and reliable. It should reflect the nature of the data rather than the implementation of the website serving the data. URLs should be designed rather than simply being fallout from the implementation.
  6. The data should be as raw as possible – but still comprehensible. This means that the data should have undergone as little processing as possible. The more that data is transformed, interpreted, or rearranged, the less like the original data it becomes. Processing data invariably damages its integrity – whether intentional or unintentional. There will always be some degree of healthy mistrust in data that has been over-processed.
  7. The data should be interactive. This means that it should be possible to search the data at its source – through both simple text search and more sophisticated data queries. It also means that whenever data is published, there should be an opportunity for the consumer to respond back – be it simple feedback, a formal request for change, or some other type of two way interaction.

How can this all be achieved for legislative data? This is the problem we are working to solve. We’re taking a holistic approach by designing data models that are both easy to understand and can be applied throughout the data life cycle. We’re striving to limit data transformations by designing our data models to present data in ways that are understandable to humans and computers alike. We are defining URL schemes that are well thought out and could last for as long as URLs are how we find data in the digital era. We’re defining database solutions that allow data to not only be downloaded, but also searched and queried in place. We’re building tools that will allow the data to not only be created but also interacted with later. And finally, we’re working with standards bodies such as the LegalDocML and LegalCiteM technical committees at OASIS to ensure well-thought-out worldwide standards such as Akoma Ntoso.

Take a look at Title 1 of the U.S. Code. If you’re using a reasonably modern web browser, you will notice that this data is very readable and understandable – it’s meant to be read by a human. Right click with the mouse and view the source. This is the USLM format that was released a year ago. If you’re familiar with the structure of the U.S. Code and you’re reasonably XML savvy, you should feel at ease with the data format. It’s meant to be understandable to both humans and to computer programs trying to analyze it. The objective here is to provide a single simple data model that is used from initial drafting all the way through publishing and beyond. Rather than transforming the XML into PDF and HTML forms, the XML format can be rendered into a readable form using Cascading Style Sheets (CSS). Modern XML repositories such as eXist allow documents such as this to be queried as easily as you would query a table in a relational database – using a query language called XQuery.
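To give a flavor of why that matters, here is a small, self-contained illustration. The XML below is only a simplified stand-in for USLM (real documents use the official schema and namespace), and I’m using plain XPath in Python rather than XQuery against a repository, just to keep the example runnable on its own:

```python
# Simplified illustration: pulling section identifiers and headings out of
# USLM-style XML with XPath, instead of scraping a PDF or HTML rendition.
# The sample below only approximates USLM; it is not the official schema.

from xml.etree import ElementTree as ET

sample = """
<main>
  <section identifier="/us/usc/t1/s1">
    <num value="1">&#167; 1.</num>
    <heading>Words denoting number, gender, and so forth</heading>
  </section>
  <section identifier="/us/usc/t1/s2">
    <num value="2">&#167; 2.</num>
    <heading>"County" as including "parish", and so forth</heading>
  </section>
</main>
"""

root = ET.fromstring(sample)
for section in root.findall(".//section"):
    print(section.get("identifier"), "-", section.findtext("heading"))
# /us/usc/t1/s1 - Words denoting number, gender, and so forth
# /us/usc/t1/s2 - "County" as including "parish", and so forth
```

An XML repository takes this a step further: the same kind of query runs server-side over the entire collection, so the data can be searched and analyzed in place rather than downloaded and scraped.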

This is what we are doing – within the umbrella of legislative data. It’s a start, but ultimately there is a need for a broader solution. My hope is that government agencies will be able to come together under a common vision for how our information should be created, published, and disseminated – in order to fulfill their evolving transparency mandates efficiently. As government agencies replace old systems with new systems, they should design around a common open framework for transparent data rather than building new systems in the exact same footprint as the old systems that they demolish. The digital era, and the transparency mandates that have come with it, demand new thinking far different from the thinking of the paper era, which is now drawing to a close. If this can be achieved, then true data transparency can be achieved.


What is Transparency?

I’ve been thinking a lot about transparency lately. The disappearance of Malaysian Airline Flight 370 (MH370) provided an interesting case to look at – and some important lessons. Releasing data which requires great expertise to decipher isn’t transparency.

My boss, when I worked on process research at the Boeing Company many years ago, used to drill into me the difference between information and data. To him, data was raw – and meaningless unless you knew how to interpret it. Information, on the other hand, had the meaning applied so you could understand it – information, to him, was meaningful.

Let’s recall some of the details of the MH370 incident. The plane disappeared without a trace – for reasons that remain a mystery. The only useful information, after radar contact was lost, was a series of pings received by Inmarsat’s satellite. Using some very clever mathematics involving Doppler shifts, Inmarsat was able to use that data to plot a course for the lost plane. That course was revealed to the world and the search progressed. However, when that course failed to turn up the missing plane, there were increasingly angry calls for more transparency from Inmarsat – to reveal the raw data. Inmarsat’s response was that they had released the information, in the form of a plotted course, to the public and to the appropriate authorities. However, they chose to withhold the underlying data, claiming it wouldn’t be useful. The demands persisted, primarily from the press and the victims’ families. Eventually Inmarsat gave in and agreed to release the data. With great excitement, the press reported this as “Breaking News”. Then, a bewildered look seemed to come across everyone and the story quickly faded away. Inmarsat had provided the transparency in the form it was demanded, releasing the raw data along with a brief overview and the relevant data highlighted, but it still wasn’t particularly useful. We’re still waiting to hear whether anyone will ever be able to find any new insights into whatever happened to MH370 using this data. Most likely though, that story has run its course – you simply need Inmarsat’s expertise to understand the data.

There is an important lesson to be learned – for better or worse. Raw data can be released, but without the tools and expertise necessary to interpret it, it’s meaningless. Is that transparency? Alternatively, raw data can be interpreted into meaningful information, but that opens up questions as to the honesty and accuracy of the interpretation. Is that transparency? It’s very easy to hide the facts in plain sight – by delivering them in a convoluted and indecipherable data format or by selectively interpreting them to tell an incomplete story. How do we manage transparency to achieve the objective of providing the public with an open, honest, and useful view of government activities?

Next week, I want to describe my vision for how government information should be made public. I want to tackle the conflicting needs of providing information that is both unfiltered yet comprehensible. While I don’t have the answers, I do want to start the process of clarifying what better transparency is really going to achieve.


2014 Legislative Data and Transparency Conference

Last week, I attended the 2014 Legislative Data and Transparency Conference in Washington D.C. This one-day conference was put on by the U.S. House of Representatives and was held at the U.S. Capitol.

The conference was quite gratifying for me. A good part of the program related to projects my company, Xcential, and I are working on. This included Ralph Seep’s talk on the new USLM format for the U.S. Code and the Phase 2 Codification System, Sandy Strokoff’s mention of Legislative Lookup and Linking and the new Amendment Impact Program (which was demonstrated by Harlan Yu (@harlanyu)), Daniel Bennett’s (@citizencontact) talk on the legal citation technical committee, and finally Monica Palmirani (@MonicaPalmirani) and Fabio Vitali, from the University of Bologna, who talked about Akoma Ntoso and LegalDocML. We’ve made an awful lot of progress over the past few years.

Other updates included:

  • Andrew Weber (@atweber) gave an update on the progress being made by the Congress.gov website over at the Library of Congress. While still in beta, this site has now essentially replaced the older Thomas site.
  • There was an update on modernization plans over at the Government Printing Office including increased reliance on XML technologies. While it is good to see the improvements planned, their plans didn’t seem to be well integrated with all the other initiatives underway. Perhaps this will be clearer when more details are revealed.
  • Kirsten Gullickson (@GullicksonK) gave an update on the developments over at the Office of the Clerk of the House.

One very interesting talk was the winner of the Library of Congress Data Challenge, Jim Mangiafico (@mangiafico), describing his work building a transform from the existing Bill XML into Akoma Ntoso.

After lunch, there was a series of flash talks on various topics.

Later in the afternoon, Ali Ahmad (@aliahmad) chaired a panel discussion/update on the DATA Act and its implications for the Legislative Branch. Perhaps it’s still too early to really understand what the effect of the DATA Act will be – it looks like it is going to have a slow rollout over the next few years.

Anne Washington chaired a panel discussion on bringing the benefits of paper retrieval to electronic records. This discussion centered on a familiar theme: making data useful to someone who is searching for it – so it can be found and, when it is found, used.

All in all, it was a very useful day spent at the nation’s Capitol.
