September 10, 2014

New Proposal Includes XBRL Exemption - and Major Setback for Open Data

UPDATE: On September 16, 2014, H.R. 5405 passed the House of Representatives by a vote of 320 to 102.

A major setback for open government data may be on the agenda for the U.S. House of Representatives.

Despite the opposition of the tech industry, Rep. Robert Hurt's proposal to direct the Securities and Exchange Commission (SEC) to stop collecting financial data from most public companies has been included as part of a new legislative package--a new bill introduced on Monday, Sept. 8, by Rep. Mike Fitzpatrick and a number of other Republican members.

The new bill, H.R. 5405, brings together ten previous bills into a single one. One of those ten is Rep. Hurt's previous proposal, included in the new bill verbatim. Judging from the urgency of the current House schedule, H.R. 5405 could see action by the House of Representatives as early as next week.

Nine out of the ten bills included in H.R. 5405 have already been approved, as stand-alone bills, by bipartisan majorities in either the Financial Services Committee or the full House. (The Financial Services Committee passed Rep. Hurt's original bill in March 2014.) So it seems clear that the backers of H.R. 5405 want to craft a bill that will pass the House easily, without serious opposition.

H.R. 5405's introduction conveys that the bill is non-controversial by stating three innocuous purposes:
To make technical corrections to the Dodd-Frank Wall Street Reform and Consumer Protection Act, to enhance the ability of small and emerging growth companies to access capital through public and private markets, to reduce regulatory burdens ...

But H.R. 5405, if approved by the House, introduced and passed in the Senate, and signed into law by President Obama, will dramatically restrict the availability of searchable corporate financial data to investors--and to the tech companies building investment tools.

Supporters of open data in financial regulatory reporting will remember that the SEC collects an open data version of each financial statement in the eXtensible Business Reporting Language (XBRL) structured data format, alongside the old-fashioned plain-text version, from every public company registered in the United States. Investors, markets, and the public can use the XBRL version of each financial statement to create a fully searchable data set of all U.S. public company financial statements. XBRL data supports free tools for investors. It is also used by infomediaries like Morningstar and Thomson Reuters to enrich the information they deliver to paying clients.

Rep. Hurt's proposal, now incorporated into H.R. 5405, would direct the SEC to exempt all public companies with revenues below $250 million--a majority of public companies--from the obligation to file an open data version. Supporters of the exemption claim that XBRL-formatted financial statements cost "tens of thousands of dollars" to create, but Financial Executive International found a median annual cost of $2,000 for small companies (page 19), and some providers offer XBRL preparation services at even lower prices.

Supporters of the exemption had one valid point last spring: at that time, the SEC had not taken any steps to ensure the quality of the XBRL filings. Without assurance that the open data versions of financial statements were reliable, investors were reluctant to use them, and relied on the plain-text versions instead. But last summer, after a year of advocacy from open data allies in Congress, the SEC took its first public steps toward enforcing better data quality. As quality improves, investors and the tech companies serving them will make more use of the open data financial statements.

The companies themselves will benefit, too. Open, structured data delivers information more efficiently to the markets, which makes it easier for smaller companies to find eager investors and brings down their capital costs.

H.R. 5405 would cut off such progress by forcing the SEC to use documents, not open data, to collect corporate financial information.

Fans of open data should make their opposition to this portion of H.R. 5405 known.

One way to do that is to contact your representative and contact House leadership.

Another way is to join our Coalition at Data Transparency 2014 later this month. Our open data policy conference will demonstrate broad public and industry support for the open data transformation--and prevent that transformation from being halted by measures like this one.

August 20, 2014

Guest post from Ari Hoffnung: Don't Lose Sleep Over $619 Billion

We're much obliged to Ari Hoffnung, Senior Adviser at Coalition member Socrata, Inc., for this blog post. Socrata helps public sector organizations improve transparency, citizen service, and data-driven decision-making. Ari is a national leader in promoting financial transparency and previously served as the New York City Deputy Comptroller for Budget & Public Affairs. He was also the driving force behind the award-winning Checkbook NYC website.

Earlier this month, a report from the Government Accountability Office (GAO) found that the USASpending.gov website was missing more money than the combined net worth of the top 10 people featured on this year’s Forbes Billionaires list.

In the words of the GAO:

"Although agencies generally reported required contract information, they did not properly report information on assistance awards (e.g., grants or loans), totaling approximately $619 billion in fiscal year 2012."

On one hand, $619 billion is a lot of money to be missing by anyone’s standards, including Bill Gates, Warren Buffett, and the Koch brothers. On the other hand, there are several reasons why I would not lose sleep over the GAO’s findings.

First and foremost, this report is a good reminder that we live in a democracy strong enough to have an independent watchdog like the GAO, with the freedom to investigate how the federal government spends taxpayer dollars. I'm not saying our democracy is perfect, but I feel fortunate to live in a country where the government's overall lack of financial transparency can be criticized by a government-funded entity.

Second, the report reinforces the shortcomings of the current website in the areas of data consistency and completeness that have been identified by advocates like those from the Sunlight Foundation’s project.

Finally, the biggest silver lining in this report is that many of the problems cited by the GAO will be addressed by the implementation of the recently passed DATA Act, which requires the standardization of spending data throughout the federal government.

As someone who has been working on financial transparency for the lion's share of the past five years, I think the federal government ought to consider following in the footsteps of the Big Apple and make payments data the first priority of a revamped website.

In New York City, when we first launched our Checkbook NYC financial transparency website, which tracks our $70 billion+ annual budget, we decided to start with payment data because, while most taxpayers may not understand the ins and outs of government budgets, virtually all taxpayers understand what it means to make payments.

Think about it. Anyone with a checkbook, debit card, or even just a few bucks in their pocket understands what it means when cash leaves their account (or pocket) – whether it's paying the rent by check, using a debit card to buy groceries, or spending a few dollars to buy a cup of coffee. 

That’s why the federal government ought to begin the herculean task of standardizing spending data throughout the entire federal government and publishing it on a new and improved version of the website with the sliver of data best understood by Americans – payments.

July 14, 2014

The SEC took a small - but significant! - step toward better corporate financial data. Here's why, and what it means.

In 2009, the Securities and Exchange Commission (SEC) began requiring U.S. public companies to submit an open data version, encoded in the eXtensible Business Reporting Language (XBRL) format, of each quarterly financial statement. 

The SEC collects the same information twice: once as an old-fashioned document and again as open data. Making matters worse, the agency systematically reviews the document version for potential errors and issues but doesn't apply the same quality control to the XBRL version.

Open data could bring transformation to our capital markets. The use of XBRL could help investors make better decisions faster; allow the agency to use analytics to find and stop fraud; and permit companies to automate disclosure processes that used to be manual. Because open data is easier and cheaper to analyze than plain-text documents, analysts should be able to use XBRL financial statements to expand their coverage, which means smaller companies should get more notice.
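
To illustrate why structured data is cheaper to analyze than a plain-text document, here is a minimal Python sketch. It reads a made-up, heavily simplified fragment written in the general style of an XBRL instance. The namespace URI, context name, and dollar values are all assumptions for illustration, and real filings are far more elaborate.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified fragment in the style of an XBRL instance.
# The namespace URI, context name, and values are made up for illustration.
SAMPLE = """<xbrl xmlns:us-gaap="http://example.org/us-gaap">
  <us-gaap:Revenues contextRef="FY2013" unitRef="USD">1250000</us-gaap:Revenues>
  <us-gaap:NetIncomeLoss contextRef="FY2013" unitRef="USD">-40000</us-gaap:NetIncomeLoss>
</xbrl>"""


def extract_facts(xml_text):
    """Return {(concept, context): value} for every tagged fact in the fragment."""
    root = ET.fromstring(xml_text)
    facts = {}
    for element in root:
        # ElementTree renders namespaced tags as "{uri}LocalName".
        concept = element.tag.split("}")[-1]
        facts[(concept, element.get("contextRef"))] = int(element.text)
    return facts


facts = extract_facts(SAMPLE)
print(facts[("Revenues", "FY2013")])
```

Because every number carries a label and a reporting period, a program can pull the same figure from thousands of filings without a person reading each one. That is the basis for the expanded analyst coverage described above.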

Unfortunately, because the SEC has not treated the open data version of each financial statement with the same care as the document version, all these benefits have remained theoretical.

As Calcbench and TagniFi have reported, the quality of the XBRL data set is so bad that investors and analysts have been reluctant to use it, which means there's not much of a market for the software tools needed for the transformation.

But last week, a change began. 

On Monday, the SEC's Division of Corporation Finance announced it had sent letters to certain companies whose XBRL financial statements had failed to include necessary data. The agency’s requirement for public companies to submit a structured data version of each financial statement, alongside the old-fashioned document version, for each financial quarter has been in place since 2009.

Coinciding with the SEC's action, four public companies announced corrections to previously filed open data versions of their financial statements. In the previous five years since the start of open data reporting, only one company had ever amended an XBRL financial statement.

Why did the SEC take this step toward better data quality? 

Because, one year ago, Congress started questioning why the agency had been so slow to embrace open data. Members of both parties have kept up the scrutiny ever since.
  • On July 17, 2013, at the urging of Rep. Mike Quigley (D-IL), the House Appropriations Committee asked the SEC to explain its plan to improve investors' access to corporate disclosures through accessible formats.
  • On September 10, 2013, at our Data Transparency 2013 policy conference, Chairman Darrell Issa (R-CA) of the House Oversight Committee announced his committee had sent the SEC a letter asking the agency to re-start its stalled transformation from disconnected documents into open data.
  • On April 1, 2014, in an Appropriations Committee hearing, Rep. Quigley asked SEC chair Mary Jo White to explain the agency's failure to enforce open data quality.
  • After an April 29, 2014, hearing of the House Financial Services Committee, Rep. Keith Ellison (D-MN) submitted questions for the record seeking similar answers.
Our Coalition has been working with these Congressional open data supporters, and others, to keep the questions coming.

Our campaign isn't just about the financial statements currently being filed in XBRL. The agency collects hundreds of forms from public companies, financial firms, mutual funds, and other regulated entities. Most of these forms are still filed as documents, not as open data. The documents are hard for investors to understand, difficult for analysts to translate, and expensive for the agency's staff to use. They create compliance challenges for the regulated entities.

As the SEC's investor advisory committee recommended last year, the securities disclosure system needs total transformation.

Last week's action is a positive step, but it is only a first step. The agency must treat the open data version of each financial statement with the same care that it applies to the old-fashioned document version. Ultimately, we hope the SEC will eliminate the current duplication and collect a single submission from public companies - one that is both human-readable and machine-readable.

The SEC should also re-start its stalled transformation from documents to open data by adopting data standards for all the information it collects under the securities laws.

June 14, 2014

FindTheBest makes U.S. contract data accessible--and previews the transformation the DATA Act might bring

This guest post by Nina Quattrocchi, a senior product associate at FindTheBest, explains how FindTheBest adds value to currently-available U.S. federal spending data--and how the DATA Act could deliver more accurate, more complete information for FindTheBest to publish for citizens' use. FindTheBest is a Startup Member of the Data Transparency Coalition.

With the DATA Act enacted just over a month ago, FindTheBest is anxious to see the transformation of federal spending data.

There have been previous pushes to bring transparency to government spending. In 2007, USASpending.gov was launched in response to the Federal Funding Accountability and Transparency Act’s (FFATA) requirement to create a website with free and searchable information on all Federal awards. Though USASpending.gov was a good start, it is difficult to navigate and challenging to make sense of. USASpending focuses on individual transactions, which makes it impossible to look at an entire contract, grant, or loan, since each of these often encompasses multiple transactions. These shortcomings result from the government's failure, so far, to adopt consistent data standards to identify awards, recipients, and programs. Additionally, USASpending.gov is incomplete. FFATA dealt with grant and contract data but ignored administrative spending, so the website doesn’t illustrate the full government spending lifecycle. USASpending.gov simply doesn’t bring full clarity to federal spending, which is where the DATA Act comes in. The new law requires government-wide data standards to make the whole structure of federal spending fully searchable. It also expands the scope of spending transparency to include administrative spending as well as grants and contracts.
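
To show what a consistent award identifier would make possible, here is a minimal Python sketch that rolls individual transactions up into whole awards. The field names (`award_id`, `recipient`, `obligation`) and the records themselves are hypothetical. They are not the actual USASpending schema.

```python
from collections import defaultdict

# Hypothetical transaction records; field names and values are illustrative only.
transactions = [
    {"award_id": "A-1", "recipient": "Acme Corp", "obligation": 100000.0},
    {"award_id": "A-1", "recipient": "Acme Corp", "obligation": -25000.0},
    {"award_id": "B-7", "recipient": "Widget LLC", "obligation": 40000.0},
]


def summarize_awards(rows):
    """Group transactions by award and total the obligations for each award."""
    totals = defaultdict(lambda: {"transactions": 0, "net_obligation": 0.0})
    for row in rows:
        award = totals[row["award_id"]]
        award["transactions"] += 1
        award["net_obligation"] += row["obligation"]
    return dict(totals)


awards = summarize_awards(transactions)
for award_id, summary in awards.items():
    print(award_id, summary)
```

The grouping itself is trivial. The hard part is having a reliable shared identifier to group on, which is what government-wide data standards are meant to supply.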

We're Starting Now

FindTheBest is eager to take advantage of the new data standards and broader scope to deliver more accurate, more detailed, and more complete federal spending information to Americans. But we're not waiting for the DATA Act to take effect to get started. Here's what's already happening.

FindTheBest is a research engine headquartered in Santa Barbara, California, that gives people detailed information on 2,000 topics so they can research with confidence. We recently created a product that included profiles for the more than 30 million registered companies in the United States. As we were working on this project, we realized that the interactions between companies and the government were often unclear. We decided that we could use the FindTheBest platform and data aggregation technology to shed light on these relationships. Since this realization, I have been focused on developing a suite of content that revolves around government spending: how much the U.S. government is spending, what it is buying, and who it is buying from. So far, we’ve built Government Contracts, Government Contractors, Open Grants, and Contract Opportunities. We’re currently in the process of building Government Grants & Loans and Agency Spending.

The entire suite will be built from government data. Right now, we’re using data from three government websites. My ultimate goal is to detail the full lifecycle of government spending — from taxpayer dollars, Congressional appropriation, Treasury allocation and agency obligation to payout. This is impossible with the current data landscape, but the DATA Act will help by improving current data and making additional data available.

Digging into Government Data

USASpending.gov, currently the main government spending data resource, is well-intentioned, but it still fails to be a clean and publicly accessible source of data on government expenditures. A site like FindTheBest is needed to truly understand the information, but it’s not always easy to work with government data. There are three main issues that we’ve run into with government data, spending-related and otherwise:
  1. The data is messy. The database includes errant numbers or characters at the beginning and end of words, and contractor names and cities are often cut short. We’ve done our best to clean up the data, but with 35 million transactions, it’s hard to catch every mistake.
  2. A lot of the data is incorrect. We’ve found many contracts with incorrect transaction dates, which results in listings like this Federal Prison System contract that states a completion date of 5008, indicating that the transaction spans more than 3,000 years. Additionally, we’ve found that much of the pricing information on USASpending.gov is out of date and incorrect. Data for current contract value and ultimate contract value are often neglected or misstated because they’re not used as often as the obligation amount to value the contract. In our government contracts topic, we make sure to explain the reason why the current or ultimate values are wrongly stated as $0.
  3. USASpending.gov doesn’t include what we consider the most important data point — the outlay, which is the actual amount paid by the government to the contractor or grantee. This amount is crucial to government spending transparency. This data is collected but it’s not displayed on the site. We filed a Freedom of Information Act (FOIA) request to obtain the information so we can add it to our government contract content but haven’t had any success.
There are some upsides to working with government data. Most importantly, it’s free. Additionally, while it may seem that I’m quick to criticize, working with their development team has been great. They are quick to respond to technological and data issues concerning their site. When I reported an error I found in their API, they fixed it the next day. 
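
The first two problems listed above (stray characters, implausible dates, and $0 values) can be screened for with simple checks. The Python sketch below is an illustration only, not FindTheBest's actual pipeline. The record fields and the year-2100 cutoff are assumptions.

```python
import re
from datetime import date


def clean_name(raw):
    """Strip stray non-letter characters from both ends and collapse spaces.

    Crude on purpose: it would also mangle legitimate names that begin
    with a digit, so a real cleanup would need an exceptions list.
    """
    trimmed = re.sub(r"^[^A-Za-z]+|[^A-Za-z.]+$", "", raw)
    return re.sub(r"\s+", " ", trimmed)


def flag_record(record, latest_plausible_year=2100):
    """Return a list of data-quality warnings for one contract record."""
    issues = []
    if record["completion_date"].year > latest_plausible_year:
        issues.append("implausible completion date")
    if record["current_value"] == 0 and record["obligation"] != 0:
        issues.append("current value is $0 despite a nonzero obligation")
    return issues


# Hypothetical record with the kinds of errors described above.
record = {
    "contractor": "  7#ACME  SERVICES INC.*9",
    "completion_date": date(5008, 1, 1),
    "current_value": 0,
    "obligation": 5000,
}
print(clean_name(record["contractor"]))
print(flag_record(record))
```

Checks like these only flag suspect records. Deciding what the correct value should be still requires going back to the source.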

Looking Ahead

Even with the constraints of working with limited and error-ridden government data, we’re developing a suite of government spending data that we’re proud of. We’ve worked hard to explain relevant data points, appropriately cite the source and flag examples where the data might contain errors. At the same time, we make sure all of our content is constantly being updated to provide users with access to the best information. The passage of the DATA Act will allow us to build even better, more accurate, and more complete government spending resources for our users. For now, we’ll continue molding data from these government sources into digestible content that allows anyone to make sense of government spending.

May 28, 2014

Let's Fix the SEC's Open Corporate Financial Data--Not Eliminate Most of It

UPDATE: On September 8, 2014, Rep. Mike Fitzpatrick introduced a bill that included the same language exempting most companies from submitting financial statements to the SEC in XBRL. Rep. Fitzpatrick's bill, H.R. 5405, is a package bringing together several changes to U.S. securities laws that previously received committee approval. Details: New Proposal Includes XBRL Exemption - and Major Setback for Open Data

Today the Coalition issued a letter calling on the House Financial Services Committee to direct the Securities and Exchange Commission (SEC) to fully enforce the quality of the financial disclosure data it collects, rather than eliminate open data reporting for most public companies. The joint letter follows the Committee's March 14 approval of H.R. 4164, which would exempt companies under $250 million in annual revenue from existing requirements to file financial statements as machine-readable open data.

H.R. 4164 is based on a flawed diagnosis that blames open data tools for the symptoms of poor data quality. Instead of decimating this critical data set, Congress should direct the SEC to stop accepting inaccurate submissions, so the data set becomes useful. Once the data set can be analyzed without costly corrections, analysts will be able to expand their coverage, and smaller public companies will get more attention. Our capital markets want quality data -- not less of it.

Under H.R. 4164, the SEC would stop requiring financial reports formatted in the eXtensible Business Reporting Language (XBRL) from companies under $250 million in annual revenue, regressing to a document-based system. The exemption would remove about 60% of publicly traded companies from existing open data reporting requirements.

Today's letter asks the House Financial Services Committee to modify the proposal to direct the SEC to enforce data quality. If the SEC delivered more reliable data, companies would be able to benefit from expanded and more cost-effective coverage.

May 23, 2014

Enigma Uses Disparate Data to Decipher Corporate Relationships

Analysts, journalists, and citizens seeking to use government records to trace a company's activities face a daunting task.

Since the U.S. federal and state governments don't use common identifiers, researchers often expend considerable time and resources to identify information reported by the same company to different government agencies.

That's where Enigma comes in. Enigma, one of the Coalition's newest members, has pioneered a novel way to illuminate relationships between companies.

"We're trying to piece this puzzle together out of currently available bits and fragments," says Enigma's founder and CEO Marc DaCosta. "We have to operate in creative ways to bring these disparate data sets together to produce new insights."

Enigma scrapes a wide variety of federal and state level government websites to glean such fragments. Enigma also petitions for and buys additional information from agencies and commercial vendors. Once all those pieces of data are on its platform, Enigma applies its own algorithm to pull them together and link them to the same entity.

A recent New York Times profile exemplified how Enigma's platform is able to pull together disconnected data points to paint a clearer picture of a company:

Ask Enigma for facts about Lockheed Martin, for example, and here are some of the disparate details that surface: Last year, this military contractor entered into agreements with the government worth about $40.7 billion. Another interesting tidbit about the company is that in 2013, Marillyn A. Hewson, the chief executive, visited the White House five times; on two of those occasions the “visitee” was “POTUS,” meaning the president of the United States, the logs indicate. And company employees reported giving about $51,000 to the presidential campaign committees Obama for America and the Obama Victory Fund.

There are many disparate identifiers in use across the federal and state governments to identify private-sector companies. Without an algorithm like Enigma's there is no way to map the many separate identifiers to one another to track one company's filings and activities across government.
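
To make the matching problem concrete, here is a deliberately naive Python sketch that links records from different sources by normalized company name. This is not Enigma's algorithm, which the article does not describe. The source names, record IDs, and suffix list are hypothetical.

```python
import re
from collections import defaultdict

# Common corporate suffixes to ignore when comparing names (illustrative list).
SUFFIXES = {"inc", "corp", "corporation", "co", "llc", "ltd"}


def normalize(name):
    """Lowercase, drop punctuation, and remove corporate suffixes."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(word for word in words if word not in SUFFIXES)


# Hypothetical records: each source uses its own identifier scheme.
records = [
    {"source": "contracts", "id": "C-001", "name": "Lockheed Martin Corporation"},
    {"source": "lobbying", "id": "L-42", "name": "LOCKHEED MARTIN CORP."},
    {"source": "contracts", "id": "C-002", "name": "Example Widgets, Inc."},
]


def link(rows):
    """Group (source, id) pairs that appear to refer to the same company."""
    groups = defaultdict(list)
    for row in rows:
        groups[normalize(row["name"])].append((row["source"], row["id"]))
    return dict(groups)


linked = link(records)
print(linked["lockheed martin"])
```

Name matching of this kind breaks down quickly with subsidiaries, abbreviations, and misspellings. A shared identifier would make the lookup exact instead of a guess.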

In 2010, the Treasury Department's Office of Financial Research (OFR) announced it would seek to rally U.S. financial regulatory agencies to adopt a common identifier for the companies and firms reporting to those agencies under the securities, commodities, and banking laws. But that identifier, the Legal Entity Identifier (LEI), has so far only been put in place for derivatives reporting to the Commodity Futures Trading Commission (CFTC) and the Securities and Exchange Commission (SEC). And outside financial regulation, progress toward common identification has been even rarer.

Enigma founder Marc DaCosta says that if government identifiers were standardized, Enigma's platform could become even more powerful, offering access to more, and more reliable, data. Enigma has recently joined the Data Transparency Coalition as the trade association's fourth Startup Member. DaCosta says he hopes the Coalition can persuade governments not only to make more data available overall, but also to make sure that it is standardized through common identifiers.

DaCosta says that the open Internet serves as a good model. When you type in a web address, for example, you need not know on what server that website is located. The Internet works so well in part because identifiers for websites--uniform resource locators or URLs--are universally used and freely available to the public. In a similar way, citizens should be able to run a simple search for a company and have access to all of that company's public data in one place. 

Through its advocacy of government-wide data reporting standards in federal spending, financial regulation, and elsewhere, the Data Transparency Coalition is seeking to make DaCosta's vision a reality.

May 19, 2014

Internship: We need an open data trailblazer this summer in Washington!

Want to see more open data? So do we! 

Fresh off our first major legislative victory with the passage of the DATA Act, the Data Transparency Coalition is now expanding its efforts. In addition to ensuring a successful implementation of America's first open data law, our growing Coalition will redouble its advocacy for open data across all areas of government activity. In addition to spending data, we want to increase our support for open data in the legislative domain, the financial regulatory arena, and much more. 

If you're willing to lend us a hand this summer, you'll get more than $10 an hour -- you'll have an opportunity to engage leading innovators, advocates, and policymakers who are making data transparency a reality. On Tuesdays, you'll work out of our Capitol Hill office. On other days, we'll coordinate a schedule for you to work remotely or in the office. The internship runs from June 1 through August 15, but we can be flexible about starting and ending dates. You will help support all areas of our work -- from policy development to communications and event planning.

The Data Transparency Coalition, founded in 2012, is a tech-industry trade association that wants to transform the federal government's current system of disconnected documents into standardized, open data. Open data serves as a public resource that promotes accountability and nurtures innovation. When information is presented in an open, standardized format, it can be scrutinized by anyone -- from businesses to citizens, journalists and watchdogs. Data standards also reduce compliance costs by allowing reporting tasks to be automated. By providing analysts with reliable Big Data, open data policies enable them to transform the practice of public sector management using the latest technology.

Our coalition invites you to join us in this endeavor! If you are interested in applying for our summer internship, please send your resume and a short cover letter explaining why you want to work for the Coalition to
