Archive for August, 2009

NEWS: What datasets should the Bloomberg administration open up?

Mayor Michael Bloomberg is offering to open up.

Responding to the national push for more transparent government, the Bloomberg administration is opening up some of its datasets for easier public consumption. The only question is what data the city will post on the new Web site.

The city is taking suggestions starting Monday, and the nonprofit that houses GothamSchools, The Open Planning Project, is part of the push to send those in. We will be helping TOPP fill out what are called RFEIs, or requests for expressions of interest, this coming Monday.

With the deadline breathing down our necks (on our staycation, no less!), we need your help. Our wish list includes information on outside contracts the Department of Education holds, school-by-school budget documents, and school accountability information organized in easy-to-search Excel spreadsheets rather than individual PDFs.

What should we add? Please name names of specific documents, and please don’t be shy with ideas. Info on how to submit your own RFEI is here.


Photo via Wikimedia Commons.

Better Data -> Better Apps -> Better Transportation

Last night’s New York Public Transit Data Summit was a resounding success. We had a room packed with passionate and thoughtful people eager to help make public transit in New York more efficient, accessible, and easy to use.

For more than two hours, the group (more than two dozen transit advocates, mobile and web developers, urban planners, lawyers, and open government supporters) discussed the current climate for developing transit applications, its shortcomings, and how the community can work with the MTA to improve things.

While no representatives from the MTA were able to attend, the MTA did provide a statement clarifying their current licensing policies. The statement answered many questions but also raised some more. I’ve forwarded those questions on to the MTA and will post their responses as soon as I get them.

Here are some specific items that came out of last night’s meetup:

1. We’ve started compiling notes and thoughts on the Ideas wiki. Check it out at http://nytransitdata.org. Anyone can register and contribute freely.

2. The group decided on #nytransit as the best hashtag, since it’s not too specific, not too general, and not too long.

3. Everyone wanted to keep the momentum going and stay in contact. Since there are already a number of mailing lists and groups working on related projects, we decided not to start another Google group. If you want to stay in contact please join the meetup, and keep an eye on the Open Government NYC, Transit Developers, and Civic Hacker mailing lists.

4. There are lots of tangible next steps to work on. If you care about seeing this issue move forward, please lend a hand. Help draft our one-page issue summary. Fill in holes on the wiki. Fix typos. Tell other developers and interested friends, colleagues, and groups. Get the word out by blogging and tweeting.

Let’s make sure that the wonderful momentum we all built last night doesn’t fade. Together we can help New York take the lead in providing access to transit data and enable a flood of innovation that will revitalize the city’s transportation infrastructure.

Help open the Big Apple

The effort to open municipal data is an initiative with momentum. Inspired in part by the transparency mandate on the federal level that gave us the first ever White House CIO and data.gov, cities across the country are opening up.

One city in particular set the scene before this all hit the national stage: Washington, D.C. has delivered precedents like the first online city data directory, the first open API for 311 service requests, and open invitations for developers to produce apps with city data through initiatives like Apps for Democracy. D.C. has laid the groundwork for a national model by delivering some killer apps. Previously the CTO for D.C., Vivek Kundra is now the national CIO. This model has not yet been fully embraced by New York City, but it’s getting close. In fact, the New York State Senate Office of the CIO has been paving the way for opening government data on the state level.

Historically a trend-setting city, New York is trying to catch up with and even exceed precedents for opening city data. A City Council bill sponsored by Gale Brewer, Int 991, mandates that every city agency make its data easily and publicly accessible in its raw structured digital form. As a mandate, this legislation would be the first of its kind. Even in D.C. there is only a general policy to release data, not an official mandate. At the public hearing for the bill, supporting testimony came from the likes of the NY State Senate CIO, the W3C, and the Software Freedom Law Center. While a representative from the Mayor’s office stated that the Mayor supported the spirit of the bill, he said that it was too comprehensive and expensive to be practical in its current form.

At the Personal Democracy Forum just a few hours before the hearing, Mayor Bloomberg announced the creation of NYC Big Apps, a contest modeled after Washington D.C.’s Apps for Democracy that would let outside developers create applications on top of city data. Given some of the varying approaches to opening city data, the next day Vivek Kundra offered advice to help direct the effort in New York City. The City Council legislation is set to be voted on in early fall.

According to the announcement for Big Apps, 80 of the city’s datasets will initially be made available. The city is currently seeking requests for expressions of interest (RFEI) to determine which datasets should be a priority to open. One problem with access to city data is that it can be difficult to see what data is useful without knowing what data is available. It can also be difficult to know what applications would be useful to build without knowing what New Yorkers want. To help gauge interest, Peter Corbett of iStrategy Labs created a simple public venue to let people add or vote for the apps they’re interested in. If you’re interested in submitting a full RFEI, there’s also a process for doing that while keeping it in the public record. The task of showing what data is available is more of a challenge.

New York City doesn’t provide a public directory of the data that each agency maintains. Much of this information used to be available online and the city once even published a full Public Data Directory. Unfortunately, the first and last version of this directory was published in 1993, but it may still provide some utility. Much of the city’s IT infrastructure is slow to evolve, so many parts of the 1993 directory are likely unchanged. Regardless of how current it is, the 1993 directory offers a solid framework for developing an up-to-date version. An effort is underway to digitize the full directory, allow the public to update it, and submit improvements to the soon-to-be-launched National Data Catalog, a citizen-maintained version of Data.gov. With the help of Eric Mill from the Sunlight Foundation’s Sunlight Labs, the challenge of digitizing the NYC Public Data Directory has been broken into small tasks that anyone can contribute to.

One other caveat to opening city data is that some of the agencies perceived to be run by the city are actually run as state agencies. The MTA, for example, is a quasi-public private enterprise overseen by New York State. While many of the most wanted apps relate to MTA transit, neither the City Council legislation nor the Mayor’s Big Apps program is able to provide MTA data. Because of some of the challenges of accessing and using public transit data, TOPP Labs hosted a New York Public Transit Data Summit to help develop clear and mutually beneficial policies for the relationship among transit agencies like the MTA, their riders, and application developers.

In many cases, the challenge of opening city data is coming up with a sensible policy that sets terms which ultimately provide superior city services. Drafting these policies requires proactive collaboration between developer communities and city government. With sound policy, the challenge then becomes for the city to provide the data in a well-structured and timely manner. This too is a place where developer communities can contribute, even if it’s just by taking on one of the tasks for digitizing the Public Data Directory. Technology companies can also contribute. Google could help digitize the city’s 6 billion pages of documents much like it has been digitizing the world’s libraries. Amazon has already offered to host public datasets for free. What data would you like to see? What applications would you like to use?

With your help, we can open up the Big Apple.

If you’d like to suggest applications or data for the city to make available, you can add and vote for them at http://bit.ly/bigideas. To submit your own RFEI and make it public, you can add it to http://bit.ly/getnycdata. To help update the Public Data Directory, you can first help digitize the old version at http://bit.ly/digitizepdd

Ben Fried on ‘The Brian Lehrer Show’ Today

Ben, along with Yonah Freemark of The Transport Politic, will be on WNYC this morning to talk about Mayor Bloomberg's transit platform. The segment should air between 10:30 and 11.

Community Almanac Redesign

As a point of reference for those unfamiliar with Community Almanac, here are screenshots of the previous version’s home page and almanac view.

We’re very excited about the new version of the Orton Family Foundation’s Community Almanac that we’ve been working on at TOPP Labs. In this post I’ll share some of the design decisions that went into that project and some of the reasons why things have changed.

The Opening Line

In order to make it easier for users to contribute to the site, we decided to ditch the previous workflow (a cumbersome multi-step wizard that walked you through registering and adding your story) and replace it with a less obtrusive one. Now users can jump right in and start adding whatever content they want in any order they like, even before they’ve registered or signed in.

Below are some of the wireframe mockups outlining this new just-in-time workflow:

Anonymous Page Creation Workflow

Page View

Editing a Page

Almanac View - Table of Contents

Home Page

User Profile Page

Note: There are a few great ideas in these mockups that didn’t make it into this iteration: namely user profiles and an achievement rewards system. Maybe they’ll make it into a future release?

The Well-Worn Page

In addition to improving the workflow, we wanted to push a new book/page metaphor throughout the site. What was previously referred to as a story is now a page. Users add pages to their community’s almanac, or book, and the site as a whole is presented more as a collection of almanacs, or a library of books.

Not only was the term “story” inaccurate, as users can publish whatever type of content they like—text, images, maps, audio, video, or PDFs, none of which are necessarily narrative—but the term “page” is also more straightforward and familiar. Users are accustomed to viewing and adding “pages” on a website. Additionally, extending the “page” metaphor to include books and a library provided a more cohesive aesthetic direction and was also just more fun.

The Carefully Considered Cover

Once our ideas for the new user workflow and bibliophilic metaphors were established, I kicked off the visual design with a mood board in order to set the general tone for the look & feel:

Community Almanac Mood Board

The mood board sought to create a warm, inviting, and familiar vibe that reflects the “heart & soul” sentiments of the Orton Foundation by referencing traditional, tactile imagery.

Community Almanac Sketch

With this mood board and the book page metaphor in mind, I began sketching ideas for the basic structure and layout. The final sketch depicts an off-centered book where the recto page contains the primary content and the verso page acts as a sidebar for secondary content.  The charm of this layout resides in its flexible width. Although it behaves as a fixed-width layout, it’s not. In larger windows the sidebar items float beside each other, forming columns across the left-hand page. If your monitor is ridiculously large you can even see the whole book!

This fluid layout inspired quite a few bells and whistles, not to mention a couple of fun Easter eggs. The informational tour has been moved to the home page as a little slideshow-style presentation that directs users to the map or signup. There’s also a cool slide-down login area and a shuffling stack of almanacs on the home page. Check out the parallax scrolling that happens in the header illustration while you resize the width of the browser window. And we didn’t stop there! Who knows what else you’ll find… But hopefully you’ll never see these witty error pages in action: 400 (Client Error), 500 (Server Error).

Finding Your Community: The Map Workflow

While we were really excited to launch all these new features, we knew that the current map workflow wasn’t quite right and needed to be changed. In order to solicit user feedback on the other new features, we decided to launch in iterative stages.

In order to find a community in the current map workflow we perform two geocode requests. First, we geocode based on what the user types. Then, if there’s a valid result, we find the latitude and longitude given back by the first result and reverse geocode it. With this method, we only get cities or towns as the result of searching for neighborhoods or smaller communities and can thus limit the fragmenting of communities by creating a canonical, publicly-owned almanac for each location.
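The two-request lookup described above can be sketched roughly like this. It is only an illustration, not the site's actual code: the `geocode` and `reverse_geocode` callables are hypothetical stand-ins for whatever geocoding service is used, and the stand-in coordinates are made up so the example runs without a network call.

```python
from typing import Callable, Optional, Tuple

LatLng = Tuple[float, float]
Place = Tuple[str, str]  # (locality, state)


def find_canonical_community(
    query: str,
    geocode: Callable[[str], Optional[LatLng]],
    reverse_geocode: Callable[[LatLng], Optional[Place]],
) -> Optional[str]:
    """Return a canonical "Locality, State" name for a free-text search."""
    # Request 1: geocode whatever the user typed.
    point = geocode(query)
    if point is None:
        return None  # no valid result, so there is nothing to reverse geocode
    # Request 2: reverse geocode the first result's latitude and longitude.
    place = reverse_geocode(point)
    if place is None:
        return None
    locality, state = place
    return f"{locality}, {state}"


# Hypothetical stand-ins for a real geocoder (coordinates are made up).
def fake_geocode(query: str) -> Optional[LatLng]:
    known = {"third ward houston texas": (29.73, -95.36)}
    return known.get(query.strip().lower())


def fake_reverse_geocode(point: LatLng) -> Optional[Place]:
    return ("Houston", "Texas") if point == (29.73, -95.36) else None


print(find_canonical_community(
    "Third Ward Houston Texas", fake_geocode, fake_reverse_geocode
))
```

Because the second request only ever hands back a city or town, a neighborhood search collapses to its parent city, which is what keeps almanacs canonical.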

Almanacs are now publicly owned, rather than moderated by a particular user. On the old Community Almanac site the user who started an almanac would personally own that almanac. If, for example, the user “Hatfield” started a Tug Fork River almanac he could moderate out the user “McCoy”. Then “McCoy” could in turn start another warring Tug Fork River almanac. While this new map workflow is great for creating canonical almanacs, it does expose another problem…

Finding Your Community: Neighborhoods

Sometimes prominent communities are geographically contained within the cities/towns that a map search would return. For instance, a community such as Third Ward, Texas—a neighborhood within Houston, Texas—will geocode simply as “Houston.”

So, in our first iteration, we’ve included a help link under the map for users having trouble finding their community. As a temporary solution, if a user cannot find their community on the map, they can contact us and have their almanac created for them manually. This method isn’t really optimal. What’s needed is a search method that will return canonical results within varied levels of specificity. In our next iteration, we plan to make the map workflow more robust and limit manual intervention.

Finding Your Community: Polishing the User Experience

We had to step back and take a look at our options, and among the miscellaneous fields returned by Google are four gems:

  1. Locality (city/town)
  2. State
  3. AddressLine
  4. Accuracy

We’re currently only using the first two, Locality and State, but what is potentially useful is AddressLine combined with Accuracy. An AddressLine is what Google recognizes as a feature of a specific point. This can be very specific (like “Columbus Circle”), a little more general (like “Central Park”), on up to the city, state, or country that the point is contained in. Accuracy is an approximation of the size of a feature, and helps clue us in to what neighborhoods the point is referring to. It’s not perfect, but it gives us some choices we can present to the user to let them decide.

To use a search for “third ward houston texas” as an example, if we discard any AddressLine results with an Accuracy greater than 5 or less than 3 (to pick limits based on this one search), we can offer up the following choices to the user:

  • Greater Third Ward (with an accuracy of 4)
  • South Central (with an accuracy of 4)
  • Houston (with an accuracy of 4)
  • Greater Houston (with an accuracy of 3)

Almanac moderation may still be needed since AddressLine is far less predictable than current city/town results. But if we check them against existing almanac titles, we can hopefully remove any manual intervention for all users, offering neighborhood-level specificity to almanac search and creation.
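Here is a minimal sketch of that filtering idea, assuming the geocoder results have already been reduced to (AddressLine, Accuracy) pairs. The function name, the two out-of-range sample rows, and the set of existing almanac titles are all made up for illustration; the limits of 3 and 5 come from the one search above.

```python
from typing import Iterable, List, Set, Tuple


def neighborhood_choices(
    candidates: Iterable[Tuple[str, int]],
    existing_titles: Set[str],
    min_accuracy: int = 3,
    max_accuracy: int = 5,
) -> List[Tuple[str, int, bool]]:
    """Keep candidates within the accuracy limits and flag existing almanacs."""
    existing = {title.lower() for title in existing_titles}
    choices = []
    for address_line, accuracy in candidates:
        # Discard anything with an Accuracy greater than 5 or less than 3.
        if accuracy < min_accuracy or accuracy > max_accuracy:
            continue
        already_exists = address_line.lower() in existing
        choices.append((address_line, accuracy, already_exists))
    return choices


results = [
    ("Example street address", 8),  # made-up row: too specific, dropped
    ("Greater Third Ward", 4),
    ("South Central", 4),
    ("Houston", 4),
    ("Greater Houston", 3),
    ("Texas", 2),  # made-up row: too general, dropped
]

# Pretend a "Houston" almanac already exists (hypothetical).
for name, accuracy, exists in neighborhood_choices(results, {"Houston"}):
    status = "existing almanac" if exists else "new almanac"
    print(f"{name} (accuracy {accuracy}): {status}")
```

Flagging matches against existing titles is one way to read "check them against existing almanac titles": a user who picks "Houston" gets sent to the almanac that is already there instead of starting a rival one.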

So, Where’s Your Community?

Dig in. Find your community on the map and start adding to its almanac. Share your heart & soul! We think this redesign turned out great and we’d love to get your feedback. Let us know what you think.

New York Public Transit Data Summit. With Beer.

We’re hosting a New York Transportation Data Summit. With beer. While the event will be open to the public, we’ve specifically reached out to MTA employees, open government advocates, application developers, and transit enthusiasts.

Here’s the scoop:

WHERE: 148 Lafayette St, NY, New York, 12th floor (map)
WHEN: Tuesday, August 25 at 6pm
WHAT: Meetup to discuss how the MTA and the developer community can best collaborate.


Access to transportation schedules has been a hot topic lately. Most recently, iPhone developer Chris Schoenfeld has come into conflict with the MTA over schedule data. Chris wrote an app that uses schedules for Metro-North, and the MTA wants him to pay royalties for his use of the data from both past and future sales of his application. Chris has refused, noting that data under US law is not copyrightable and thus he is legally free to use and distribute it as he sees fit.

A couple of us at TOPP Labs have also run into issues trying to get MTA schedule data. We’re always experimenting with transit-related applications (from bus trackers to trip planners) and public data is integral to many of them. This past March, we requested MTA bus schedule data via a FOIL request. Within a month, the MTA had responded by sending us a CD containing bus route and schedule information for all the New York City Transit bus lines. This data was in an undocumented format, and we set out reverse engineering it with the goal of writing a parser to generate GTFS data (we’ve got the parser working and have released it as free software, and you can download the resulting GTFS data).
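For readers who haven't seen GTFS, here is a rough sketch of the output side of such a parser. It is not our actual parser, and it says nothing about the MTA's format: it assumes the input has already been decoded into simple records (a hypothetical shape), and it uses the route name as the route_id purely for illustration. GTFS itself is just a set of CSV text files; routes.txt is one of them, and route_type 3 means bus.

```python
import csv
import io
from typing import Dict, Iterable

ROUTE_FIELDS = ["route_id", "route_short_name", "route_long_name", "route_type"]
GTFS_BUS = 3  # GTFS route_type value for bus service


def gtfs_time(seconds_since_midnight: int) -> str:
    """Format a time as GTFS HH:MM:SS; hours may pass 24 for late-night trips."""
    hours, remainder = divmod(seconds_since_midnight, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def routes_txt(routes: Iterable[Dict[str, str]]) -> str:
    """Render already-parsed route records as the text of a GTFS routes.txt."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=ROUTE_FIELDS, lineterminator="\n")
    writer.writeheader()
    for route in routes:
        writer.writerow({
            "route_id": route["name"],  # assumption: reuse the name as the id
            "route_short_name": route["name"],
            "route_long_name": route.get("description", ""),
            "route_type": GTFS_BUS,
        })
    return buffer.getvalue()


print(routes_txt([{"name": "S55"}, {"name": "M14A"}]), end="")
print(gtfs_time(90000))  # a stop 25 hours after the service day began
```

A real feed also needs stops.txt, trips.txt, stop_times.txt, and calendar information, but they follow the same pattern of plain CSV files with fixed column names.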

Not long into the process, sections of Broadway were closed down for vehicle traffic. As a result, several bus routes were changed, making some of the data the MTA had sent us obsolete.

We wrote to the MTA to try to figure out how best to keep our data up to date. They told us that we must file a FOIL request every month or two and that there was no way to know when the schedules would be updated. But data that is a month or two old isn’t of much value to us; we tried sending FOIL requests more frequently but quickly found that this angered the MTA, since their process for fielding a FOIL request is somewhat laborious. Since we’re not interested in making the MTA’s life harder, we stopped and started pursuing other avenues for getting up-to-date data.

Over the past months we’ve gone back and forth with the MTA several times, trying to find a way to get up to date data. So far, we haven’t found anything mutually satisfactory. But we’re still hopeful that we can find a good solution.

Everyone involved wants the same things. We all want better public transit. We all benefit when reliable data is easy to come by. The MTA benefits because independent developers write applications that make the infrastructure the MTA maintains more valuable for its riders. And this means the MTA doesn’t need to pay to develop apps for the iPhone, Android, Pre, Blackberry, and whatever comes next. Riders benefit because they can access schedules in the way that’s easiest and most convenient for them.

New York has a vast transportation network with complicated schedule data, and it’s inevitable that some errors will slip by. We’ve noticed a few oversights in the data we’ve received (e.g., the S55 shape data is out of date, the schedule data doesn’t distinguish between the M14A and M14D) and would love to help fix them. The net is full of examples of crowdsourcing for correcting errors (“given enough eyeballs, all bugs are shallow”), but there’s not currently a process in place for citizens to submit corrections.

The MTA is understandably concerned about inaccurate or out of date data giving them a bad name. They don’t want riders getting bad data and then blaming them for it, and neither do we. And the MTA doesn’t want to spend all their time responding to FOIL requests. It’s in nobody’s interest to make the MTA’s already tough job harder; as taxpayers, we want to help the MTA be as efficient as possible. As New Yorkers, we want the city to stay on the cutting edge of public transportation.

We think these problems are solvable. That’s why we propose a meeting of the minds. We think progress is made when people come together, honestly discuss their goals, and work cooperatively to reach mutually beneficial solutions.

And that’s why we’re hosting a New York Transportation Data Summit. With beer. While the event will be open to the public, we’ve specifically reached out to MTA employees, open government advocates, application developers, and transit enthusiasts.

Please come join us for pizza, beer, and a friendly discussion. There’s also a stunning view of the city we all love.

(This post was written by David Turner and Nicholas Bergson-Shilcock)

NYC Summer Streets 2009

On Saturday the New York City Department of Transportation and partners kicked off the second annual Summer Streets. From 7 AM to 1 PM, a car-free zone ran from 72nd Street, mainly along Park Ave, to the Brooklyn Bridge. New York City Department of Transportation Commissioner Janette Sadik-Khan estimated that twice as many people visited the 7-mile route as at the first Summer Streets event last year. If you missed Summer Streets on Saturday, don't worry: you have two more chances to experience the car-free bliss, on August 15th and 22nd.

GeoServer in InfoGEO magazine

GeoServer is featured in an article in InfoGEO magazine. Written by active community member Fernando Quadro, the article is a brief overview of GeoServer. There may not be anything new here for those who are already familiar with GeoServer, but it’s still great to see GeoServer in print.

Original Article
English translation (via Google Translate)

This is a good time to remind everyone that GeoServer has mailing lists in Portuguese, along with Italian and Spanish. Moreover, with the switch to our new documentation system, we now have the ability to have GeoServer documentation in multiple languages. If you are interested in contributing, please let us know.