A Digital India Initiative

Open Government Data

A dataset is said to be open if anyone is free to use, reuse, and redistribute it. Open data must be machine readable and easily accessible.

PMC collects, processes, and generates a large amount of data in its day-to-day functioning. However, much of this government data remains inaccessible to citizens and civil society, although most of it may be non-sensitive in nature and could be used by the public for social, economic, and developmental purposes.

This data needs to be made available in an open format to facilitate its use, reuse, and redistribution, and it should be free from any license or other mechanism of control. Publishing government data in open formats would enhance transparency and accountability while encouraging public engagement. Government data in open formats also has great potential for innovation: various apps, mash-ups, and services can be built around the published datasets.

1.   Identification of Resources (Datasets/Apps) and Their Organization under Catalogs

As per the policy, each department has to prepare its Negative List. Datasets that are confidential in nature, and whose non-disclosure to the public is in the interest of the country's security, fall into the Negative List.

All other datasets, which do not fall under the Negative List, belong to the Open List. These datasets need to be prioritized into high-value and non-high-value datasets. The datasets must then be published on the portal within 3 (three) months.
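The split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Dataset` record, its `confidential` and `high_value` flags, and the `build_lists` helper are hypothetical names, not part of the policy or of the Pune DataStore portal.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    title: str
    department: str
    confidential: bool = False  # hypothetical flag: True places the dataset on the Negative List
    high_value: bool = False    # hypothetical flag: set by the department's own criteria


def build_lists(datasets):
    """Split datasets into a Negative List and a prioritized Open List."""
    negative = [d for d in datasets if d.confidential]
    open_list = [d for d in datasets if not d.confidential]
    # High-value datasets come first; the sort is stable, so the
    # original order is preserved within each priority group.
    open_list.sort(key=lambda d: not d.high_value)
    return negative, open_list
```

A department could pass its full inventory to `build_lists` and publish the returned Open List from the top down, so that high-value datasets reach the portal first.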

Data contributed to the Pune DataStore portal must be in the specified open data formats only. The data must be processed internally to ensure that quality standards are met, i.e. that it is accurate, free from legal issues, preserves the privacy of individuals, and does not compromise national security.

While prioritizing the release of datasets, departments should try to publish as many high-value datasets as possible. The grouping of related resources (datasets/apps) should be planned, and the resources should be organized under departments.

Though each department will have its own criteria for high-value and low-value datasets, high-value data is generally governed by the following principles:

  • Completeness
  • Primary
  • Timeliness
  • Ease of Physical and Electronic Access
  • Machine Readability
  • Non-discrimination
  • Licensing
  • Permanence

2.   Data Formats

The National Data Sharing and Accessibility Policy (NDSAP) recommends that data be published in an open, machine-readable format. Many formats exist, each suited to different categories of data. Based on a current analysis of the data formats prevalent in government, it is proposed that data be published in any of the following formats:

  • CSV (Comma-Separated Values)
  • XLS (Excel spreadsheet)
  • ODS (OpenDocument Format for Spreadsheets)
  • XML (Extensible Markup Language)
  • KML (Keyhole Markup Language used for Maps)
  • GML (Geography Markup Language)
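
As a rough illustration of two of the formats listed above, the following Python sketch serializes the same tabular records as CSV and as XML using only the standard library. The function names, the `dataset` and `record` tag names, and the sample column names are assumptions made for this example; they are not prescribed by NDSAP or by the portal.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_csv(rows, fieldnames):
    """Serialize a list of dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def to_xml(rows, root_tag="dataset", row_tag="record"):
    """Serialize a list of dicts as XML text, one element per record.

    Column names are used as element names, so they must be valid XML names.
    """
    root = ET.Element(root_tag)
    for row in rows:
        record = ET.SubElement(root, row_tag)
        for key, value in row.items():
            ET.SubElement(record, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up sample row, for illustration only.
rows = [{"ward": "A", "population": 1000}]
print(to_csv(rows, ["ward", "population"]))
print(to_xml(rows))
```

Both outputs are plain text that other programs can parse without proprietary software, which is the practical meaning of "machine readable" here.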