Pulsars and neutron stars/Accessing and processing pulsar data sets

Introduction
Many pulsar data sets are now available for public access. These include raw data files from telescopes as well as processed data files. The files are made available for general use, but consider the following issues before relying on them:


 * For many of the raw observations from radio telescopes, information has been lost on the quality of the data. For instance, issues may have occurred during the observation that are not recorded with the data.
 * Many people have spent a long time carrying out the observations. Please ensure that, where possible, you credit the observers.
 * Old data formats may not be understood by modern software packages.
 * You may spend a long time processing some public data and then find that somebody else publishes the work before you do.

Raw data
Raw data files are those produced by the backend instruments during the observations. They usually require some form of processing, such as radio-frequency interference (RFI) mitigation or calibration, before they can be used.

Parkes telescope
Apart from a few cases (decided by the director), data from the Parkes radio telescope become public 18 months after the time of the observation. In early 2010, the ANDS-CSIRO-ATNF Pulsar Data Management Project was funded through the Australian National Data Service (ANDS) to establish a data archive for pulsar radio astronomy data. The project was officially completed in March 2011, but work is still ongoing to retrieve old data sets for inclusion in the archive. The archive also includes recent observations with the telescope.
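The 18-month embargo can be turned into a simple date calculation. The following is a minimal sketch, assuming that the embargo is counted in calendar months from the observation date; the archive's exact rule is not stated here, and the function names are illustrative only.

```python
import calendar
from datetime import date

EMBARGO_MONTHS = 18  # embargo length quoted in the text


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    The day of the month is clamped to the length of the target month
    (e.g. 31 August + 18 months gives 28 or 29 February).
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def release_date(observed):
    """Date on which an observation leaves the embargo (assumed rule)."""
    return add_months(observed, EMBARGO_MONTHS)


def is_public(observed, today):
    """True if the observation is out of the embargo period on `today`."""
    return today >= release_date(observed)


print(release_date(date(2013, 10, 1)))  # 2015-04-01
```

Observations exempted by the director are not covered by this calculation.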

The data are stored in the archive as collections. Each collection consists of one semester of data for a given observing project. For instance, the collection

P855-2013OCTS

contains all the data for the P855 project obtained during the 2013 October semester. The project code can be identified through the Opal website. The following projects are of general interest:

Each observation can be recorded with multiple backend instruments. Each backend instrument can record one or more files per observation. Even though each backend can produce files in its native format, for archiving purposes all files are converted to the PSRFITS format. The file extension defines the type of observation:

The first character in the filename indicates the backend that was used to record the data:

An entire collection of data can be downloaded (provided that the observations are past the embargo period) by searching for the specific collection on the CSIRO DAP website. More commonly, a search is carried out for observations of a particular pulsar or at a particular sky position. Currently the data are downloaded via web tools, which restricts the amount of data that can viably be downloaded.
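When searching for a collection it can help to split its name into parts. The sketch below does this for names shaped like the P855-2013OCTS example. The pattern (a project code of "P" plus digits, a four-digit year, a three-letter month and a trailing "S" for semester) is an assumption inferred from that single example and is not an official specification of the archive's naming scheme.

```python
import re
from typing import NamedTuple


class Collection(NamedTuple):
    project: str         # observing project code, e.g. "P855"
    year: int            # year in which the semester starts
    semester_month: str  # three-letter month that labels the semester


# Assumed pattern, inferred from the example "P855-2013OCTS".
_COLLECTION_RE = re.compile(r"^(P\d+)-(\d{4})([A-Z]{3})S$")


def parse_collection(name):
    """Split a collection name into project code, year and semester month."""
    match = _COLLECTION_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised collection name: {name!r}")
    project, year, month = match.groups()
    return Collection(project, int(year), month)


print(parse_collection("P855-2013OCTS"))
```

Names that do not fit the assumed pattern raise a ValueError rather than being guessed at.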

Catalogues
The following catalogues are of interest to pulsar astronomers:

Using the Virtual Observatory
Virtual observatory (VO) tools provide access to catalogues and databases in many areas of astronomy. To date very little use of the VO has been made by the pulsar community, but that may change with new large data sets and catalogues and multi-wavelength research projects. Data and catalogues are provided by virtual observatory registries. The ATNF publishing registry provides information about the Parkes data archive.

Table Access Protocol (TAP) queries
Suppose we wish to identify all the observations in the Parkes data archive that could also be observed with the FAST telescope. FAST has a declination limit of -16 degrees, so we need to select observations at higher (more northerly) declinations. The TOPCAT utility is well suited to such queries. After downloading and running TOPCAT, open the VO menu and select "Table Access Protocol (TAP) Query". To find the correct database, type "ATNF pulsar" into the Keywords box and click on "Submit Query"; this identifies the ATNF Publishing Registry. It is then possible to select "Enter Query". Currently ADQL is available as the query language. The relevant ADQL query is:

Submitting this query returns ...
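As a rough illustration of the kind of declination cut described above, the sketch below assembles an ADQL string in Python. The table name "parkes_observations" and the column name "dec_deg" are hypothetical placeholders and are not the names used by the ATNF service; the real names should be read from the table metadata that TOPCAT displays after connecting.

```python
FAST_DEC_LIMIT_DEG = -16.0  # declination limit quoted in the text


def build_declination_query(table, dec_column, min_dec_deg):
    """Return an ADQL query selecting rows above a declination limit.

    `table` and `dec_column` are supplied by the caller; they are checked
    only loosely, to keep obviously malformed names out of the query.
    """
    for name in (table, dec_column):
        if not name.replace("_", "").replace(".", "").isalnum():
            raise ValueError(f"suspicious identifier: {name!r}")
    return f"SELECT * FROM {table} WHERE {dec_column} > {float(min_dec_deg)}"


# Hypothetical table and column names, for illustration only.
query = build_declination_query(
    "parkes_observations", "dec_deg", FAST_DEC_LIMIT_DEG
)
print(query)  # SELECT * FROM parkes_observations WHERE dec_deg > -16.0
```

The resulting string can be pasted into the TOPCAT query box once the placeholder names have been replaced with the real ones.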

Processing using Virtual Machines and cloud computing
Currently each pulsar group around the world has its own computer systems. Each group installs the various software packages, copies the data sets of interest to its disks and then carries out its analysis. Pulsar astronomers therefore end up spending significant amounts of time on data transfer, on software installation and on seeking funding to purchase more powerful machines.

Virtual machines provide a way of pre-installing software packages and pipelines so that they can then be run on any system. For instance, a virtual machine running pulsar software packages under the Linux operating system can be run on a Windows laptop.

It is also possible to run pipelines and software on a virtual machine running elsewhere. The user may not even know where the computers or data are physically located.