R Programming/Importing and exporting data

Data can be stored in a wide variety of formats. Each statistical package has its own data format (xls for Microsoft Excel, dta for Stata, sas7bdat for SAS, ...). R can read most of them, and we present a method for each kind of file. If none of the following methods works, you can use dedicated data conversion software such as the free OpenRefine or the commercial Stat Transfer. In any case, most statistical software can export data in CSV (comma-separated values) format, and virtually all of it can read CSV data. This is often the best way to make data available to everyone.

Graphical user interfaces
Some IDEs and GUIs provide point-and-click tools for importing data.

You may also have a look at speedR, a graphical user interface that helps with importing data from Excel, OpenOffice Calc, CSV and other text files.

CSV (csv,txt,dat)
You can import data from a text file (often CSV) using read.table(), read.csv() or read.csv2(). The option header = TRUE indicates that the first line of the file should be interpreted as variable names, and the option sep gives the separator (generally "," or ";").

csv.get() (Hmisc) is another possibility.

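R's text-import functions accept a URL in place of a local file path, so data stored on the internet can be read directly.

In R versions before 4.0.0, strings are converted to factors by default. If you want to avoid this conversion, you can specify the option stringsAsFactors = FALSE. Since R 4.0.0, this is the default.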
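As a minimal sketch of these import options, the following writes a small semicolon-separated file to a temporary location and reads it back with read.csv(). The file contents and the column names (name, score) are made up for illustration.

```r
# Write a small semicolon-separated file to a temporary location (illustrative data).
csv_file <- tempfile(fileext = ".csv")
writeLines(c("name;score", "Alice;12.5", "Bob;9"), csv_file)

# header = TRUE: the first line holds the variable names.
# sep = ";": fields are separated by semicolons.
# stringsAsFactors = FALSE: keep strings as character vectors, not factors.
scores <- read.csv(csv_file, header = TRUE, sep = ";", stringsAsFactors = FALSE)

# Inspect the column types of the imported data frame.
str(scores)
```

Setting stringsAsFactors explicitly keeps the result the same across R versions, whatever the default happens to be.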

You can export data to a text file using write.table() or write.csv().
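A short sketch of exporting a data frame to text with base R follows. The data frame and its column names are hypothetical, and the files are written to temporary locations.

```r
# Illustrative data frame; the column names are made up for this example.
df <- data.frame(id = 1:3, group = c("a", "b", "c"))

# write.table() with an explicit tab separator.
# row.names = FALSE drops the row index; quote = FALSE leaves strings unquoted.
txt_file <- tempfile(fileext = ".txt")
write.table(df, txt_file, sep = "\t", row.names = FALSE, quote = FALSE)

# write.csv() is a wrapper around write.table() with comma-separated defaults.
csv_file <- tempfile(fileext = ".csv")
write.csv(df, csv_file, row.names = FALSE)

# Show the raw lines of the tab-separated file.
readLines(txt_file)
```

Passing row.names = FALSE is usually what you want when the file will be read by other software, since the row index otherwise becomes an unnamed first column.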

For large CSV files, it is possible to use the ff package.

Fixed width text files
Use read.fwf() to read fixed width text files and write.fwf() (gdata) to write them.
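A minimal sketch of reading a fixed width file with base R's read.fwf() follows. The two-column layout (a 6-character name followed by a 2-character age) is an assumption chosen for illustration.

```r
# Write a small fixed width file: columns 1-6 hold the name, columns 7-8 the age.
fwf_file <- tempfile(fileext = ".txt")
writeLines(c("Alice 30", "Bob   25"), fwf_file)

# widths gives the number of characters in each field.
# strip.white = TRUE removes the padding around character fields.
people <- read.fwf(fwf_file,
                   widths = c(6, 2),
                   col.names = c("name", "age"),
                   strip.white = TRUE,
                   stringsAsFactors = FALSE)

people
```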

Some fixed width text files are provided with a SAS script to import them. Anthony Damico has created the SAScii package to import such data easily.

Unstructured text files

 * See readLines() and scan() in the Reading and writing text files section.
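As a brief sketch, unstructured text can be read line by line with readLines() or token by token with scan(), both from base R. The sample text below is made up for illustration.

```r
# Write two lines of free text to a temporary file (illustrative content).
txt_file <- tempfile(fileext = ".txt")
writeLines(c("First line of text.", "Second line, with more words."), txt_file)

# readLines() returns a character vector with one element per line.
text_lines <- readLines(txt_file)

# scan() with what = "character" splits the file on whitespace.
words <- scan(txt_file, what = "character", quiet = TRUE)

length(text_lines)
length(words)
```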

Stata (dta)

 * We can read Stata data using read.dta() in the foreign package and export to Stata data format using write.dta().
 * Note that string variables in the older Stata formats are limited to 244 characters. This can be an issue when exporting.
 * See also Stata.file() in the memisc package and stata.get() in the Hmisc package.
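A minimal round trip through the Stata format can be sketched with the foreign package as follows. The data frame and its column names are hypothetical, and the example assumes foreign is installed (it ships with most R distributions as a recommended package).

```r
library(foreign)  # provides read.dta() and write.dta()

# Illustrative data frame; the column names are made up for this example.
df <- data.frame(id = 1:3, income = c(1200.5, 980, 1510.25))

# Export to a temporary .dta file, then import it again.
dta_file <- tempfile(fileext = ".dta")
write.dta(df, dta_file)
back <- read.dta(dta_file)

back
```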

SAS (sas7bdat)
Experimental support for SAS databases having the sas7bdat extension is provided by the sas7bdat package. However, sas7bdat files generated by 64 bit versions of SAS, and SAS running on non-Microsoft Windows platforms are not yet supported.

SAS (xpt)

 * See also sasxport.get() and sas.get() in the Hmisc package.
 * See also the SASxport package.

SPSS (sav)

 * read.spss() (foreign) and spss.get() (Hmisc)

EViews
readEViews() in the hexView package reads EViews files.

Excel (xls,xlsx)
Importing data from Excel is not always straightforward, and the solution depends on your operating system. If none of the methods below works, you can always export each Excel worksheet to CSV format and read the CSV file in R. This is often the simplest and quickest solution.

XLConnect supports reading and writing both xls and xlsx file formats. Since it is based on Apache POI it only requires a Java installation and as such works on many platforms including Windows, UNIX/Linux and Mac. Besides reading & writing data it provides a number of additional features such as adding plots, cell styling & style actions and many more.

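The RODBC package is another solution: it reads the workbook through an ODBC driver.

The xlsReadWrite package is no longer available from the CRAN repositories, but old versions can be downloaded from the CRAN archive. Its import function takes the following arguments: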
 * "sheet" specifies the name or the number of the sheet you want to import.
 * "from" specifies the first row of the spreadsheet.

The gnumeric package. This package uses an external program called ssconvert, which is usually installed with gnumeric, the GNOME office spreadsheet. The read.gnumeric.sheet() function reads xls and xlsx files.

See also the xlsx package for Excel 2007 documents and read.xls() (gdata).

Google Spreadsheets
Make the spreadsheet public and publish it as a CSV file. You can then read it in R using read.csv(). See the Revolution Computing blog for more details. See also the RGoogleDocs package.

gnumeric spreadsheets
The gnumeric package: read.gnumeric.sheet() reads one sheet, and read.gnumeric.workbook() reads all sheets and stores them in a list.

OpenOffice and LibreOffice (ods)
readODS does not require external dependencies, which makes it cross-platform.

speedR is another alternative. Its graphical user interface returns the equivalent command line so that the import can be replicated.

JSON
JSON (JavaScript Object Notation) is a very common format on the internet. The rjson package makes it easy to import data from JSON using the fromJSON function.

It is easy to export a list or a data frame to JSON using the toJSON function.
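A minimal sketch of a JSON round trip with rjson follows. It assumes the rjson package is installed, and the list contents are made up for illustration.

```r
library(rjson)  # provides toJSON() and fromJSON()

# Illustrative named list; the field names are made up for this example.
record <- list(name = "a", values = c(1, 2, 3))

# Export the list to a JSON string.
json_string <- toJSON(record)

# Import the JSON string back into an R list.
back <- fromJSON(json_string)

back$values
```

fromJSON() also accepts a file argument, which is useful when the JSON is stored on disk rather than held in a string.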

Sometimes the JSON data can be more complex with structures such as nested arrays. In this case you may find it more useful to use an online converter like json-csv.com to convert the file to CSV. Then import the resulting data as per the CSV instructions above.

dBase (dbf)
read.dbf() in the foreign package reads dBase files.
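As a hedged sketch, a dBase file can be written and read back with write.dbf() and read.dbf() from the foreign package. The data frame is hypothetical, and the column names are kept short because the DBF format limits field names to 10 characters.

```r
library(foreign)  # provides read.dbf() and write.dbf()

# Illustrative data frame with short column names.
df <- data.frame(id = 1:3, value = c(1.5, 2.5, 3.5))

# Export to a temporary .dbf file, then import it again.
dbf_file <- tempfile(fileext = ".dbf")
write.dbf(df, dbf_file)
back <- read.dbf(dbf_file)

back
```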

Hierarchical Data Format (hdf5)
hdf5 data can be read using the hdf5 package.

DICOM and NIfTI

 * See "Working with the {DICOM} and {NIfTI} Data Standards in R" in the Journal of Statistical Software

Resources

 * R Data Manual.
 * Paul Murrell's Introduction to Data Technologies.