IO tools (text, CSV, HDF5, …)

The pandas I/O API is a set of top-level reader functions, accessed like pandas.read_csv(), that generally return a pandas object. The corresponding writer functions are object methods, accessed like DataFrame.to_csv(). The table below lists the available readers and writers.

Format Type  Data Description       Reader          Writer
text         CSV                    read_csv        to_csv
text         Fixed-Width Text File  read_fwf        -
text         JSON                   read_json       to_json
text         HTML                   read_html       to_html
text         Local clipboard        read_clipboard  to_clipboard
binary       MS Excel               read_excel      to_excel
binary       OpenDocument           read_excel      -
binary       HDF5 Format            read_hdf        to_hdf
binary       Feather Format         read_feather    to_feather
binary       Parquet Format         read_parquet    to_parquet
binary       ORC Format             read_orc        -
binary       Msgpack                read_msgpack    to_msgpack
binary       Stata                  read_stata      to_stata
binary       SAS                    read_sas        -
binary       SPSS                   read_spss       -
binary       Python Pickle Format   read_pickle     to_pickle
SQL          SQL                    read_sql        to_sql
SQL          Google BigQuery        read_gbq        to_gbq
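
The reader/writer pairing listed in the table can be sketched with the CSV pair. This is a minimal illustration, not a recommended workflow: the sample data, the file name example.csv, and the variable names are hypothetical, and a temporary directory is used only so the sketch leaves nothing behind.

```python
import os
import tempfile

import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({"name": ["a", "b", "c"], "value": [1, 2, 3]})

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name inside a throwaway directory.
    csv_path = os.path.join(tmp_dir, "example.csv")

    # Writer: a method on the DataFrame object.
    # index=False keeps the row index out of the file.
    df.to_csv(csv_path, index=False)

    # Reader: a top-level pandas function that returns a DataFrame.
    round_tripped = pd.read_csv(csv_path)

print(round_tripped)
```

Other pairs in the table follow the same shape (a top-level read_* function and a to_* method), although many of them need optional dependencies installed.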

Here is an informal performance comparison for some of these IO methods.

Note

For examples that use the StringIO class, make sure you import it according to your Python version, i.e. from StringIO import StringIO for Python 2 and from io import StringIO for Python 3.
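
As a small sketch of the StringIO usage mentioned in the note, assuming Python 3, an in-memory string can stand in for a file when calling a reader. The CSV text and variable names here are made up for illustration.

```python
from io import StringIO  # Python 3 location of StringIO

import pandas as pd

# Hypothetical CSV content held in memory instead of on disk.
csv_text = "a,b,c\n1,2,3\n4,5,6\n"

# read_csv accepts a file-like object, so the string is wrapped in StringIO.
frame_from_text = pd.read_csv(StringIO(csv_text))

print(frame_from_text)
```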
