How will you efficiently load data from a text file?

rohit-lotlikar-b6b3af34 · 22 November 2021 12:55

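We can use numpy.loadtxt, which skips comment lines (the comments parameter) and a given number of leading header lines (the skiprows parameter). If the file also has footer lines to skip, numpy.genfromtxt offers skip_header and skip_footer.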
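
As a minimal sketch of those parameters, the snippet below reads two small in-memory "files". The contents and variable names are made up for illustration, and io.StringIO stands in for a real path on disk.

```python
import io
import numpy as np

# Hypothetical file contents; io.StringIO stands in for a real file on disk.
text = io.StringIO(
    "x,y,z\n"              # header line, skipped via skiprows=1
    "# a comment line\n"   # ignored because it starts with the comments marker
    "1.0,2.0,3.0\n"
    "4.0,5.0,6.0\n"
)
data = np.loadtxt(text, delimiter=",", skiprows=1, comments="#")
print(data.shape)

# Hypothetical file with a trailing footer line that is not data.
text_with_footer = io.StringIO(
    "x,y\n"
    "1,2\n"
    "3,4\n"
    "END,END\n"
)
trimmed = np.genfromtxt(
    text_with_footer, delimiter=",", skip_header=1, skip_footer=1
)
print(trimmed.shape)
```

With a real file, pass its path as the first argument in place of the StringIO object.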

This method is usually fast enough. If it is still too slow, the better fix is to store the data in a more efficient format than plain text, such as .npy or HDF5. Which alternatives are available depends on the version of NumPy in use.

The following file formats are available:

Text files: Slow to read and large on disk, but portable and human-readable.
Raw binary: Fast, but the file carries no metadata (such as shape or data type), so it is not portable.
Pickle: Somewhat slow; portable, but compatibility depends on the NumPy version.
HDF5: A high-powered "kitchen sink" format that can be read and written from Python with the PyTables and h5py libraries.
.npy: NumPy’s native binary data format, which is simple, efficient and portable.
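
To illustrate the difference between the text and .npy options, here is a small sketch that writes the same array both ways and reads it back. The array, the file names and the temporary directory are assumptions made for the example.

```python
import os
import tempfile
import numpy as np

# Example array; any numeric array would do.
arr = np.arange(12, dtype=np.float64).reshape(3, 4)

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "data.txt")  # hypothetical file names
    npy_path = os.path.join(tmp, "data.npy")

    np.savetxt(txt_path, arr)  # human-readable text, parsed on every load
    np.save(npy_path, arr)     # binary .npy, stores shape and dtype in the file

    from_txt = np.loadtxt(txt_path)
    from_npy = np.load(npy_path)

print(np.array_equal(from_txt, arr), np.array_equal(from_npy, arr))
```

If the data is loaded many times, converting it once with np.save and reading it with np.load avoids parsing the text on each run.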