How to select columns from a DataFrame in Python?

Pandas DataFrame.iloc[]

DataFrame.iloc[] selects rows and columns by integer position. It is useful when the index labels of the DataFrame are something other than the default numeric series 0, 1, 2, …, n, or when the user does not know the index labels.

We can extract rows by using an implicit index position that is not displayed in the DataFrame. The indexer is integer-based (from 0 to length-1 of the axis), but it may also be used with a boolean array.

The allowed inputs for .iloc[] are:

  • Integer value, e.g. 7.
  • List or array of integers, e.g., [2, 5, 6].
  • Slice object with ints, e.g., 1:9.
  • A boolean array.
  • A callable function with one argument (the calling Series or DataFrame) that returns valid output for indexing, i.e. one of the inputs above.
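
The input types listed above can be tried side by side. The following is a minimal sketch on a small made-up DataFrame; the data and the variable names are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data with string index labels, used only for illustration.
df = pd.DataFrame(
    {"p": [2, 200, 2000], "q": [4, 400, 4000], "r": [6, 600, 6000]},
    index=["x", "y", "z"],
)

single_row = df.iloc[1]                        # integer: second row, as a Series
picked_rows = df.iloc[[0, 2]]                  # list of integers: first and third rows
sliced_rows = df.iloc[0:2]                     # slice: positions 0 and 1 (end excluded)
masked_rows = df.iloc[[True, False, True]]     # boolean list: one flag per row
edge_rows = df.iloc[lambda d: [0, len(d) - 1]] # callable returning a list of positions
first_two_cols = df.iloc[:, 0:2]               # all rows, columns at positions 0 and 1
```

Note that the string labels "x", "y", "z" play no role here: every selector refers to a position, not to a label.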

.iloc[] raises an IndexError if a requested indexer is out of bounds, except for slice indexers, which allow out-of-bounds indexing.
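
The difference between a single out-of-bounds position and an out-of-bounds slice can be seen in a short sketch. The three-row DataFrame below is an assumption made up for this illustration.

```python
import pandas as pd

# Hypothetical data: a DataFrame with only three rows (positions 0, 1, 2).
df = pd.DataFrame({"p": [2, 200, 2000]})

# A single integer position past the end raises IndexError.
try:
    df.iloc[5]
    raised = False
except IndexError:
    raised = True

# A slice that runs past the end is silently trimmed to the rows that exist.
clipped = df.iloc[1:10]
```

Here `raised` ends up True, while `clipped` simply holds the rows at positions 1 and 2.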

Syntax:

pandas.DataFrame.iloc[]

Parameters:
None

Returns:
It returns a DataFrame or a Series.

Example:

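import pandas as pd
# Each dictionary becomes one row; all rows share the keys p, q, r, s
mydict = [{'p': 2, 'q': 4, 'r': 6, 's': 8},
          {'p': 200, 'q': 400, 'r': 600, 's': 800},
          {'p': 2000, 'q': 4000, 'r': 6000, 's': 8000}]
info = pd.DataFrame(mydict)
# A single integer position returns that row as a Series
print(type(info.iloc[0]))
print(info.iloc[0])

Output:

<class 'pandas.core.series.Series'>
p    2
q    4
r    6
s    8
Name: 0, dtype: int64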

Pandas DataFrame.loc[]

DataFrame.loc[] is used to retrieve a group of rows and columns by label or by a boolean array. It accepts index labels only; if the requested labels exist in the calling DataFrame, it returns the matching rows, columns, or single value, and a label that is not found generally raises a KeyError.

DataFrame.loc[] is primarily label-based, but it may also be used with a boolean array.

The allowed inputs for .loc[] are:

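  • Single label, e.g., 7 or 'a'. Here, 7 is interpreted as a label of the index, not as an integer position.
  • List or array of labels, e.g., ['x', 'y', 'z'].
  • Slice object with labels, e.g., 'x':'f'. Unlike ordinary Python slices, both the start and the stop label are included.
  • A boolean array of the same length as the axis being indexed, e.g., [True, True, False].
  • A callable function with one argument (the calling Series or DataFrame) that returns valid output for indexing.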
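
Each of the input types above can be sketched on a small DataFrame. The data, index labels, and variable names below are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data with string index labels.
df = pd.DataFrame(
    {"Age": [32, 41, 44], "Name": ["Phill", "William", "Terry"]},
    index=["x", "y", "z"],
)

one_row = df.loc["y"]                      # single label: that row, as a Series
some_rows = df.loc[["x", "z"]]             # list of labels: a DataFrame
label_slice = df.loc["x":"y"]              # label slice: BOTH endpoints are included
masked = df.loc[[True, True, False]]       # boolean list, same length as the index
over_40 = df.loc[lambda d: d["Age"] > 40]  # callable returning a boolean Series
```

The label slice is the case most likely to surprise: `df.loc["x":"y"]` returns two rows, whereas the positional `df.iloc[0:1]` would return only one.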

Syntax:

pandas.DataFrame.loc[]

Parameters:
None

Returns:
It returns a scalar, a Series, or a DataFrame.

Example:

# importing pandas as pd

import pandas as pd  
# Creating the DataFrame  
info = pd.DataFrame({'Age':[32, 41, 44, 38, 33],   
                   'Name':['Phill', 'William', 'Terry', 'Smith', 'Parker']})   
# Create the index   
index_ = ['Row_1', 'Row_2', 'Row_3', 'Row_4', 'Row_5']   
  
# Set the index   
info.index = index_   
  
# return the value   
final = info.loc['Row_2', 'Name']   
  
# Print the result   
print(final)  

Output:

William
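
The example above returns a scalar because both a row label and a column label were given. As a rough sketch of how the return type follows the shape of the selector, the snippet below uses a cut-down, two-row copy of the same data; the variable names are assumptions for illustration.

```python
import pandas as pd

# Two-row version of the example data, rebuilt so the snippet runs on its own.
info = pd.DataFrame(
    {"Age": [32, 41], "Name": ["Phill", "William"]},
    index=["Row_1", "Row_2"],
)

as_scalar = info.loc["Row_2", "Name"]      # row label + column label: a scalar
as_series = info.loc["Row_2"]              # single row label: a Series
as_frame = info.loc[["Row_2"], ["Name"]]   # lists of labels: a DataFrame

print(type(as_scalar).__name__, type(as_series).__name__, type(as_frame).__name__)
```

Wrapping a label in a list is what keeps the result two-dimensional, even when only one row and one column are selected.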

Example 2:

# importing pandas as pd  
import pandas as pd  
# Creating the DataFrame  
info = pd.DataFrame({"P":[28, 17, 14, 42, None],    
                   "Q":[15, 23, None, 15, 12],    
                   "R":[11, 23, 16, 32, 42],    
                   "S":[41, None, 34, 25, 18]})    
# Create the index   
index_ = ['A', 'B', 'C', 'D', 'E']   
# Set the index   
info.index = index_   
# Print the DataFrame  
print(info)  

Output:

      P     Q   R     S
A  28.0  15.0  11  41.0
B  17.0  23.0  23   NaN
C  14.0   NaN  16  34.0
D  42.0  15.0  32  25.0
E   NaN  12.0  42  18.0

Now we use DataFrame.loc[] to select columns. The colon keeps every row, and the list of labels picks the columns P and S.

# return the values   
result = info.loc[:, ['P', 'S']]   
# Print the result   
print(result)  

Output:

      P     S
A  28.0  41.0
B  17.0   NaN
C  14.0  34.0
D  42.0  25.0
E   NaN  18.0
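
For comparison, the same two columns can also be selected by position with iloc[] or with plain bracket indexing. The sketch below rebuilds the DataFrame from Example 2 so that it runs on its own; the variable names by_label, by_position, and by_brackets are assumptions for illustration.

```python
import pandas as pd

# Same data as Example 2, rebuilt so the snippet is self-contained.
info = pd.DataFrame(
    {"P": [28, 17, 14, 42, None],
     "Q": [15, 23, None, 15, 12],
     "R": [11, 23, 16, 32, 42],
     "S": [41, None, 34, 25, 18]},
    index=["A", "B", "C", "D", "E"],
)

by_label = info.loc[:, ["P", "S"]]    # label-based: columns named P and S
by_position = info.iloc[:, [0, 3]]    # position-based: first and fourth columns
by_brackets = info[["P", "S"]]        # bracket indexing with a list of column names

# DataFrame.equals treats NaN values in the same position as equal.
same = by_label.equals(by_position) and by_label.equals(by_brackets)
```

Label-based selection tends to be the safer choice when the column order might change, since positions 0 and 3 would then point at different columns.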