
Pandas Select Columns

This article will discuss selecting a column or a subset of columns from a pandas DataFrame.

Sample DataFrame

In this article, we will use a sample DataFrame as shown in the example code below:

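# Import pandas under its conventional alias
import pandas as pd

# Build a three-row sample DataFrame with a custom index (1, 2, 3)
df = pd.DataFrame({
    'products': ['Product1', 'Product2', 'Product3'],
    'price': [100.9, 10.33, 12.00],
    'quantity': [100, 10, 34]},
    index=[1, 2, 3]
)

# Display the DataFrame (in a script, use print(df) instead)
df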

The resulting DataFrame has three rows, indexed 1 to 3, and three columns: products, price, and quantity.

Feel free to use your own dataset to follow along.

Select Columns by Index

The first method we will discuss is selecting columns by their integer positions, which start at 0. For that, we can use the iloc indexer.

The syntax is expressed below:

DataFrame.iloc[rows_to_select, [column_indices]]

For example, to get the first and second columns (including all rows), we can do the following:

print(df.iloc[:, [0,1]])

This returns the products and price columns for all three rows.
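As a minimal sketch, the snippet below rebuilds the sample DataFrame and repeats the positional selection so the result can be checked directly. The variable name first_two is only illustrative and does not appear in the original example.

```python
import pandas as pd

# Rebuild the sample DataFrame used throughout this article
df = pd.DataFrame(
    {
        'products': ['Product1', 'Product2', 'Product3'],
        'price': [100.9, 10.33, 12.00],
        'quantity': [100, 10, 34],
    },
    index=[1, 2, 3],
)

# All rows (:), columns at positions 0 and 1
first_two = df.iloc[:, [0, 1]]

print(list(first_two.columns))  # ['products', 'price']
print(first_two.shape)          # (3, 2)
```

Passing a list of positions also lets you pick non-adjacent columns, for example [0, 2] for products and quantity.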

Select Columns by Index Range

We can also select multiple columns by specifying a range of positions as a slice. The end of an iloc slice is excluded, so in our sample DataFrame the slice 0:3 selects the columns at positions 0, 1, and 2:

df.iloc[:, 0:3]

Since the sample DataFrame has only three columns, this returns the entire DataFrame.
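To make the slice bounds concrete, the sketch below compares two positional slices on the sample DataFrame. It assumes the same data as above; the names all_cols and first_two are illustrative.

```python
import pandas as pd

# Rebuild the sample DataFrame used throughout this article
df = pd.DataFrame(
    {
        'products': ['Product1', 'Product2', 'Product3'],
        'price': [100.9, 10.33, 12.00],
        'quantity': [100, 10, 34],
    },
    index=[1, 2, 3],
)

# The stop value of an iloc slice is exclusive
all_cols = df.iloc[:, 0:3]   # positions 0, 1, 2
first_two = df.iloc[:, 0:2]  # positions 0, 1

print(list(all_cols.columns))   # ['products', 'price', 'quantity']
print(list(first_two.columns))  # ['products', 'price']
```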

Select Columns by Name

To select columns by name, we can pass a list of column names inside the indexing brackets, using the syntax shown below:

DataFrame[['column_name1', 'column_name2', ...]]

For example:

df[['products', 'price']]

This returns a DataFrame containing only the products and price columns.
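The number of brackets matters when selecting by name. As a short sketch that goes slightly beyond the original example, the snippet below contrasts a single column name, which yields a Series, with a list of names, which yields a DataFrame. The variable names are illustrative.

```python
import pandas as pd

# Rebuild the sample DataFrame used throughout this article
df = pd.DataFrame(
    {
        'products': ['Product1', 'Product2', 'Product3'],
        'price': [100.9, 10.33, 12.00],
        'quantity': [100, 10, 34],
    },
    index=[1, 2, 3],
)

price_series = df['price']         # single name: a Series
price_frame = df[['price']]        # list with one name: a DataFrame
subset = df[['products', 'price']] # list of names: a DataFrame

print(type(price_series).__name__)  # Series
print(type(price_frame).__name__)   # DataFrame
print(subset.shape)                 # (3, 2)
```

Requesting a name that does not exist in the DataFrame raises a KeyError, so it can help to check df.columns first.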

Select Columns Between Column Names

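In some instances, you may need to select every column between two column names. For that, we can use a label slice with loc. Unlike an iloc slice, a loc slice includes both the start and the end column. The syntax is shown below: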

DataFrame.loc[:, 'start_column':'end_column']

In our example DataFrame, we can do:

df.loc[:, 'products':'quantity']

Because products is the first column and quantity is the last, this returns all three columns.
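The difference between label slices and positional slices is easy to miss, so here is a minimal sketch on the sample DataFrame. The names by_label and by_position are illustrative.

```python
import pandas as pd

# Rebuild the sample DataFrame used throughout this article
df = pd.DataFrame(
    {
        'products': ['Product1', 'Product2', 'Product3'],
        'price': [100.9, 10.33, 12.00],
        'quantity': [100, 10, 34],
    },
    index=[1, 2, 3],
)

# Label slice with loc: both endpoints are included
by_label = df.loc[:, 'products':'price']

# Positional slice with iloc: the stop position is excluded
by_position = df.iloc[:, 0:1]

print(list(by_label.columns))     # ['products', 'price']
print(list(by_position.columns))  # ['products']
```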

Closing

In this article, we covered how to select columns in a pandas DataFrame by index position, by index range, by column name, and between two column names.

Thanks for reading!!


Source: linuxhint.com
