by Arround The Web

Pandas Filter by Date

The Pandas library is used for many data analysis tasks, and one of them is filtering data by date. Filtering by date lets you select the rows that fall within a specific time period or compare data across different periods. There are several ways to filter Pandas DataFrame rows by date, depending on the format and structure of the data.

This Python tutorial shows how to filter a Pandas DataFrame by date using three approaches: the “df.loc[]” indexer, the “df.query()” method, and the “isin()” method combined with the “pandas.date_range()” function.

Method 1: Filtering Pandas DataFrame by Dates Using “df.loc[]” Method

In Pandas, “df.loc[]” is a label-based indexer that selects rows and columns by their labels or by a boolean mask. Passing it a boolean mask built from date comparisons keeps only the rows whose dates satisfy the condition. For example:

import pandas
df = pandas.DataFrame({'courses': ['Python', 'Java', 'Linux', 'C++', 'C#', 'JavaScript'],'start_date': ['2023-02-01', '2023-03-01', '2023-04-01', '2023-05-01', '2023-06-01', '2023-07-01']})
df['start_date'] = pandas.to_datetime(df['start_date'], format='%Y-%m-%d')
print(df.loc[(df['start_date'] >= '2023-04-01') & (df['start_date'] < '2023-07-01')])

In the above code:

  • The “pandas” module is imported.
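  • The “DataFrame()” function is used to create the DataFrame with the “courses” and “start_date” columns.
  • The “to_datetime()” function converts the “start_date” column from strings to the datetime64 data type, parsing each value with the “%Y-%m-%d” format.
  • The “loc[]” indexer filters the DataFrame with a boolean mask that combines two date conditions using the “&” operator. Each condition is wrapped in parentheses because “&” binds more tightly than the comparison operators.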

Output

The DataFrame has been filtered to the rows whose “start_date” is on or after 2023-04-01 and before 2023-07-01, which are the “Linux”, “C++”, and “C#” rows.
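The same date window can be written in a couple of equivalent ways. The following is a minimal sketch, not the only way to do it: the names “window_start”, “window_end”, and “mask” are illustrative, and the “inclusive='left'” argument of “Series.between()” assumes pandas 1.3 or newer.

```python
import pandas

df = pandas.DataFrame({
    'courses': ['Python', 'Java', 'Linux', 'C++', 'C#', 'JavaScript'],
    'start_date': ['2023-02-01', '2023-03-01', '2023-04-01',
                   '2023-05-01', '2023-06-01', '2023-07-01'],
})
df['start_date'] = pandas.to_datetime(df['start_date'], format='%Y-%m-%d')

# Illustrative names for the window bounds (not part of the original example).
window_start = pandas.Timestamp('2023-04-01')
window_end = pandas.Timestamp('2023-07-01')

# Build the boolean mask once, then hand it to loc[].
mask = (df['start_date'] >= window_start) & (df['start_date'] < window_end)
filtered = df.loc[mask]

# Series.between() expresses the same half-open window;
# inclusive='left' keeps the start bound and drops the end bound.
filtered_between = df.loc[
    df['start_date'].between(window_start, window_end, inclusive='left')
]

print(filtered)
```

Storing the mask in a variable makes it easier to reuse or to combine with other conditions later.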

Method 2: Filtering Pandas DataFrame by Dates Using “df.query()” Method

The “df.query()” method takes a boolean expression as a string, evaluates it against the DataFrame columns, and returns the rows for which the expression is “True”. This method can be utilized to filter the Pandas DataFrame based on the dates in a specified column. Take the below example:

import pandas
df = pandas.DataFrame({'courses': ['Python', 'Java', 'Linux', 'C++', 'C#', 'JavaScript'],'start_date': ['2023-05-01','2023-03-01', '2023-08-01', '2023-02-01', '2023-04-01', '2023-07-01']})
df['start_date'] = pandas.to_datetime(df['start_date'], format='%Y-%m-%d')
print(df.query("start_date >= '2023-03-01' and start_date < '2023-05-30'"))

In the above code:

  • The “query()” method filters the DataFrame by evaluating the expression string, which refers to the “start_date” column by name and joins the two date conditions with “and”.
  • The filtered rows are passed to the “print()” function to display them on the console.

Output

The DataFrame has been filtered to the rows whose “start_date” is on or after 2023-03-01 and before 2023-05-30, which are the “Python”, “Java”, and “C#” rows.
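When the date bounds come from variables rather than literals, “query()” can reference local Python variables by prefixing them with “@”. The sketch below assumes the same DataFrame as above; the variable names “start”, “end”, and “result” are illustrative.

```python
import pandas

df = pandas.DataFrame({
    'courses': ['Python', 'Java', 'Linux', 'C++', 'C#', 'JavaScript'],
    'start_date': ['2023-05-01', '2023-03-01', '2023-08-01',
                   '2023-02-01', '2023-04-01', '2023-07-01'],
})
df['start_date'] = pandas.to_datetime(df['start_date'], format='%Y-%m-%d')

# Illustrative bounds held in local variables.
start = pandas.Timestamp('2023-03-01')
end = pandas.Timestamp('2023-05-30')

# "@name" inside the expression refers to a local Python variable.
result = df.query("start_date >= @start and start_date < @end")

print(result)
```

This avoids building the expression with string formatting, which is easy to get wrong when the dates change.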

Method 3: Filtering Pandas DataFrame by Dates Using “isin()” Method

In Pandas, the “isin()” method checks, element by element, whether each value is contained in a given collection of values. Calling it on the date column with the dates generated by the “pandas.date_range()” function produces a boolean mask that filters the DataFrame to the specified dates. For further understanding, let’s look at the below code:

import pandas
df = pandas.DataFrame({'courses': ['Python', 'Java', 'Linux', 'C++', 'C#', 'JavaScript'],'start_date': ['2023-05-01','2023-03-01', '2023-08-01', '2023-02-01', '2023-04-01', '2023-07-01']})
df['start_date'] = pandas.to_datetime(df['start_date'], format='%Y-%m-%d')
print(df[df["start_date"].isin(pandas.date_range('2023-06-21', '2023-09-30'))])

In the above code:

  • The DataFrame is created, and the data type of the “start_date” column is changed to datetime64.
  • The “pandas.date_range()” function generates one timestamp per day from 2023-06-21 to 2023-09-30, with both endpoints included.
  • The “isin()” method is called on the “start_date” column to keep only the rows whose date appears in that range.

Output

The DataFrame has been filtered to the rows whose “start_date” falls between 2023-06-21 and 2023-09-30, which are the “Linux” and “JavaScript” rows.
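One point to keep in mind is that “isin()” compares exact values, and “pandas.date_range()” with its default daily frequency produces timestamps at midnight. If the column holds timestamps with a time of day, those rows will not match. The following sketch uses made-up data (the “events” DataFrame is hypothetical) to show the difference, and one possible workaround using “dt.normalize()” to reset each timestamp to midnight before the check.

```python
import pandas

# Hypothetical data: the second timestamp carries a time of day.
events = pandas.DataFrame({
    'courses': ['Linux', 'JavaScript'],
    'start_date': pandas.to_datetime(['2023-08-01 00:00', '2023-07-01 09:30']),
})

# One midnight timestamp per day, both endpoints included.
days = pandas.date_range('2023-06-21', '2023-09-30')

# Exact matching: only the midnight timestamp is found in the range.
exact = events[events['start_date'].isin(days)]

# Normalizing to midnight first matches on the calendar day instead.
by_day = events[events['start_date'].dt.normalize().isin(days)]

print(exact['courses'].tolist())   # ['Linux']
print(by_day['courses'].tolist())  # ['Linux', 'JavaScript']
```

For data with time components, a range comparison such as the one in Method 1 is often the simpler choice.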

Conclusion

In Pandas, the “df.loc[]” indexer, the “df.query()” method, and the “isin()” method can each be used to filter a DataFrame by date. The “df.loc[]” indexer filters the DataFrame with a boolean mask built from date conditions. The “df.query()” method accepts the same conditions written as a string expression. The “isin()” method can be combined with the “pandas.date_range()” function to keep the rows whose dates fall in a generated range. In every case, the date column is first converted with “pandas.to_datetime()”. This tutorial presented a guide on filtering a DataFrame based on dates.


Source: linuxhint.com
