| by Arround The Web | No comments

Pandas Between Dates

Pandas is a well-known Python library for analyzing and manipulating data. It provides various modules and functions for different kinds of tasks. Pandas can be used to filter and select data based on dates, for example, we can analyze sales data for a specific month, quarter, or year or compare the performance of different products over time using the various Pandas methods.

This post presents a comprehensive tutorial on selecting Pandas DataFrame rows between specified dates.

How to Select Rows Between Two Pandas DataFrame Dates?

To select rows between two dates in Pandas DataFrame, the following methods are used in Python:

Method 1: Select Rows Between Two Pandas DataFrame Dates Using the “df.loc[]”

In Python, the “dataframe.loc[]” method is used to extract or select the data frame rows by accepting the index value/labels as an argument. This method can be utilized to select rows between two dates from the Pandas DataFrame. To understand it, let’s look at an example:

import pandas
data1 = ({'Name':["Python","Java","C++","Linux"],
          'id_no' :[22, 50, 33, 44], 'Fee':[500, 300, 100, 120],
          'starting_dates':["2023-03-04","2023-04-04","2023-05-04","2023-06-04"]})
df = pandas.DataFrame(data1)
print('Given DataFrame:\n', df)

date1 = '2023-01-25'
date2 = '2023-04-25'
print('\n', df.loc[(df['starting_dates'] > date1) & (df['starting_dates'] <= date2)])

In the above code:

  • The “pandas” library is imported, and the “pandas.DataFrame()” function takes the input data, including specified dates, as an argument and retrieves the DataFrame.
  • The specified start and end dates are assigned to the variables.
  • The “loc()” method selects the rows based on the specified condition passed as an argument.

Output

The names “Python” and “Java” have been selected according to the particular date.

Method 2: Select Rows Between Two Pandas DataFrame Dates Using the “df.query()”

The “df.query()” method takes the string query expression as an argument and retrieves the DataFrame. This method is utilized to retrieve the rows from the particular DataFrame. To demonstrate how to select/get rows between two dates using the “df.query()” method utilize this example:

import pandas
data1 = ({'Name':["Python","Java","C++","Linux"],
          'id_no' :[22, 50, 33, 44], 'Fee':[500, 300, 100, 120],
          'starting_dates':["2023-03-04","2023-04-04","2023-05-04","2023-06-04"]})
df = pandas.DataFrame(data1)
print('Given DataFrame:\n', df)

date1 = '2023-03-25'
date2 = '2023-05-25'
print('\n', df.query('starting_dates >= @date1 and starting_dates <= @date2'))

In the above code:

  • The “query()” method takes the string query expression as an argument and retrieves the data frame based on the passed condition.
  • The “query()” method is wrapped within the “print()” function to display the filtered data on the screen.

Output

Two rows have been extracted from the Pandas DataFrame based on the specified dates.

Method 3: Select Rows Between Two Pandas DataFrame Dates Using the “df.isin()”

In Python, the “isin()” method checks if the specified value is present/exists in the input DataFrame. This method can be utilized to fetch/get the rows between two dates in Python. The following example selects rows between two dates in Pandas DataFrame:

import pandas
Name = ["Python","Java","C++","Linux"]
id_no = [22, 50, 33, 44]
Fee = [500, 300, 100, 120]
starting_dates = ["2023-03-04","2023-04-04","2023-05-04","2023-06-04"]
df = pandas.DataFrame({"Name":Name,"Fees":Fee, "Id_No":id_no, 'starting_dates': pandas.to_datetime(starting_dates)})
print('Given DataFrame:\n', df)

date1 = '2023-03-25'
date2 = '2023-05-25'
print('\n', df[df["starting_dates"].isin(pandas.date_range("2023-05-01", "2023-07-01"))])

In this code:

  • The “DataFrame()” takes the specified data as an argument and creates a DataFrame. The “pandas.to_datetime()” takes the date list data and converts it into a DateTime object.
  • The “isin()” method extracts the rows data from the Pandas DataFrame based on the given dates.

Output

The rows in the Pandas DataFrame that fall between the specified dates have been extracted.

Conclusion

In Python, “df.loc[]”, “df.query()”, and the “df.isin()” methods are used to select the rows of Pandas DataFrame that fall within the specified dates. The “pd.DataFrame()” is used to construct DataFrame, and the specified methods extract the data from the given DataFrame based on the specified conditions. This Python post has presented a detailed guide on selecting rows between Pandas DataFrame dates.

Share Button

Source: linuxhint.com

Leave a Reply