by Arround The Web

Pandas Split Columns by Delimiter

The popular Python library “Pandas” provides the DataFrame data structure to organize and manipulate data. While working with a DataFrame, we perform different operations on it, such as adding and removing columns or manipulating the index. One such operation is splitting a string column into several columns based on a specified delimiter. This can be done using the “Series.str.split()” function.

This article presents a tutorial on splitting Pandas DataFrame columns by a delimiter. It covers the “Series.str.split()” function with two different delimiters and the “apply()” method as an alternative.

How to Split Columns by Delimiter in Pandas DataFrame?

To split a column by a delimiter in a Pandas DataFrame, the “Series.str.split()” function is used. It works like Python's built-in “str.split()” method, but it is applied to every string in the selected column (a Series) at once. Here is the syntax:

Series.str.split(pat=None, n=-1, expand=False)

In the above syntax:

  • The “pat” parameter is the string or regular expression to split on. If it is not provided (the default “None”), the strings are split on whitespace.
  • The “n” parameter limits the number of splits performed on each string. The default “-1” means there is no limit, so every occurrence of the delimiter is used.
  • The “expand” parameter controls the shape of the result. If it is set to “True”, the split pieces are expanded into separate columns and a DataFrame is returned. With the default “False”, a Series of lists is returned.
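The effect of the three parameters can be seen in a short sketch. The sample Series below is hypothetical and is not part of the article's examples:

```python
import pandas

# Hypothetical sample data, used only to illustrate the parameters
s = pandas.Series(["a b c", "d e f"])

# pat omitted: split on whitespace; the result is a Series of lists
lists = s.str.split()

# n=1: perform at most one split per string
limited = s.str.split(n=1)

# expand=True: each piece becomes its own column of a DataFrame
frame = s.str.split(expand=True)

print(lists.tolist())    # [['a', 'b', 'c'], ['d', 'e', 'f']]
print(limited.tolist())  # [['a', 'b c'], ['d', 'e f']]
print(frame.shape)       # (2, 3)
```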

Return Value

The “Series.str.split()” method returns a Series of lists by default, or a DataFrame when “expand=True” is passed.
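As a rough illustration of the two return types, the following sketch checks what comes back in each case. The Series and the variable names are made up for this illustration:

```python
import pandas

# Hypothetical sample data
s = pandas.Series(["x_1", "y_2"])

as_series = s.str.split("_")               # Series whose values are lists
as_frame = s.str.split("_", expand=True)   # DataFrame with one column per piece

print(type(as_series).__name__)   # Series
print(type(as_frame).__name__)    # DataFrame

# Without expand, a single piece can still be picked out with .str[index]
first_piece = as_series.str[0]
print(first_piece.tolist())       # ['x', 'y']
```

When only one of the pieces is needed, indexing the Series of lists this way avoids creating the intermediate DataFrame.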

Example 1: Splitting String Columns Using “_” Delimiter

The code below splits the “Info” string column into two new columns:

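import pandas
# Build a DataFrame whose 'Info' column holds "name_id" strings
df = pandas.DataFrame({'Info':["Joseph_45", "Anna_23", "Lily_15", "Henry_22"],'Age':[23, 25, 17, 39],'salary':['$543', '$200', '$220', '$250']})
print(df,'\n')
# Split 'Info' at "_" and store the two pieces in new columns
df[['Name', 'Id_no']] = df['Info'].str.split("_", expand = True)
print(df)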

Here, the “pandas” module is imported, and “pandas.DataFrame()” is used to create a DataFrame. The “str.split()” method is called on the “Info” column to split each string at the “_” delimiter. The “expand=True” parameter makes the method return a DataFrame, whose two columns are assigned to the new “Name” and “Id_no” columns.

Output

The “Info” column has been separated at the specified delimiter and split into two new columns, “Name” and “Id_no”.
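Note that the pieces produced by the split are strings, including the numeric part. If the “Id_no” values are needed as numbers, one possible follow-up step (an addition for illustration, not part of the original example) is to convert the column with “astype()”:

```python
import pandas

# Same 'Info' values as in Example 1; the other columns are omitted for brevity
df = pandas.DataFrame({'Info': ["Joseph_45", "Anna_23", "Lily_15", "Henry_22"]})
df[['Name', 'Id_no']] = df['Info'].str.split("_", expand=True)

# The split pieces are strings, so convert 'Id_no' to integers explicitly
df['Id_no'] = df['Id_no'].astype(int)

print(df)
print(df['Id_no'].sum())  # 105
```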

Example 2: Splitting String Columns Using “,” Delimiter

The following code splits the same kind of data, this time separated by a comma:

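import pandas
# Build a DataFrame whose 'Info' column holds "name,id" strings
df = pandas.DataFrame({'Info':["Joseph,45", "Anna,23", "Lily,15", "Henry,22"],'Age':[23, 25, 17, 39],'salary':['$543', '$200', '$220', '$250']})
print(df,'\n')
# Split 'Info' at "," and store the two pieces in new columns
df[['Name', 'Id_no']] = df['Info'].str.split(",", expand = True)
print(df)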

In the above code, the delimiter “,” is passed to the “Series.str.split()” function to split the “Info” string column into two new columns.

Output

The specified string column “Info” has been separated by the delimiter “,” into two new columns “Name” and “Id_no”.
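The assignment to exactly two new columns relies on every string containing the delimiter once. If some values might contain it more often, the “n” parameter described earlier can cap the number of splits. The sketch below uses a hypothetical value with an extra comma to show this:

```python
import pandas

# Hypothetical data: the second value contains two commas
df = pandas.DataFrame({'Info': ["Joseph,45", "Anna,23,extra"]})

# n=1 stops after the first comma, so the result always has two columns
parts = df['Info'].str.split(",", n=1, expand=True)
df[['Name', 'Id_no']] = parts

print(parts.shape)           # (2, 2)
print(df['Id_no'].tolist())  # ['45', '23,extra']
```

Everything after the first delimiter stays together in the second column, which keeps the column count predictable.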

Alternative Method: Split Column by Delimiter Into Two New Columns Using the “apply()” Method

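The Pandas “apply()” method applies a given function to the data. When it is called on a single column (a Series), the function is applied to each value, and if the function returns a Series, the overall result is a DataFrame. This makes it possible to split a column by a specified delimiter. Here is an example code: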

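import pandas
# Build a DataFrame whose 'Info' column holds "name_id" strings
df = pandas.DataFrame({'Info':["Joseph_45", "Anna_23", "Lily_15", "Henry_22"],'Age':[23, 25, 17, 39],'salary':['$543', '$200', '$220', '$250']})
print(df,'\n')
# Split each value with Python's str.split() and wrap the pieces in a Series
df[['Name', 'Id_no']] = df['Info'].apply(lambda x: pandas.Series(str(x).split("_")))
print(df)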

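In the above code, the “apply()” method is called on the “Info” column and takes a “lambda” function as an argument. The lambda splits each value at the specified delimiter with Python's built-in “split()” method and wraps the pieces in a Series, so the result is a DataFrame with one column per piece.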

Output

Two new columns have been created from the “Info” string column.
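For data like this, both approaches should produce the same pieces. The following sketch compares them on the “Info” values; the variable names are chosen only for this illustration:

```python
import pandas

info = pandas.Series(["Joseph_45", "Anna_23", "Lily_15", "Henry_22"])

# Vectorized string method
via_split = info.str.split("_", expand=True)

# Element-wise apply, returning a Series per value
via_apply = info.apply(lambda x: pandas.Series(str(x).split("_")))

# Compare the raw values of both results
same = via_split.values.tolist() == via_apply.values.tolist()
print(same)  # True
```

In general, “str.split()” is the more direct choice, while “apply()” tends to be slower on large columns because it builds a Series for every row. It is mainly useful when the per-value logic goes beyond a plain split.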

Conclusion

In Pandas, the “Series.str.split()” function takes a delimiter and, with “expand=True”, splits the given string column into separate columns based on that delimiter. The “apply()” method can also be used to split a string column by a specified delimiter. This tutorial showed how to split columns by delimiter in a Pandas DataFrame using several examples.


Source: linuxhint.com
