join() method

Based on a common key, combines columns from two dataframes into a single dataframe.

Syntax

left_dataframe_name.join(other = right_dataframe_name, lkey = ["left_key_column", "...n"], lcolumns = ["left_column", "...n"], rkey = ["right_key_column", "...n"], rcolumns = ["right_column", "...n"], join_type = "inner"|"left"|"right"|"outer"|"unmatched")

Parameters

Name

Description

other = right_dataframe_name

The name of the right dataframe.

lkey = ["left_key_column", "...n"]

The key column or columns in the left dataframe.

Columns are used in the order that you list them. If you are joining on more than one column, corresponding columns must be in the same order in lkey and rkey.

Note

Matching of left and right key values is case sensitive. Matching values are only considered identical if their casing is also identical.

lcolumns = ["left_column", "...n"]

optional

The columns from the left dataframe to include in the joined output dataframe. Columns are included in the order that you list them.

If you omit lcolumns, all columns from the left dataframe are included in the joined output dataframe.

If you want to exclude all left columns, specify lcolumns = [].

Note

The lkey column or columns are automatically included in the output dataframe. You do not need to re-specify them in lcolumns.

rkey = ["right_key_column", "...n"]

The key column or columns in the right dataframe.

Columns are used in the order that you list them. If you are joining on more than one column, corresponding columns must be in the same order in rkey and lkey.

Note

Matching of left and right key values is case sensitive. Matching values are only considered identical if their casing is also identical.

rcolumns = ["right_column", "...n"]

optional

The columns from the right dataframe to include in the joined output dataframe. Columns are included in the order that you list them.

If you omit rcolumns, all columns from the right dataframe are included in the joined output dataframe.

If you want to exclude all right columns, specify rcolumns = [].

rcolumns is ignored if you use the unmatched join type.

Note

The rkey column or columns are automatically included in the output dataframe. You do not need to re-specify them in rcolumns.

join_type = "inner" | "left" | "right" | "outer" | "unmatched"

optional

The type of join to perform.

If you omit join_type an inner join is performed by default.

Join type		The joined output dataframe contains:
	"inner"	All matched left rows and all matched right rows One row for each match between the left and right dataframes
	"left"	All left rows (matched and unmatched) and all matched right rows
	"right"	All right rows (matched and unmatched) and all matched left rows
	"outer"	All left and all right rows, matched and unmatched
	"unmatched"	Unmatched left rows

Returns

HCL dataframe.

Examples

Join two dataframes as a way of discovering employees who may also be vendors

The example below joins the employees dataframe (left) and the vendors dataframe (right) using address as the common key column (the Address and Vendor_Street columns).

The join() method creates a new dataframe with matched left and right rows, which results in a list of any employees and vendors with the same address.

employee_vendor_match = employees.join(other = vendors, lkey = ["Address"], lcolumns = ["Empno", "First", "Last"], rkey = ["Vendor_Street"], rcolumns = ["Vendor_No", "Vendor_Name"], join_type = "inner")

Join two dataframes as a way of discovering accounts receivable rows with no matching customer

The example below joins the accounts_receivable dataframe (left) and the customers dataframe (right) using customer number (CustNo) as the common key column.

The join() method uses the unmatched join type to create a new dataframe with unmatched left rows, which results in a list of accounts_receivable rows that are not associated with any customers row.

customers_not_found = accounts_receivable.join(other = customers, lkey = ["CustNo"], lcolumns = ["Due", "Amount"], rkey = ["CustNo"], join_type = "unmatched")

join() method

Syntax

Parameters

Returns

Examples

Join two dataframes as a way of discovering employees who may also be vendors

Join two dataframes as a way of discovering accounts receivable rows with no matching customer

Page options

Is this page helpful?

Is this page helpful?