How to deal with SettingWithCopyWarning in Pandas

Introduction

If you have recently upgraded Pandas and started seeing "SettingWithCopyWarning" messages, you might be wondering what they mean and how to resolve them. This article explains the cause of the warning and discusses several ways to handle it.

Understanding the SettingWithCopyWarning

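The "SettingWithCopyWarning" in Pandas is triggered when you assign a value to an object that Pandas suspects may be a copy of another dataframe rather than a view of it. The most common cause is chained indexing: applying two or more indexing operations (e.g., [] or .loc) one after another, as in df[df['A'] > 0]['B'] = 1. The first operation may return a copy, and the assignment then lands on that temporary copy.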

The warning tells you that the assignment may have modified a copy rather than the original dataframe. When that happens, the original is left unchanged and the new value is silently lost, which can lead to inconsistent results.
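To make the problem concrete, here is a minimal sketch of a chained assignment that never reaches the original dataframe. The dataframe and its column names are made up for illustration, and the exact warning class that gets reported depends on the installed Pandas version.

```python
import warnings

import pandas as pd

# Hypothetical toy data, not taken from the article's quote_df example.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Chained assignment: df[...] builds a temporary dataframe first,
    # then ['b'] = 0 writes into that temporary object.
    df[df["a"] > 1]["b"] = 0

# Names of the warning classes raised (version dependent).
print([w.category.__name__ for w in caught])

# The original dataframe still holds the old values.
print(df["b"].tolist())
```

The boolean selection returns a new object, so the assignment has nothing to do with df itself.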

Why does the warning occur?

In the example code for this article (listed in full under "Create a copy of the dataframe"), the warning is raised by assignments such as quote_df['TVol'] = quote_df['TVol']/TVOL_SCALE. The assignment itself is a single indexing operation. The problem is that quote_df was produced earlier by selecting a subset of columns from another dataframe, so Pandas cannot guarantee whether the assignment is modifying the original dataframe or a copy of it.

Recommended Solutions

There are several ways to handle the "SettingWithCopyWarning" in Pandas. Let's discuss each solution in detail:

Use .loc[row_indexer,col_indexer] for assignment


quote_df.loc[:, 'TVol'] = quote_df['TVol']/TVOL_SCALE
quote_df.loc[:, 'TAmt'] = quote_df['TAmt']/TAMT_SCALE
quote_df.loc[:, 'TDate'] = quote_df['TDate'].map(lambda x: x[0:4]+x[5:7]+x[8:10])

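Replacing chained indexing with a single .loc[row_indexer,col_indexer] call removes the ambiguity: the row selection, the column selection, and the assignment all happen in one operation on the dataframe itself. Note that this only helps when quote_df is not itself a slice of another dataframe. If it is, the warning can still appear, and the explicit copy described in the next section is the more reliable fix.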
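As a small illustration of the .loc approach, the sketch below rescales only the rows that match a condition, in one indexing step. The data and the TVOL_SCALE value are placeholders chosen for the example.

```python
import pandas as pd

# Hypothetical toy data and scale factor, for illustration only.
df = pd.DataFrame({"TVol": [100.0, 200.0, 300.0], "TAmt": [1.0, 2.0, 3.0]})
TVOL_SCALE = 100

# Rows and column are selected, and the value assigned, in a single .loc call,
# so the write goes to df itself.
mask = df["TVol"] > 100
df.loc[mask, "TVol"] = df.loc[mask, "TVol"] / TVOL_SCALE

print(df["TVol"].tolist())  # [100.0, 2.0, 3.0]
```

The chained form df[mask]['TVol'] = ... would have written to a temporary object instead.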

Create a copy of the dataframe


The code that triggers the warning builds quote_df by selecting columns from a larger dataframe and then assigns to it:

import pandas as pd
from io import StringIO

# str_of_all, TVOL_SCALE and TAMT_SCALE are defined elsewhere
quote_df = pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg'))
quote_df.rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'}, inplace=True)
# .ix was removed in pandas 1.0; .iloc is the positional equivalent
quote_df = quote_df.iloc[:,[0,3,2,1,4,5,8,9,30,31]]
quote_df['TClose'] = quote_df['TPrice']
quote_df['RT']     = 100 * (quote_df['TPrice']/quote_df['TPCLOSE'] - 1)
quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE
quote_df['TAmt']   = quote_df['TAmt']/TAMT_SCALE
quote_df['STK_ID'] = quote_df['STK'].str.slice(13,19)
quote_df['STK_Name'] = quote_df['STK'].str.slice(21,30)
quote_df['TDate']  = quote_df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10])

If you want to keep the plain quote_df['col'] = ... assignment syntax, make quote_df an explicit copy with .copy() before assigning to it:


quote_df = quote_df.copy()
quote_df['TVol'] = quote_df['TVol']/TVOL_SCALE
quote_df['TAmt'] = quote_df['TAmt']/TAMT_SCALE
quote_df['TDate'] = quote_df['TDate'].map(lambda x: x[0:4]+x[5:7]+x[8:10])

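Calling .copy() makes quote_df an independent dataframe, so Pandas knows that the assignments are meant for quote_df alone and no longer raises the warning. The dataframe it was selected from is left unchanged.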
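The effect of an explicit copy can be checked with a short, self-contained sketch. The dataframe raw, its values, and TVOL_SCALE are hypothetical stand-ins for the data in the article's example.

```python
import pandas as pd

# Hypothetical stand-in for the full table of quotes.
raw = pd.DataFrame({
    "STK": ["x", "y"],
    "TPrice": [10.0, 12.0],
    "TVol": [500.0, 800.0],
})
TVOL_SCALE = 100  # placeholder value

# Select a subset of columns and immediately take an explicit copy.
quote_df = raw.iloc[:, [0, 2]].copy()

# Plain column assignment now clearly targets quote_df only.
quote_df["TVol"] = quote_df["TVol"] / TVOL_SCALE

print(quote_df["TVol"].tolist())  # [5.0, 8.0]
print(raw["TVol"].tolist())       # [500.0, 800.0]
```

Taking the copy at the point of selection keeps the intent visible right where the subset is created.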

Suspending the Warning

If you want to suspend the "SettingWithCopyWarning" temporarily and continue with your code execution without seeing the warning messages, you can silence the warnings using the following code snippet:


import pandas as pd
# Turn off only the chained-assignment check (the default value is 'warn')
pd.options.mode.chained_assignment = None

However, it is recommended to fix the underlying issue causing the warning instead of suppressing the warnings altogether.
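If the warning only needs to be hidden around one known-safe statement, a narrower option is to filter it inside a with block. This is a sketch under one assumption: the warning class is looked up as pandas.errors.SettingWithCopyWarning, which is where recent Pandas versions that emit this warning expose it. Where that name is not available, the code falls back to the base Warning class, which hides every warning inside the block. The data is made up for illustration.

```python
import warnings

import pandas as pd

# Assumption: the class is available as pandas.errors.SettingWithCopyWarning.
# If it is not, fall back to Warning (this hides all warnings in the block).
swc_warning = getattr(pd.errors, "SettingWithCopyWarning", Warning)

# Hypothetical toy data.
df = pd.DataFrame({"TVol": [100.0, 200.0, 300.0], "TAmt": [1.0, 2.0, 3.0]})
subset = df[df["TVol"] > 100]  # Pandas may treat this as a possible copy

with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=swc_warning)
    # The warning is hidden only for code inside this block.
    subset["TVol"] = subset["TVol"] / 100

print(subset["TVol"].tolist())  # [2.0, 3.0]
```

Outside the with block the previous warning filters are restored, so other occurrences of the warning remain visible.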

Conclusion

The "SettingWithCopyWarning" in Pandas should not be ignored. It signals that an assignment may have modified a copy of a dataframe rather than the dataframe you intended. Using .loc for assignment or taking an explicit copy makes the target of the assignment unambiguous, and addressing the root cause in this way is preferable to suppressing the warning.