Generation

generate function
Mon, 13 Feb 2023

There is a Pandas dataframe df2:

    surgery  age  rectal_temp  pulse  respiratory_rate  extremities_temp  pain  outcome
 4      2.0    1          NaN    NaN               NaN               2.0   2.0      1.0
 6      1.0    1          NaN   60.0               NaN               3.0   NaN      2.0
 7      2.0    1          NaN   80.0              36.0               3.0   4.0      3.0
15      1.0    9          NaN  128.0              36.0               3.0   4.0      2.0
27      1.0    1          NaN    NaN               NaN               NaN   NaN      2.0
33      1.0    1          NaN  100.0              30.0               3.0   5.0      1.0
34      2.0    1          NaN  104.0              24.0               4.0   4.0      3.0
39      1.0    1          NaN   88.0               NaN               3.0   5.0      2.0

Write a function that:
1) takes the column name as a parameter (default = 'rectal_temp')
2) calculates the correlation of this column with the other columns: pulse, respiratory_rate, extremities_temp, pain_outcome
3) fills all NaNs in the 'rectal_temp' column with the average values multiplied by the correlation weight
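For experimenting with the task, the rows listed in the prompt can be rebuilt as a DataFrame. This is only a sketch of the excerpt shown; the full dataset is not part of the page. Note that every rectal_temp value in these rows is NaN, so no correlation with that column can be computed from this excerpt alone.

```python
import numpy as np
import pandas as pd

nan = np.nan

# The eight rows shown in the prompt, keyed by their original index labels.
df2 = pd.DataFrame(
    {
        "surgery": [2.0, 1.0, 2.0, 1.0, 1.0, 1.0, 2.0, 1.0],
        "age": [1, 1, 1, 9, 1, 1, 1, 1],
        "rectal_temp": [nan] * 8,
        "pulse": [nan, 60.0, 80.0, 128.0, nan, 100.0, 104.0, 88.0],
        "respiratory_rate": [nan, nan, 36.0, 36.0, nan, 30.0, 24.0, nan],
        "extremities_temp": [2.0, 3.0, 3.0, 3.0, nan, 3.0, 4.0, 3.0],
        "pain": [2.0, nan, 4.0, 4.0, nan, 5.0, 4.0, 5.0],
        "outcome": [1.0, 2.0, 3.0, 2.0, 2.0, 1.0, 3.0, 2.0],
    },
    index=[4, 6, 7, 15, 27, 33, 34, 39],
)

# Count the missing values per column.
print(df2.isna().sum())
```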

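def fill_na_by_corr(df, col_name='rectal_temp'):
    # Keep only the columns that contain at least one NaN, without 'outcome'.
    # errors='ignore' avoids a KeyError when 'outcome' has no NaNs and was never selected.
    nan_cols = df.columns[df.isna().any()].tolist()
    df_temp = df[nan_cols].drop(['outcome'], axis=1, errors='ignore')
    # Correlations are computed only on rows where col_name is present.
    df_temp = df_temp.dropna(subset=[col_name])
    # Correlation of every remaining column with col_name.
    cor = df_temp.corr()[[col_name]].drop([col_name])
    for i in cor.index:
        # Note: this fills NaNs in the other columns, not in col_name itself.
        if cor.loc[i, col_name] > 0:
            df[i] = df[i].fillna(value=df[i].mean() * cor.loc[i, col_name])
        else:
            df[i] = df[i].fillna(value=df[i].mean() / cor.loc[i, col_name])
    return df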
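The generated function fills the other columns rather than col_name itself. Below is a minimal sketch of one possible reading of step 3, where the target column is the one being filled. Everything named here is an assumption and not part of the original answer: the helper name fill_target_by_corr, the others parameter, treating pain and outcome as two separate columns, and defining the "correlation weight" as the mean absolute Pearson correlation. The demo frame is made-up data, not taken from df2.

```python
import numpy as np
import pandas as pd


def fill_target_by_corr(df, col_name="rectal_temp",
                        others=("pulse", "respiratory_rate",
                                "extremities_temp", "pain", "outcome")):
    """Hypothetical helper: fill NaNs in col_name with mean(col_name) * weight."""
    out = df.copy()  # leave the caller's frame untouched
    cols = [c for c in others if c in out.columns and c != col_name]
    if not cols:
        return out
    # Pearson correlation of each listed column with the target,
    # computed on the rows where both values are present.
    corr = out[cols].corrwith(out[col_name])
    # Assumption: the "correlation weight" is the mean absolute correlation.
    weight = corr.abs().mean()
    if np.isnan(weight):
        return out  # no usable correlation, e.g. the target is entirely NaN
    out[col_name] = out[col_name].fillna(out[col_name].mean() * weight)
    return out


# Made-up demo data (not from df2): pulse is exactly linear in rectal_temp.
demo = pd.DataFrame({
    "rectal_temp": [38.0, np.nan, 39.0, 37.0],
    "pulse": [80.0, 90.0, 100.0, 60.0],
})
filled = fill_target_by_corr(demo)
print(filled)

# A target column with no observed values is returned unchanged.
empty = pd.DataFrame({"rectal_temp": [np.nan, np.nan], "pulse": [60.0, 80.0]})
unchanged = fill_target_by_corr(empty)
```

Under this reading a weak correlation pulls the fill value toward zero, which is rarely sensible for a body temperature. If the goal is a plausible estimate, a regression-based or plain mean imputation may be a better fit.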
