Generation

generate function
Thu, 30 Mar 2023

Create a function that receives two PySpark DataFrames: df_properties, with the columns property_id, site, plan_codename, url, and info_date, and df_leads, with the columns pl_received_at, property_id, and portal_name, among others. Add a new column called number_leads to df_properties that holds how many records exist in df_leads for each info_date in df_properties. The lookup must compare df_properties.info_date == df_leads.pl_received_at, df_properties.site == df_leads.portal_name, and df_properties.property_id == df_leads.property_id.

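from pyspark.sql import functions as F

def join_df(df_properties, df_leads):
    # Count leads per (date, portal, property); columns are renamed to match df_properties.
    lead_counts = df_leads.groupBy(
        F.col("pl_received_at").alias("info_date"),
        F.col("portal_name").alias("site"),
        "property_id",
    ).agg(F.count("*").alias("number_leads"))
    # Left join keeps every property row; rows with no matching leads get 0.
    return df_properties.join(
        lead_counts, ["info_date", "site", "property_id"], "left"
    ).fillna(0, subset=["number_leads"])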
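To sanity-check what number_leads should contain on a handful of rows without starting a Spark session, the matching rule from the prompt can be mirrored in plain Python. This is only a sketch: the helper name count_leads and the sample rows are hypothetical, and it assumes info_date and pl_received_at are stored at the same granularity (for example, both as dates).

```python
from collections import Counter

def count_leads(properties, leads):
    """Plain-Python mirror of the matching rule: one count per (date, site, property_id)."""
    # Tally leads by the three key columns named in the prompt.
    counts = Counter(
        (lead["pl_received_at"], lead["portal_name"], lead["property_id"])
        for lead in leads
    )
    # Attach the tally to each property row; rows with no matching lead get 0.
    return [
        {**prop, "number_leads": counts[(prop["info_date"], prop["site"], prop["property_id"])]}
        for prop in properties
    ]

# Hypothetical sample rows, for illustration only.
properties = [
    {"property_id": 1, "site": "portal_a", "info_date": "2023-03-29"},
    {"property_id": 1, "site": "portal_a", "info_date": "2023-03-30"},
    {"property_id": 2, "site": "portal_b", "info_date": "2023-03-30"},
]
leads = [
    {"property_id": 1, "portal_name": "portal_a", "pl_received_at": "2023-03-30"},
    {"property_id": 1, "portal_name": "portal_a", "pl_received_at": "2023-03-30"},
    {"property_id": 2, "portal_name": "portal_a", "pl_received_at": "2023-03-30"},
]

result = count_leads(properties, leads)
print([row["number_leads"] for row in result])
```

Only the second property row matches on all three keys, so it is the only one with a non-zero count; the third lead shares a date and property_id with a property row but comes from a different portal, so it is not counted.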
