Generation

generate function
Fri, 27 Jan 2023

There is a Pandas dataframe:

   user_id  name                  date        hotel      profit_per_room  how_find_us         resting_time  rating
0  1        Ksenia Rodionova      2021-07-01  Alpina     1639.000000      by_recommendation   48            3.0
1  2        Ulyana Selezneva      2021-07-01  AquaMania  930.000000       by_airbnb.com       97            4.0
2  3        Konstantin Prokhorov  2021-07-01  Breeze     1057.720000      agg_trivago.com     173           4.0
3  4        Petrov Vladimir       2021-07-01  Moreon     1403.000000      agg_onlinetours.ru  229           4.0
4  5        Arina Selivanova      2021-07-01  Alpina     1639.000000      agg_sutochno.ru     63            4.0

We need to find customers who have contacted us several times and make a separate table with them. That is, we need to find duplicate values in the name column and keep only those rows, removing all the rest.

Example:

name
Ksenia Rodionova
Artur Petrov
Ivan Sidorov
Ksenia Rodionova

Result:

name
Ksenia Rodionova
Ksenia Rodionova

# pandas dataframe
import pandas as pd

df = pd.DataFrame({
    'name': ['Ksenia Rodionova', 'Ulyana Selezneva', 'Konstantin Prokhorov',
             'Petrov Vladimir', 'Arina Selivanova', 'Artur Petrov',
             'Ivan Sidorov', 'Ksenia Rodionova', 'Ksenia Rodionova'],
    'date': ['2021-07-01'] * 9,
})
df.head(10)

# keep=False marks every occurrence of a repeated name, not just the later ones
df_new = df.loc[df.duplicated(subset=['name'], keep=False), :]
df_new
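As a quick check against the four-row example from the prompt, the same filter can be wrapped in a small helper. This is only a sketch: the function name `keep_repeat_customers` and its `column` parameter are made up for illustration and are not part of pandas or of the generated answer. The second half shows one possible alternative that counts occurrences with `value_counts`, which may be handy if a different threshold (say, three or more visits) is needed.

```python
import pandas as pd


def keep_repeat_customers(df: pd.DataFrame, column: str = "name") -> pd.DataFrame:
    """Hypothetical helper: return only rows whose `column` value occurs more than once."""
    # keep=False flags all occurrences of a duplicated value, including the first
    mask = df.duplicated(subset=[column], keep=False)
    return df.loc[mask]


# The small example from the prompt
example = pd.DataFrame(
    {"name": ["Ksenia Rodionova", "Artur Petrov", "Ivan Sidorov", "Ksenia Rodionova"]}
)

repeat = keep_repeat_customers(example)
print(repeat)

# Alternative sketch: map each name to how many times it appears, then filter
visits = example["name"].map(example["name"].value_counts())
same = example.loc[visits > 1]
```

Both approaches keep the original row index, so the result can still be matched back to the full table if the other columns are needed later.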
