Generation

generate function
Wed, 25 Jan 2023

There is a Pandas dataframe:

   news_title
0  /world/
1  /latest/
2  /?updated=top
3  /politics/36188461-s-marta-zhizn-rossiyan-susc...
4  /world/36007585-tramp-pridumal-kak-reshit-ukra...
5  /science/36157853-nasa-sobiraet-ekstrennuyu-pr...
6  /video/36001498-poyavilis-pervye-podrobnosti-g...
7  /world/36007585-tramp-pridumal-kak-reshit-ukra...
8  /science/
9  /sport/

Filter this dataframe and keep only the URLs that have the news structure (an 8-digit ID followed by a heading), using the str.contains method.

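import pandas as pd

# Load the data; the file name and tab delimiter come from the original snippet.
df = pd.read_csv('news.csv', delimiter='\t')

# Assumed reading of "news structure": /section/ + 8 digits + hyphen + heading.
news_pattern = r'^/[^/]+/\d{8}-\S+'

# str.contains returns a boolean mask; index with it to keep only the news URLs.
# na=False treats missing values as non-matching instead of propagating NaN.
df = df[df.news_title.str.contains(news_pattern, regex=True, na=False)]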
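To try the str.contains approach without a news.csv file, the sketch below rebuilds the sample rows from the question in memory. The names NEWS_PATTERN, mask and news_df are illustrative, and the regular expression is only one plausible reading of "8 digits and heading"; the truncated titles are kept exactly as they appear in the question.

```python
import pandas as pd

# Sample rows copied from the question (truncated titles left as-is).
df = pd.DataFrame({"news_title": [
    "/world/",
    "/latest/",
    "/?updated=top",
    "/politics/36188461-s-marta-zhizn-rossiyan-susc...",
    "/world/36007585-tramp-pridumal-kak-reshit-ukra...",
    "/science/36157853-nasa-sobiraet-ekstrennuyu-pr...",
    "/video/36001498-poyavilis-pervye-podrobnosti-g...",
    "/world/36007585-tramp-pridumal-kak-reshit-ukra...",
    "/science/",
    "/sport/",
]})

# Assumption: a news URL is "/<section>/<8 digits>-<heading>".
NEWS_PATTERN = r"^/[^/]+/\d{8}-\S+"

# Boolean mask: True where the URL matches the assumed news structure.
mask = df["news_title"].str.contains(NEWS_PATTERN, regex=True, na=False)

# Keep only the matching rows; the original index labels are preserved.
news_df = df[mask]
print(news_df)
```

Section pages such as /world/ and the query URL /?updated=top do not match, so only the rows with an 8-digit ID remain. If duplicates (like the repeated /world/36007585-... row) are unwanted, news_df.drop_duplicates() can be applied afterwards.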
