Generation

generate function
Tue, 02 May 2023

Extracting text from HTML file

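import urllib.request

from bs4 import BeautifulSoup

url = "https://en.wikipedia.org/wiki/List_of_state_and_union_territory_capitals_in_India"

# Download the page; the context manager closes the connection afterwards.
with urllib.request.urlopen(url) as response:
    soup = BeautifulSoup(response, "html.parser")

# Walk every table on the page and print the text of each data cell.
for table in soup.find_all("table"):
    for td in table.find_all("td"):
        # get_text() joins all text inside the cell, including text nested
        # in links; find(text=True) returned only the first text node.
        print(td.get_text(strip=True))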
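The same cell-by-cell extraction can be tried without a network connection or a third-party package. The sketch below is not the BeautifulSoup approach above: it swaps in the standard library's html.parser.HTMLParser. The names CellTextParser, extract_cells, and SAMPLE_HTML are hypothetical, and the sample table is made-up input used only for illustration.

```python
from html.parser import HTMLParser


class CellTextParser(HTMLParser):
    """Collect the text of every <td> cell (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.cells = []    # finished cell texts, in document order
        self._depth = 0    # greater than 0 while inside a <td>
        self._parts = []   # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._depth += 1
            if self._depth == 1:
                self._parts = []

    def handle_endtag(self, tag):
        if tag == "td" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Join the fragments and collapse runs of whitespace.
                self.cells.append(" ".join("".join(self._parts).split()))

    def handle_data(self, data):
        if self._depth > 0:
            self._parts.append(data)


def extract_cells(html_text):
    """Return the text of each <td> cell found in html_text."""
    parser = CellTextParser()
    parser.feed(html_text)
    parser.close()
    return parser.cells


# Made-up sample input; header cells (<th>) are skipped on purpose.
SAMPLE_HTML = """
<table>
  <tr><th>State</th><th>Capital</th></tr>
  <tr><td>Kerala</td><td><a href="#">Thiruvananthapuram</a></td></tr>
  <tr><td>Goa</td><td>Panaji</td></tr>
</table>
"""

print(extract_cells(SAMPLE_HTML))
```

Like the loop above, this reads only td cells, so header cells are ignored. Text nested inside links is kept because every text fragment seen while a cell is open is collected.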
