KOTA ITO

Generation

generate function (Mon, 20 Mar 2023)

Use Beautiful Soup to get the URL string contained in //*[@id="rso"]/div[1]/div/div/div[1]/div/a.

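from urllib.request import urlopen
from urllib.error import HTTPError
from bs4 import BeautifulSoup

def get_url(url):
    # Fetch the page; give up if the server answers with an HTTP error.
    try:
        html = urlopen(url)
    except HTTPError:
        return None
    # Parse with lxml; 'html.parser' also works if lxml is not installed.
    bsObj = BeautifulSoup(html.read(), 'lxml')
    try:
        # find() returns None when an element is missing, so the next call in
        # the chain raises AttributeError; the lookup must sit inside the try.
        return (bsObj.find("div", {"id": "rso"})
                     .find("div", {"class": "g"})
                     .find("div", {"class": "rc"})
                     .find("a")
                     .get('href'))
    except AttributeError:
        return None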
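The function above locates the link by class names ("g", "rc") rather than by the element positions given in the XPath. As a rough sketch of a closer match, the XPath can be rewritten as a CSS selector and passed to Beautiful Soup's `select_one`. The names `RESULT_LINK_SELECTOR`, `extract_first_result_url`, and `SAMPLE_HTML` are hypothetical, and the sample markup is invented for illustration; it is not the markup of any real results page.

```python
from bs4 import BeautifulSoup

# CSS counterpart of //*[@id="rso"]/div[1]/div/div/div[1]/div/a.
# An XPath step such as div[1] corresponds to div:nth-of-type(1) in CSS.
RESULT_LINK_SELECTOR = (
    "#rso > div:nth-of-type(1) > div > div > div:nth-of-type(1) > div > a"
)


def extract_first_result_url(html_text):
    """Return the href of the first matching link, or None if nothing matches."""
    soup = BeautifulSoup(html_text, "html.parser")
    link = soup.select_one(RESULT_LINK_SELECTOR)
    return link.get("href") if link is not None else None


# Hypothetical markup, made up only to exercise the selector.
SAMPLE_HTML = """
<div id="rso">
  <div><div><div>
    <div><div><a href="https://example.com/first">First</a></div></div>
    <div><div><a href="https://example.com/other">Other</a></div></div>
  </div></div></div>
  <div><div><div>
    <div><div><a href="https://example.com/second">Second</a></div></div>
  </div></div></div>
</div>
"""

print(extract_first_result_url(SAMPLE_HTML))
```

Because `select_one` returns None when nothing matches, this variant needs no AttributeError handling. Position-based selectors are brittle, though: any change to the page layout can silently make the function return None.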
