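There are plenty of web-scraping tutorials online, but I have found that few of them seriously discuss the BeautifulSoup library itself, so this post walks through how to use it.
If you are comfortable reading English, you can also go straight to the official BeautifulSoup website.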
1. First, import the library we are going to use
from bs4 import BeautifulSoup
2. Prepare the HTML document we want to parse
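html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""
3. Create the BeautifulSoup object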
soup = BeautifulSoup(html_doc, 'html.parser')
4. Finally, print the result
print(soup)
If you want the output laid out as properly indented HTML, call the prettify() method as well
print(soup.prettify())
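
To see the difference between the two print calls side by side, here is a minimal sketch of the same four steps. The one-line HTML string and the variable names `compact` and `pretty` are made up for illustration and are not part of the walkthrough above.

```python
from bs4 import BeautifulSoup

# A deliberately tiny document (hypothetical, chosen only to keep the output short).
html_doc = "<html><head><title>Demo</title></head><body><p>Hello</p></body></html>"

# Parse with Python's built-in parser, as in step 3.
soup = BeautifulSoup(html_doc, "html.parser")

# str(soup) is the text that print(soup) displays: the markup as parsed.
compact = str(soup)

# prettify() returns a string with each tag and text node on its own line,
# indented according to its nesting depth.
pretty = soup.prettify()

print(compact)
print(pretty)
```

Both calls return ordinary strings, so the result can be written to a file or compared in a test just as easily as printed.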