# -*- coding: utf-8 -*-
"""SpaceX falcon9_wiki.ipynb
Automatically generated by Colab.
Original file is located at
https://colab.research.google.com/drive/1EdP2M9iB4Kw4NfGFE6qxr4QV41FrGTaB
#web scraping on a Wikipedia page listing Falcon 9 and Falcon Heavy rocket launches.
Why use a wiki?
Structured data: Wikipedia pages typically have a well-defined and predictable structure, making data extraction easier.
Up-to-date information: Information on Wikipedia is constantly updated.
Free access: Wikipedia information is freely available.
Next steps and applications:
After this stage, the code is usually extended to extract specific information from the HTML page (e.g., launch date, rocket type, payload). The extracted data is then stored in a data structure like a DataFrame for further analysis.


"""
import requests
from bs4 import BeautifulSoup
import unicodedata
import pandas as pd

# Wikipedia page listing all Falcon 9 and Falcon Heavy launches
Falcon9_WikiURL = "https://en.wikipedia.org/wiki/List_of_Falcon_9_and_Falcon_Heavy_launches"

# Download the page and fail loudly on HTTP errors (4xx/5xx)
site = requests.get(Falcon9_WikiURL)
site.raise_for_status()
print(site.status_code)  # 200 on success

# Parse the raw HTML so the launch tables can be queried later
soup = BeautifulSoup(site.text, "html.parser")
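The "next steps" described in the docstring — extracting launch fields from the HTML and storing them in a DataFrame — can be sketched as follows. This is a minimal, self-contained example: the HTML snippet below is a hypothetical stand-in for one of the `wikitable` tables on the real page, and the column names are assumptions for illustration. It also shows why `unicodedata` is imported: Wikipedia cells often contain non-breaking spaces (`\u00a0`) that NFKD normalization converts to plain spaces.

```python
from bs4 import BeautifulSoup
import pandas as pd
import unicodedata

# Hypothetical stand-in for a "wikitable" from the real page
sample_html = """
<table class="wikitable">
  <tr><th>Date</th><th>Rocket</th><th>Payload</th></tr>
  <tr><td>4 June 2010</td><td>Falcon 9 v1.0</td><td>Dragon\u00a0Qualification Unit</td></tr>
  <tr><td>8 December 2010</td><td>Falcon 9 v1.0</td><td>Dragon C1</td></tr>
</table>
"""

soup = BeautifulSoup(sample_html, "html.parser")
table = soup.find("table", class_="wikitable")

# Header row gives the column names
headers = [th.get_text(strip=True) for th in table.find_all("th")]

# Remaining rows become the data; NFKD turns non-breaking spaces into plain ones
rows = []
for tr in table.find_all("tr")[1:]:
    cells = [unicodedata.normalize("NFKD", td.get_text(strip=True))
             for td in tr.find_all("td")]
    rows.append(cells)

df = pd.DataFrame(rows, columns=headers)
print(df)
```

On the real page there are several such tables (one per year), so in practice the same loop would run over `soup.find_all("table", class_="wikitable")` and the per-table DataFrames would be concatenated.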