使用 MySQL 存储数据

MySQL 是一个免费的、开源的关系数据库管理系统(RDBMS)。 在此示例中,我们将从网站读取行星数据并将其存储到 MySQL 数据库中。

准备工作

您将需要有权访问 MySQL 数据库。 您可以在容器内安装本地安装的云中的容器。 我正在使用本地安装的 MySQL 服务器,并将 root 密码设置为 mypassword。 您还需要安装 MySQL python 库。 您可以使用 pip install mysql-connector-python 来完成此操作。

  1. 首先要做的是在终端使用 mysql 命令连接到数据库:

    # mysql -uroot -pmypassword
    mysql: [Warning] Using a password on the command line interface can
    be insecure.
    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 4
    Server version: 5.7.19 MySQL Community Server (GPL)
    Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    mysql>
  2. 现在我们可以创建一个数据库来存储我们抓取的信息:

    mysql> create database scraping;
    Query OK, 1 row affected (0.00 sec)
  3. 现在使用新数据库:

    mysql> use scraping;
    Database changed
  4. 并在数据库中创建一个 Planets 表来存储我们的数据:

    mysql> CREATE TABLE `scraping`.`planets` (
    `id` INT NOT NULL AUTO_INCREMENT,
    `name` VARCHAR(45) NOT NULL,
    `mass` FLOAT NOT NULL,
    `radius` FLOAT NOT NULL,
    `description` VARCHAR(5000) NULL,
    PRIMARY KEY (`id`));
    Query OK, 0 rows affected (0.02 sec)

现在我们准备抓取数据并将其放入 MySQL 数据库中。

怎么做

  1. 以下代码(见 03/store_in_mysql.py)将读取行星的数据并将其写入 MySQL:

    import mysql.connector
    import get_planet_data
    from mysql.connector import errorcode
    from get_planet_data import get_planet_data
    
    try:
        # open the database connection
        cnx = mysql.connector.connect(user='root', password='mypassword',
                                      host="127.0.0.1", database="scraping")
    
        insert_sql = ("INSERT INTO Planets (Name, Mass, Radius, Description) " +
                      "VALUES (%(Name)s, %(Mass)s, %(Radius)s, %(Description)s)")
    
        # get the planet data
        planet_data = get_planet_data()
    
        # loop through all planets executing INSERT for each with the cursor
        cursor = cnx.cursor()
        for planet in planet_data:
            print("Storing data for %s" % (planet["Name"]))
            cursor.execute(insert_sql, planet)
    
        # commit the new records
        cnx.commit()
    
        # close the cursor and connection
        cursor.close()
        cnx.close()
    
    except mysql.connector.Error as err:
        if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
            print("Something is wrong with your user name or password")
        elif err.errno == errorcode.ER_BAD_DB_ERROR:
            print("Database does not exist")
        else:
            print(err)
    else:
        cnx.close()
  2. 这会产生以下输出:

    Storing data for Mercury
    Storing data for Venus
    Storing data for Earth
    Storing data for Mars
    Storing data for Jupiter
    Storing data for Saturn
    Storing data for Uranus
    Storing data for Neptune
    Storing data for Pluto
  3. 使用 MySQL Workbench 我们可以看到记录已写入数据库(您也可以使用 mysql 命令行):

    image 2024 01 29 12 54 37 136
  4. 以下代码可用于检索数据(03/read_from_mysql.py):

    import mysql.connector
    from mysql.connector import errorcode
    
    try:
    	cnx = mysql.connector.connect(user='root', password='mypassword',
    								  host="127.0.0.1", database="scraping")
    	cursor = cnx.cursor(dictionary=False)
    
    	cursor.execute("SELECT * FROM scraping.Planets")
    	for row in cursor:
    		print(row)
    
    	# close the cursor and connection
    	cursor.close()
    	cnx.close()
    
    except mysql.connector.Error as err:
    	if err.errno == errorcode.ER_ACCESS_DENIED_ERROR:
    		print("Something is wrong with your user name or password")
    	elif err.errno == errorcode.ER_BAD_DB_ERROR:
    		print("Database does not exist")
    	else:
    		print(err)
    finally:
    	cnx.close()
  5. 这会产生以下输出:

    (1, 'Mercury', 0.33, 4879.0, 'Named Mercurius by the Romans because
    it appears to move so swiftly.',
    'https://en.wikipedia.org/wiki/Mercury_(planet)')
    (2, 'Venus', 4.87, 12104.0, 'Roman name for the goddess of love.
    This planet was considered to be the brightest and most beautiful
    planet or star in the heavens. Other civilizations have named it
    for their god or goddess of love/war.',
    'https://en.wikipedia.org/wiki/Venus')
    (3, 'Earth', 5.97, 12756.0, "The name Earth comes from the IndoEuropean base 'er,'which produced the Germanic noun 'ertho,' and
    ultimately German 'erde,' Dutch 'aarde,' Scandinavian 'jord,' and
    English 'earth.' Related forms include Greek 'eraze,' meaning 'on
    the ground,' and Welsh 'erw,' meaning 'a piece of land.'",
    'https://en.wikipedia.org/wiki/Earth')
    (4, 'Mars', 0.642, 6792.0, 'Named by the Romans for their god of
    war because of its red, bloodlike color. Other civilizations also
    named this planet from this attribute; for example, the Egyptians
    named it "Her Desher," meaning "the red one."',
    'https://en.wikipedia.org/wiki/Mars')
    (5, 'Jupiter', 1898.0, 142984.0, 'The largest and most massive of
    the planets was named Zeus by the Greeks and Jupiter by the Romans;
    he was the most important deity in both pantheons.',
    'https://en.wikipedia.org/wiki/Jupiter')
    (6, 'Saturn', 568.0, 120536.0, 'Roman name for the Greek Cronos,
    father of Zeus/Jupiter. Other civilizations have given different
    names to Saturn, which is the farthest planet from Earth that can
    be observed by the naked human eye. Most of its satellites were
    named for Titans who, according to Greek mythology, were brothers
    and sisters of Saturn.', 'https://en.wikipedia.org/wiki/Saturn')
    (7, 'Uranus', 86.8, 51118.0, 'Several astronomers, including
    Flamsteed and Le Monnier, had observed Uranus earlier but had
    recorded it as a fixed star. Herschel tried unsuccessfully to name
    his discovery "Georgian Sidus" after George III; the planet was
    named by Johann Bode in 1781 after the ancient Greek deity of the
    sky Uranus, the father of Kronos (Saturn) and grandfather of Zeus
    (Jupiter).', 'https://en.wikipedia.org/wiki/Uranus')
    (8, 'Neptune', 102.0, 49528.0, 'Neptune was "predicted" by John
    Couch Adams and Urbain Le Verrier who, independently, were able to
    account for the irregularities in the motion of Uranus by correctly
    predicting the orbital elements of a trans- Uranian body. Using the
    predicted parameters of Le Verrier (Adams never published his
    predictions), Johann Galle observed the planet in 1846. Galle
    wanted to name the planet for Le Verrier, but that was not
    acceptable to the international astronomical community. Instead,
    this planet is named for the Roman god of the sea.',
    'https://en.wikipedia.org/wiki/Neptune')
    (9, 'Pluto', 0.0146, 2370.0, 'Pluto was discovered at Lowell
    Observatory in Flagstaff, AZ during a systematic search for a
    trans-Neptune planet predicted by Percival Lowell and William H.
    Pickering. Named after the Roman god of the underworld who was able
    to render himself invisible.',
    'https://en.wikipedia.org/wiki/Pluto')

工作原理

使用 mysql.connector 访问 MySQL 数据库涉及到使用库中的两个类:connect 和 cursor。 connect 类打开并管理与数据库服务器的连接。 从该连接对象中,我们可以创建一个光标对象。 该游标用于使用 SQL 语句读取和写入数据。

在第一个示例中,我们使用游标将九条记录插入数据库。 在调用连接的 commit() 方法之前,这些记录不会写入数据库。 这会将所有行写入数据库。

读取数据使用类似的模型,只不过我们使用游标执行 SQL 查询 (SELECT) 并迭代检索到的行。 由于我们正在读取而不是写入,因此无需在连接上调用 commit()。

还有更多

您可以从以下位置了解有关 MySQL 的更多信息并安装它: https://dev.mysql.com/doc/refman/5.7/en/installing.html 。 有关 MySQL Workbench 的信息,请访问: https://dev.mysql.com/doc/workbench/en/