A Python-Based Local Weather Data Analysis and Visualization Tool (Data Analysis and Visualization + Network Requests and API Calls)

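Background

Weather information matters for planning travel, work, and daily life. Many online weather tools, however, need a network connection every time they are used and may raise privacy concerns. To process weather data more flexibly and keep it under our own control, we can build a local weather analysis and visualization tool in Python.

The tool uses Python to scrape current weather data for a given city from China Weather (weather.com.cn), including temperature, humidity, wind force, wind direction, and air quality. It saves the data as a CSV file and then uses Matplotlib to generate a temperature curve, a wind direction radar chart, and an air quality bar chart. Apart from the page request to weather.com.cn, storage, analysis, and plotting all run locally with no cloud service involved, so the data stays on your machine and can be re-analyzed offline.

The project exercises core Python skills in HTTP requests, HTML parsing, data processing, and plotting, and suits beginner to lower-intermediate developers as a practice project.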

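Approach

1. Data acquisition: HTTP requests and HTML parsing

Use the requests library to send an HTTP request and fetch the HTML page from weather.com.cn.
Use BeautifulSoup to parse the HTML and locate the elements that hold the weather data.
Use a regular expression to extract the JSON-formatted weather data embedded in the page's script.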
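
The regular-expression step can be sketched offline as follows. The HTML snippet, its values, and the helper name `extract_embedded_json` are made up for illustration; only the `var data = {...};` pattern and the `od`/`od2`/`od21`/`od22` keys come from the scraper code in this article. The sketch uses only the standard library, so it runs without a network connection.

```python
import json
import re

# Made-up stand-in for a fetched page; a real page is far larger.
SAMPLE_HTML = """
<html><body>
<script>var data = {"od": {"od2": [{"od21": "14", "od22": "26"},
                                     {"od21": "13", "od22": "25"}]}};</script>
</body></html>
"""


def extract_embedded_json(html, var_name="data"):
    """Return the object assigned to `var <var_name> = {...};`, or None."""
    pattern = r"var\s+" + re.escape(var_name) + r"\s*=\s*(\{.*?\});"
    match = re.search(pattern, html, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


observations = extract_embedded_json(SAMPLE_HTML)
hours = [row["od21"] for row in observations["od"]["od2"]]
print(hours)
```

The non-greedy `.*?` stops at the first `};` it meets, so this approach would cut the JSON short if a string value ever contained that character pair. Wrapping `json.loads` in a `try` turns such a case into a clean `None` instead of a crash.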

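2. Data processing: structuring and storage

Use pandas to convert the extracted rows into a DataFrame.
Save the data as a CSV file for later analysis and visualization.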
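
A minimal sketch of this step, assuming rows shaped like the ones the scraper in this article produces. The sample values are invented, and the numeric clean-up with `pd.to_numeric` is an addition for illustration rather than part of the original script.

```python
import io

import pandas as pd

# Made-up rows: hourly rows carry numeric strings, while a forecast row
# carries a "high/low" temperature string and no humidity or AQI.
rows = [
    {"时间": "13", "温度": "25", "湿度": "60", "空气质量": "42"},
    {"时间": "14", "温度": "26", "湿度": "58", "空气质量": "45"},
    {"时间": "5", "天气": "多云", "温度": "30/25℃"},
]

df = pd.DataFrame(rows)  # keys missing from a row become NaN

# Convert measurement columns to numbers; unparsable values become NaN.
for column in ["温度", "湿度", "空气质量"]:
    df[column] = pd.to_numeric(df[column], errors="coerce")

# Keep only the rows that have a usable numeric temperature.
hourly = df.dropna(subset=["温度"])

# Write to an in-memory buffer here; pass a file path to write to disk.
buffer = io.StringIO()
hourly.to_csv(buffer, index=False)
restored = pd.read_csv(io.StringIO(buffer.getvalue()))
print(restored[["时间", "温度"]])
```

Coercing early keeps the plotting code simple, because every value that reaches a numeric axis is already a number.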

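3. Data visualization: chart generation

Use matplotlib to draw a temperature curve showing how temperature changes over time.
Draw a wind direction radar chart showing the distribution of wind directions.
Draw an air quality bar chart showing the air quality index at each time point.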
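
Before a radar chart can be drawn, the wind direction texts have to be tallied into the eight compass directions. The sketch below shows one way to do that; the function names and the sample list are hypothetical. It matches two-character names first, because a text such as "东北风" also contains "北" and "东".

```python
from collections import Counter

DIRECTIONS = ["北", "东北", "东", "东南", "南", "西南", "西", "西北"]
# Longest names first, so "东北风" is not counted as "北" or "东".
_MATCH_ORDER = sorted(DIRECTIONS, key=len, reverse=True)


def classify_wind(text):
    """Return the compass direction named in `text`, or None if there is none."""
    for name in _MATCH_ORDER:
        if name in text:
            return name
    return None


def count_directions(wind_texts):
    """Count occurrences of each direction, returned in DIRECTIONS order."""
    counts = Counter(classify_wind(text) for text in wind_texts)
    return [counts[name] for name in DIRECTIONS]


sample = ["东北风", "北风", "东南风", "东南风", "无持续风向"]
print(count_directions(sample))
```

Texts that name no direction, such as "无持续风向", are classified as `None` and do not contribute to any of the eight counts.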

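4. Modular design and exception handling

Split scraping, processing, and visualization into separate modules to keep the code maintainable.
Add exception handling so that the program reports a clear message on an invalid city code, a network error, and similar failures.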
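
One possible shape for this error handling is sketched below. The exception classes and the `run` helper are hypothetical names, and the nine-digit rule for city codes is an assumption based on codes such as 101010100. The fetch function is passed in as a parameter so the sketch can be exercised without a network connection.

```python
import re


class WeatherError(Exception):
    """Hypothetical base class for errors the tool reports to the user."""


class InvalidCityCode(WeatherError):
    """The user typed something that does not look like a city code."""


class FetchFailed(WeatherError):
    """The page could not be downloaded."""


def validate_city_code(raw):
    """Return the cleaned code, or raise InvalidCityCode.

    Assumption: city codes are exactly nine digits (for example 101010100).
    """
    code = raw.strip()
    if not re.fullmatch(r"\d{9}", code):
        raise InvalidCityCode(f"city code must be 9 digits, got {raw!r}")
    return code


def run(raw_code, fetch):
    """Validate the code, call fetch(code), and return (ok, data_or_message)."""
    try:
        code = validate_city_code(raw_code)
        try:
            return True, fetch(code)
        except OSError as exc:
            # Network failures, including the exceptions raised by requests,
            # are OSError subclasses; translate them into one error type.
            raise FetchFailed(f"network error for {code}: {exc}") from exc
    except WeatherError as exc:
        return False, str(exc)


def offline_fetch(code):
    """Stand-in fetcher that simulates a network failure."""
    raise ConnectionError("no route to host")


print(run("abc", offline_fetch))
print(run("101010100", offline_fetch))
print(run("101010100", lambda code: {"city": code}))
```

Keeping validation, fetching, and message formatting in separate functions means each failure has exactly one place where its message is written.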

Implementation

1. Data scraping and processing (`weather_scraper.py`)

import json
import re

import pandas as pd
import requests
from bs4 import BeautifulSoup

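def get_html_text(url):
    """Fetch a page and return its text, or an empty string on failure."""
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()
        # Let requests guess the encoding from the body so Chinese text decodes correctly.
        r.encoding = r.apparent_encoding
        print("成功访问")
        return r.text
    except requests.RequestException as e:
        # Covers timeouts, connection errors, and HTTP error statuses.
        print(f"访问错误：{e}")
        return ""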

def get_weather_data(city_code):
    """Scrape hourly observations and the 7-day forecast for a city.

    Returns a list of row dicts, or None if the page or its embedded
    data could not be read.
    """
    url = f"http://www.weather.com.cn/weather/{city_code}.shtml"

    html = get_html_text(url)
    if not html:
        return None

    soup = BeautifulSoup(html, "html.parser")

    # The hourly observations are embedded in a <script> as a JSON literal.
    script = soup.find("script", string=re.compile(r"var data"))
    if not script or not script.string:
        return None

    data_match = re.search(r"var data = ({.*?});", script.string, re.DOTALL)
    if not data_match:
        return None

    try:
        hourly = json.loads(data_match.group(1))["od"]["od2"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return None

    # Hourly observation rows.
    weather_data = []
    for obs in hourly:
        if obs.get("od21", "") == "":
            continue  # skip entries without a time
        weather_data.append({
            "时间": obs["od21"],
            "温度": obs.get("od22", ""),
            "风向": obs.get("od24", ""),
            "风级": obs.get("od25", ""),
            "湿度": obs.get("od27", ""),
            "空气质量": obs.get("od28", ""),
        })

    # Daily forecast rows (optional). The "7d" container is looked up in the
    # 7-day page fetched above. The id and the <h1>/<p> layout are assumptions
    # about the page markup; if they do not match, this part is skipped.
    forecast_div = soup.find("div", id="7d")
    ul = forecast_div.find("ul") if forecast_div else None
    for li in (ul.find_all("li") if ul else []):
        h1 = li.find("h1")
        p_tags = li.find_all("p")
        if not h1 or len(p_tags) < 3:
            continue  # need weather, temperature, and wind paragraphs
        date = h1.get_text(strip=True)
        if "日" in date:
            date = date[:date.find("日")]  # keep only the day of the month
        spans = p_tags[2].find_all("span")
        weather_data.append({
            "时间": date,
            "天气": p_tags[0].get_text(strip=True),
            "温度": p_tags[1].get_text(strip=True),
            "风向": spans[0].get("title", "") if len(spans) > 0 else "",
            "风级": spans[1].get("title", "") if len(spans) > 1 else "",
        })

    return weather_data

def save_to_csv(data, filename="weather_data.csv"):
    """Save the weather rows to a CSV file."""
    if not data:
        print("没有数据可保存。")
        return
    df = pd.DataFrame(data)
    # utf-8-sig writes a BOM so spreadsheet programs detect the encoding.
    df.to_csv(filename, index=False, encoding="utf-8-sig")
    print(f"数据已保存至：{filename}")

def main():
    city_code = input("请输入城市代码（例如101010100代表北京）：").strip()
    if not city_code.isdigit():
        print("错误：请输入有效的城市代码。")
        return

    weather_data = get_weather_data(city_code)
    if not weather_data:
        print("错误：未能获取天气数据，请检查城市代码是否正确。")
        return

    save_to_csv(weather_data)

if __name__ == "__main__":
    main()

2. Data visualization (`plot_generator.py`)

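import os

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# The chart labels are Chinese, so prefer fonts that contain CJK glyphs.
# These names are common choices, not guaranteed to be installed; matplotlib
# moves on to the next entry when one is missing.
plt.rcParams["font.sans-serif"] = ["SimHei", "Microsoft YaHei", "Noto Sans CJK SC", "DejaVu Sans"]
plt.rcParams["axes.unicode_minus"] = False  # draw the minus sign with a glyph CJK fonts have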

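def plot_temperature(data, output_path="temperature_plot.png"):
    """Plot temperature over time as a line chart."""
    # Keep rows with a plain numeric temperature; forecast rows such as
    # "30/25℃" cannot be placed on a numeric axis and are left out.
    temps = pd.to_numeric(data["温度"], errors="coerce")
    valid = temps.notna()
    # Plot by position so repeated time labels do not collapse into one point.
    x = list(range(int(valid.sum())))
    plt.figure(figsize=(10, 5))
    plt.plot(x, temps[valid], marker="o", label="温度")
    plt.xlabel("时间")
    plt.ylabel("温度（℃）")
    plt.title("温度变化曲线")
    plt.xticks(x, data.loc[valid, "时间"].astype(str).tolist(), rotation=45)
    plt.legend()
    plt.tight_layout()
    plt.savefig(output_path)
    plt.close()  # release the figure so later plots start clean
    print(f"温度变化曲线已保存至：{output_path}")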

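def plot_wind_direction(data, output_path="wind_direction_plot.png"):
    """Plot the wind direction distribution as a radar chart."""
    wind_directions = ["北", "东北", "东", "东南", "南", "西南", "西", "西北"]
    wind_counts = [0] * len(wind_directions)
    # Test two-character names first: "东北风" also contains "北" and "东",
    # so checking in list order would count it under the wrong direction.
    by_length = sorted(range(len(wind_directions)), key=lambda i: -len(wind_directions[i]))
    for wind in data["风向"].dropna().astype(str):
        for i in by_length:
            if wind_directions[i] in wind:
                wind_counts[i] += 1
                break

    angles = np.linspace(0, 2 * np.pi, len(wind_directions), endpoint=False).tolist()
    # Repeat the first point so the filled polygon closes.
    wind_counts += wind_counts[:1]
    angles += angles[:1]

    fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
    ax.set_theta_zero_location("N")  # north at the top
    ax.set_theta_direction(-1)  # clockwise, like a compass
    ax.fill(angles, wind_counts, color="skyblue", alpha=0.5)
    ax.set_xticks(angles[:-1])
    ax.set_xticklabels(wind_directions)
    ax.set_title("风向分布雷达图")
    plt.savefig(output_path)
    plt.close(fig)
    print(f"风向雷达图已保存至：{output_path}")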

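def plot_air_quality(data, output_path="air_quality_plot.png"):
    """Plot the air quality index as a bar chart."""
    # Rows without a numeric AQI (for example forecast rows) are left out.
    aqi = pd.to_numeric(data["空气质量"], errors="coerce")
    valid = aqi.notna()
    # Plot by position so repeated time labels each get their own bar.
    x = list(range(int(valid.sum())))
    plt.figure(figsize=(10, 5))
    plt.bar(x, aqi[valid], color="green")
    plt.xlabel("时间")
    plt.ylabel("空气质量指数")
    plt.title("空气质量柱状图")
    plt.xticks(x, data.loc[valid, "时间"].astype(str).tolist(), rotation=45)
    plt.tight_layout()
    plt.savefig(output_path)
    plt.close()
    print(f"空气质量柱状图已保存至：{output_path}")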

def main():
    if not os.path.exists("weather_data.csv"):
        print("错误：未找到天气数据文件，请先运行天气数据抓取程序。")
        return

    data = pd.read_csv("weather_data.csv")
    plot_temperature(data)
    plot_wind_direction(data)
    plot_air_quality(data)

if __name__ == "__main__":
    main()

Project Structure

weather_analyzer/
│
├── weather_scraper.py        # data scraping and storage
├── plot_generator.py         # data visualization
├── requirements.txt          # dependency list
├── weather_data.csv          # scraped weather data
├── temperature_plot.png      # temperature curve
├── wind_direction_plot.png   # wind direction radar chart
└── air_quality_plot.png      # air quality bar chart

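Dependencies

requests: sends HTTP requests and fetches the page content.
beautifulsoup4: parses the HTML content.
pandas: handles data processing and CSV storage.
matplotlib: generates the charts.
numpy: computes the angles for the radar chart; it is installed automatically as a dependency of pandas and matplotlib.

Install the dependencies: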

pip install requests beautifulsoup4 pandas matplotlib

Example Run

Run the scraper:

python weather_scraper.py
请输入城市代码（例如101010100代表北京）：101010100
成功访问
数据已保存至：weather_data.csv

Run the visualization script:

python plot_generator.py
温度变化曲线已保存至：temperature_plot.png
风向雷达图已保存至：wind_direction_plot.png
空气质量柱状图已保存至：air_quality_plot.png

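What You Will Learn

HTTP requests and API calls: fetch web data with requests and understand the HTTP request and response cycle.
HTML parsing and data extraction: pull structured data out of a page with BeautifulSoup and regular expressions.
Data processing and storage: clean and store data with pandas.
Data visualization: draw several chart types with matplotlib (line chart, radar chart, bar chart).
Modular design and exception handling: structure a robust Python script and keep the code maintainable.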

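Suggested Extensions (Optional)

City name input: look up the city code automatically from a city name.
More chart types: for example a wind force curve, or a humidity versus temperature plot.
Graphical interface: build a GUI with tkinter or PyQt.
Multithreaded scraping: fetch data for several cities at the same time.
Forecasting: fit a simple trend model (such as a linear regression over time) to estimate where the weather is heading.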
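
Two of these extensions, city name lookup and concurrent scraping, can be sketched together. The `CITY_CODES` table, `resolve_city`, `fetch_many`, and `fake_fetch` are hypothetical names introduced here, and the table holds only three entries for illustration. In real use, the `fetch` argument would be a function such as `get_weather_data` from `weather_scraper.py`.

```python
from concurrent.futures import ThreadPoolExecutor

# A tiny hand-written lookup table; a real tool would load a full list.
CITY_CODES = {"北京": "101010100", "上海": "101020100", "珠海": "101280701"}


def resolve_city(name_or_code):
    """Accept either a city name found in CITY_CODES or a numeric code."""
    if name_or_code.isdigit():
        return name_or_code
    try:
        return CITY_CODES[name_or_code]
    except KeyError:
        raise ValueError(f"unknown city: {name_or_code}") from None


def fetch_many(cities, fetch, max_workers=4):
    """Fetch several cities concurrently and return {city: result}."""
    codes = [resolve_city(city) for city in cities]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(fetch, codes))  # map keeps the input order
    return dict(zip(cities, results))


def fake_fetch(code):
    """Stand-in for a real scraper call, so the sketch runs offline."""
    return f"rows for {code}"


print(fetch_many(["北京", "珠海"], fake_fetch))
```

Threads suit this job because each worker spends most of its time waiting on the network. A small `max_workers` value also keeps the load on the weather site modest.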

Estimated Development Time

Core features: 2 to 3 days
Polish and extensions: 1 to 2 days

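Summary

This project is a small weather analysis tool that combines HTTP requests, data processing, and visualization, and it suits beginner to lower-intermediate developers as a practice project. Working through it shows how Python is applied to real data collection and analysis and builds hands-on skill in data science and Python development. Whether used as a learning exercise or as a personal tool, it helps users understand how weather data is processed and presented.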
