Using a Python Crawler to Search AliExpress Products by Keyword

2026/5/4 9:23:24 Source: https://blog.csdn.net/2401_87849308/article/details/144723207

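Data is valuable, and nowhere more so than in e-commerce. For anyone doing market research, data analysis, or personal project development, being able to collect product data from an e-commerce platform such as AliExpress (速卖通) is a useful skill. Python's concise syntax and rich library support make it one of the preferred languages for writing crawlers. This article walks through a simple Python crawler that retrieves product information from AliExpress by keyword search.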

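Environment Setup

Before writing any code, a few things need to be in place:

Python: make sure Python is installed in your development environment.
pip: Python's package manager, used to install third-party libraries.
Requests: used to send HTTP requests and fetch page content.
BeautifulSoup: used to parse HTML and XML documents and extract the data you need.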

Installing Dependencies

Install Requests and BeautifulSoup with pip:

pip install requests beautifulsoup4
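After installation, a quick check can confirm that both packages are importable. The sketch below is only an illustration: check_dependencies is a hypothetical helper name, and it relies on the fact that the beautifulsoup4 package is imported under the module name bs4.

```python
import importlib.util


def check_dependencies(modules=("requests", "bs4")):
    """Map each module name to True if it can be found, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


status = check_dependencies()
for name, ok in status.items():
    print(f"{name}: {'installed' if ok else 'missing'}")
```

If either line reports "missing", re-run the pip command above in the same Python environment you use to run the crawler.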

Writing the Crawler Code

1. Sending the HTTP Request

First, we use the Requests library to send an HTTP request and fetch the AliExpress product search page.

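import requests

def search_products(keyword):
    url = "https://www.aliexpress.com/wholesale"
    # A browser-like User-Agent, as used in this article
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
    # Passing the keyword through params lets Requests URL-encode spaces and non-ASCII text
    response = requests.get(url, params={'SearchText': keyword}, headers=headers, timeout=10)
    # Raise an exception on 4xx/5xx responses instead of returning an error page
    response.raise_for_status()
    return response.text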
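Keywords that contain spaces, "&", or non-ASCII characters need to be encoded before they go into the query string. As a minimal sketch using only the standard library, the snippet below builds the same search URL with the keyword encoded; build_search_url and BASE_URL are hypothetical names introduced for illustration.

```python
from urllib.parse import urlencode

# Search endpoint taken from this article.
BASE_URL = "https://www.aliexpress.com/wholesale"


def build_search_url(keyword):
    """Return the search URL with the keyword safely encoded."""
    query = urlencode({"SearchText": keyword})
    return f"{BASE_URL}?{query}"


print(build_search_url("phone case"))
# https://www.aliexpress.com/wholesale?SearchText=phone+case
```

Printing the URL before sending a request is a cheap way to confirm that the keyword was encoded as expected.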

2. Parsing the HTML Content

Once we have the HTML, we use the BeautifulSoup library to parse it and extract the product information.

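from bs4 import BeautifulSoup

def parse_products(html):
    soup = BeautifulSoup(html, 'html.parser')
    # Class names are the ones used in this article; check them against the actual page markup
    products = soup.find_all('div', class_='product-item-info')
    product_list = []
    for product in products:
        title_tag = product.find('a', class_='product-name')
        price_tag = product.find('span', class_='price')
        # find() returns None when nothing matches, so skip incomplete items
        if title_tag is None or price_tag is None:
            continue
        product_list.append({'title': title_tag.text.strip(), 'price': price_tag.text.strip()})
    return product_list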
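The parsing step can be tried offline before any request is sent. The sketch below runs the same selectors against a small hand-written HTML fragment. SAMPLE_HTML is invented markup that only mimics the class names used in this article, and parse_products_safe is a hypothetical variant that keeps an item whose price is missing instead of failing on it.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page structure may differ.
SAMPLE_HTML = """
<div class="product-item-info">
  <a class="product-name" href="#"> Phone Case </a>
  <span class="price"> US $1.99 </span>
</div>
<div class="product-item-info">
  <a class="product-name" href="#"> Cable </a>
</div>
"""


def parse_products_safe(html):
    """Extract title and price, tolerating items with a missing price."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    for item in soup.find_all("div", class_="product-item-info"):
        title_tag = item.find("a", class_="product-name")
        price_tag = item.find("span", class_="price")
        if title_tag is None:
            continue  # an item without a title is not useful
        results.append({
            "title": title_tag.get_text(strip=True),
            "price": price_tag.get_text(strip=True) if price_tag else None,
        })
    return results


products = parse_products_safe(SAMPLE_HTML)
print(products)
```

If the function returns an empty list for a real page, the first thing to check is whether the class names still match the HTML that the server actually returned.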

3. Putting It Together and Running

Finally, we combine the code above and run the crawler.

def main():
    keyword = "example keyword"  # Replace with your search keyword
    html = search_products(keyword)
    products = parse_products(html)
    for product in products:
        print(f"Title: {product['title']}, Price: {product['price']}")

if __name__ == "__main__":
    main()