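When collecting data from the web, image CAPTCHAs are a common obstacle for crawlers; sites use them to tell humans from bots. Advances in AI, in particular the Gemini models released by Google DeepMind, mean that many of these CAPTCHAs can now be read automatically. Gemini understands text, images, audio, video, and code, and it can also generate code, which considerably widens the range of tasks that can be automated.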
Step 1: Environment setup and dependency installation
First, create a new project and install the required dependencies, including the Gemini PHP API client.
    # Create the project
    composer create-project workerman/webman webman20240312
    # Install the dependency
    composer require google-gemini-php/client
Gemini PHP is a community-maintained client that makes it easy to talk to the Gemini AI API. It requires PHP 8.1 or later and a PSR-18 compatible HTTP client such as guzzlehttp/guzzle:
    composer require guzzlehttp/guzzle
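Before moving on, it can help to confirm that the requirements above are actually met. The following is a minimal sketch, not part of the original article: the function name `findSetupProblems` is hypothetical, and it assumes the script sits next to the project's `vendor` directory. Finding the PSR-18 interface only shows that the interface package is installed, so treat it as a rough check.

```php
<?php
declare(strict_types=1);

/**
 * Collects setup problems; an empty array means the requirements are met.
 * (Hypothetical helper for illustration.)
 */
function findSetupProblems(string $phpVersion, bool $hasGeminiClient, bool $hasPsr18): array
{
    $problems = [];
    if (version_compare($phpVersion, '8.1.0', '<')) {
        $problems[] = "PHP 8.1 or newer is required, found {$phpVersion}";
    }
    if (!$hasGeminiClient) {
        $problems[] = 'google-gemini-php/client is not installed';
    }
    if (!$hasPsr18) {
        $problems[] = 'No PSR-18 client interface found (try guzzlehttp/guzzle)';
    }
    return $problems;
}

// Load Composer's autoloader when present so installed packages are visible.
// Assumption: this script lives in the project root, next to vendor/.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload)) {
    require_once $autoload;
}

$problems = findSetupProblems(
    PHP_VERSION,
    class_exists('Gemini'),
    interface_exists('Psr\Http\Client\ClientInterface')
);

echo $problems === []
    ? 'Environment looks OK' . PHP_EOL
    : implode(PHP_EOL, $problems) . PHP_EOL;
```

Passing the version and the two flags as arguments keeps the check itself free of side effects, so it is easy to try with different values.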
Step 2: Text generation with Gemini
Next, check the setup with a plain text prompt. A few lines of code are enough to ask the Gemini model to describe the PHP language.
    <?php
    /**
     * @desc Recognize CAPTCHAs in PHP with Google's Gemini model
     */
    declare(strict_types=1);

    require_once '../vendor/autoload.php';

    $apiKey = 'AIzaSyAPxxxxxxxxxxxxxxx_uEpw';
    $client = \Gemini::client($apiKey);

    // Send a text-only prompt and print the model's reply.
    $result = $client->geminiPro()->generateContent('What is the PHP language?');

    echo $result->text() . PHP_EOL;
Output
This step demonstrates Gemini's text handling and confirms that the client works before moving on to CAPTCHA recognition.
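The example hardcodes the API key in the script. One common alternative is to read it from an environment variable; the sketch below assumes a variable named `GEMINI_API_KEY` and a helper called `loadApiKey`, both of which are hypothetical names chosen for illustration and do not come from the Gemini PHP client.

```php
<?php
declare(strict_types=1);

/**
 * Reads the API key from an environment variable.
 * The default variable name is an assumption made for this sketch.
 *
 * @throws RuntimeException when the variable is missing or blank
 */
function loadApiKey(string $name = 'GEMINI_API_KEY'): string
{
    $value = getenv($name);
    if ($value === false || trim($value) === '') {
        throw new RuntimeException("Environment variable {$name} is not set");
    }
    return trim($value);
}
```

With this in place, the client could be created as `\Gemini::client(loadApiKey())`, which keeps the key out of version control.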
Step 3: CAPTCHA recognition in practice
This step shows how to use the Gemini model to recognize different types of image CAPTCHAs.
Getting the raw text
    <?php
    /**
     * @desc Recognize CAPTCHAs in PHP with Google's Gemini model
     */
    declare(strict_types=1);

    require_once '../vendor/autoload.php';

    use Psr\Http\Message\RequestInterface;
    use Psr\Http\Message\ResponseInterface;

    $apiKey = 'AIzaSyAPLiuNxxxxxxxxxxxxxxx_uEpw';

    // Keep the HTTP client in its own variable so the stream handler can reuse it.
    $httpClient = new \GuzzleHttp\Client([]);

    $client = \Gemini::factory()
        ->withApiKey($apiKey)
        ->withBaseUrl('https://gemini.ailard.com/v1/')
        ->withHttpClient($httpClient)
        // Custom stream handler for the HTTP client.
        ->withStreamHandler(fn(RequestInterface $request): ResponseInterface => $httpClient->send($request, [
            'stream' => true,
        ]))
        ->make();

    // Send the prompt together with the base64-encoded JPEG image.
    $result = $client
        ->geminiProVision()
        ->generateContent([
            'I will provide you with an image CAPTCHA, please recognize the content inside the CAPTCHA and output the text',
            new \Gemini\Data\Blob(
                mimeType: \Gemini\Enums\MimeType::IMAGE_JPEG,
                data: base64_encode(
                    file_get_contents('[image path]')
                )
            )
        ]);

    echo $result->text() . PHP_EOL;
Output
    The content inside the CAPTCHA is "[raw text]".
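The model replies with a full sentence, while a crawler usually needs only the CAPTCHA text itself. The sketch below shows one way to pull the quoted part out of such a reply. The helper name `extractCaptchaText` and the sample value `7gK2` are hypothetical, and the approach assumes the reply wraps the text in double quotes as in the output above; real replies may be phrased differently.

```php
<?php
declare(strict_types=1);

/**
 * Pulls the CAPTCHA text out of a model reply.
 * Prefers the first double-quoted span (straight or curly quotes)
 * and falls back to the trimmed reply when nothing is quoted.
 */
function extractCaptchaText(string $reply): string
{
    // The /u flag makes the pattern UTF-8 aware so curly quotes match as single characters.
    if (preg_match('/["“]([^"”]+)["”]/u', $reply, $matches) === 1) {
        return trim($matches[1]);
    }
    return trim($reply);
}

// Made-up sample reply in the same shape as the output shown above.
echo extractCaptchaText('The content inside the CAPTCHA is "7gK2".') . PHP_EOL;
```

An alternative is to change the prompt so the model returns only the characters, but a parsing step like this is still a useful safeguard.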
Getting the computed result
    <?php
    /**
     * @desc Get the computed result of a CAPTCHA image
     */
    declare(strict_types=1);

    require_once '../vendor/autoload.php';

    use Psr\Http\Message\RequestInterface;
    use Psr\Http\Message\ResponseInterface;

    $apiKey = 'AIzaSyAPLiuNxxxxxxxxxxxxxxx_uEpw';

    // Keep the HTTP client in its own variable so the stream handler can reuse it.
    $httpClient = new \GuzzleHttp\Client([]);

    $client = \Gemini::factory()
        ->withApiKey($apiKey)
        ->withBaseUrl('https://gemini.ailard.com/v1/')
        ->withHttpClient($httpClient)
        // Custom stream handler for the HTTP client.
        ->withStreamHandler(fn(RequestInterface $request): ResponseInterface => $httpClient->send($request, [
            'stream' => true,
        ]))
        ->make();

    // Same request as before, this time with a PNG image.
    $result = $client
        ->geminiProVision()
        ->generateContent([
            'I will provide you with an image CAPTCHA, please recognize the content inside the CAPTCHA and output the text',
            new \Gemini\Data\Blob(
                mimeType: \Gemini\Enums\MimeType::IMAGE_PNG,
                data: base64_encode(
                    file_get_contents('[image path]')
                )
            )
        ]);

    echo $result->text() . PHP_EOL;
Output
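These two CAPTCHA examples cover the whole flow, from obtaining the CAPTCHA image to recognizing it with the Gemini model. In both cases, a plain character CAPTCHA and one that requires a computed result, Gemini recognized the content correctly, which shows how capable it is at image understanding.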
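For arithmetic CAPTCHAs, the prompt used here asks only for the text, so the model may return the expression rather than its value. The sketch below evaluates such an expression locally. It assumes the recognized text contains a simple two-operand expression such as `3 + 5 = ?`; the helper name `solveArithmeticCaptcha` is hypothetical, and division is treated as integer division.

```php
<?php
declare(strict_types=1);

/**
 * Evaluates the first two-operand expression found in the text,
 * for example "3 + 5 = ?". Returns null when no expression is found
 * or when the divisor is zero. Division truncates to an integer.
 */
function solveArithmeticCaptcha(string $text): ?int
{
    // Operand, operator (+ - * x × / ÷), operand; /u handles the UTF-8 operators.
    if (preg_match('/(\d+)\s*([+\-*x×\/÷])\s*(\d+)/u', $text, $matches) !== 1) {
        return null;
    }

    $left = (int) $matches[1];
    $right = (int) $matches[3];

    return match ($matches[2]) {
        '+' => $left + $right,
        '-' => $left - $right,
        '*', 'x', '×' => $left * $right,
        '/', '÷' => $right === 0 ? null : intdiv($left, $right),
    };
}

// Made-up sample in the assumed "a op b = ?" shape.
var_dump(solveArithmeticCaptcha('3 + 5 = ?'));
```

Computing the value in PHP avoids relying on the model's arithmetic, at the cost of handling only the expression shapes the pattern knows about.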
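Conclusion
The steps above show that the Gemini model handles several kinds of input well and that it has practical value in automated crawling, especially for recognizing image CAPTCHAs. Because the model is easy to integrate from PHP, it can make data collection more efficient and more accurate, and it opens up further possibilities for web development.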