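Scraping WeChat Official Account articles (web scraping)
Sure. Below is a simple PHP example that scrapes the text and image URLs from a WeChat Official Account article: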
```php
<?php
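// Set the request headers
$headers = [
    'User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3',
    'Referer: https://weixin.sogou.com/',
];
// Article URL (placeholder; replace with a real article link)
$url = 'https://mp.weixin.qq.com/s/xxxxxxxxxxxxx';
// Send the request
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow redirects, if any
curl_setopt($ch, CURLOPT_TIMEOUT, 15);          // give up after 15 seconds
$content = curl_exec($ch);
if ($content === false) {
    // curl_exec() returns false on failure, so stop before parsing
    $error = curl_error($ch);
    curl_close($ch);
    exit("Request failed: " . $error . "\n");
}
curl_close($ch);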
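// Parse the HTML
$doc = new DOMDocument();
libxml_use_internal_errors(true); // collect parser warnings from imperfect real-world HTML instead of printing them
// loadHTML() may fall back to ISO-8859-1 when it finds no charset it recognizes;
// the XML prolog hints UTF-8 so Chinese text is decoded correctly
$doc->loadHTML('<?xml encoding="UTF-8">' . $content);
libxml_clear_errors();
// Get the article title (empty string if the page has no <title>)
$titleNode = $doc->getElementsByTagName('title')->item(0);
$title = $titleNode ? trim($titleNode->nodeValue) : '';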
// Get the article body
$elements = $doc->getElementsByTagName('div');
$content = '';
foreach ($elements as $node) {
    // The class attribute may list several classes, so match one token rather than the whole string
    $classes = preg_split('/\s+/', trim($node->getAttribute('class')));
    if (in_array('rich_media_content', $classes, true)) {
        // Collect the text of each paragraph
        foreach ($node->getElementsByTagName('p') as $p) {
            $content .= $p->nodeValue . "\n";
        }
        // Collect the image URLs, read from the data-src attribute
        foreach ($node->getElementsByTagName('img') as $img) {
            $content .= $img->getAttribute('data-src') . "\n";
        }
        break;
    }
}
// Print the result
echo "Article title: " . $title . "\n";
echo "Article content: " . $content . "\n";
```
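The loop over every `div` in the script can also be written as an XPath query. The following is a minimal sketch of that alternative, not part of the original answer: the function name `extractArticleBody`, the returned array shape, the fallback from `data-src` to `src`, and the sample HTML are all assumptions made for illustration. It takes an HTML string, so it can be tried without sending a request.

```php
<?php
// Hypothetical helper (not in the original script): extracts paragraph text
// and image URLs from article HTML using DOMXPath.
function extractArticleBody(string $html): array
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);   // collect parser warnings instead of printing them
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html); // hint UTF-8 to the parser
    libxml_clear_errors();

    $xpath = new DOMXPath($doc);
    // Match "rich_media_content" as a whole class token, even when the
    // attribute lists several classes.
    $query = '//div[contains(concat(" ", normalize-space(@class), " "), " rich_media_content ")]';
    $container = $xpath->query($query)->item(0);

    $result = ['paragraphs' => [], 'images' => []];
    if ($container === null) {
        return $result; // no matching container found
    }

    // Non-empty paragraph text, in document order
    foreach ($xpath->query('.//p', $container) as $p) {
        $text = trim($p->textContent);
        if ($text !== '') {
            $result['paragraphs'][] = $text;
        }
    }
    // Image URLs: prefer data-src, fall back to src (the fallback is an assumption)
    foreach ($xpath->query('.//img', $container) as $img) {
        $src = $img->getAttribute('data-src') ?: $img->getAttribute('src');
        if ($src !== '') {
            $result['images'][] = $src;
        }
    }
    return $result;
}

// Usage with an inline sample document (made up for illustration)
$sample = '<html><body><div class="rich_media_content js_underline">'
        . '<p>Hello</p><p> </p><img data-src="https://example.com/a.png"></div></body></html>';
$body = extractArticleBody($sample);
print_r($body);
```

Keeping text and image URLs in separate arrays, instead of appending both to one string, makes it easier to download the images or store the paragraphs later.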