This is the story of a shovel. Friends, let me tell it from the beginning~~

I. What this is about


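One day while "working" (slacking off), the craving hit, so I quickly opened good old linux.do.
The front page was full of topics I had never read.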

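I assumed they were new posts I had missed and settled in to enjoy them, but they all turned out to be ancient topics.
Some looked like this: image
And some even looked like this: image
Their last replies were months old too!!!!
What kind of decent person goes grave-digging (reviving long-dead topics)?!!!!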

Well then, Mr. Krabs (that's me) wants to follow in everyone's footsteps and join the fun!

II. Is it allowed


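Is grave-digging within the rules? I looked around and found that the First Emperor (始皇) once said:

By the mandate of heaven, the First Emperor decrees

So it is allowed, but finding where to dig is the hard part.
Hunting for diggable topics by hand is exhausting, so let's build a small tool first!!!!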

III. How to do it


① First, find out where the topic list can be read. There is an endpoint that looks like this:

https://linux.do/latest.json?no_definitions=true&page=${pageNumber}

Request it and a JSON document comes back... aha~~
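The `${pageNumber}` placeholder in the endpoint above is just an integer page index. A minimal sketch of building the URL for one page follows; the helper name `latestUrl` is hypothetical and not part of the original tool.

```cpp
#include <string>

// Hypothetical helper: substitutes the page index for the ${pageNumber}
// placeholder of the latest.json endpoint quoted in the post.
std::string latestUrl(int pageNumber) {
    return "https://linux.do/latest.json?no_definitions=true&page=" +
           std::to_string(pageNumber);
}
```

The returned string can be handed to libcurl through `CURLOPT_URL` as `latestUrl(n).c_str()`.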

② Part of the JSON response looks like this:

ResponseData -> topic_list -> topics
{
        "id": 69665,
        "title": "linux do是不是被墙了",
        "fancy_title": "linux do是不是被墙了",
        "slug": "topic",
        "posts_count": 31,
        "reply_count": 10,
        "highest_post_number": 32,
        "image_url": null,
        "created_at": "2024-04-28T15:27:30.881Z",
        "last_posted_at": "2024-04-29T13:43:41.492Z",
        "bumped": true,
        "bumped_at": "2024-04-29T13:43:41.492Z",
        "archetype": "regular",
        "unseen": false,
        "last_read_post_number": 17,
        "unread": 0,
        "new_posts": 15,
        "unread_posts": 15,
        "pinned": false,
        "unpinned": null,
        "visible": true,
        "closed": false,
        "archived": false,
        "notification_level": 2,
        "bookmarked": false,
        "liked": false,
        "tags": [],
        "tags_descriptions": {

        },
        "views": 891,
        "like_count": 21,
        "has_summary": false,
        "last_poster_username": "yeahow",
        "category_id": 2,
        "pinned_globally": false,
        "featured_link": null,
        "has_accepted_answer": false,
        "can_have_answer": false,
        "can_vote": false,
        "posters": [
          {
            "extras": null,
            "description": "原始发帖人",
            "user_id": 19906,
            "primary_group_id": null,
            "flair_group_id": 13
          },
          {
            "extras": null,
            "description": "频繁发帖人",
            "user_id": 1,
            "primary_group_id": null,
            "flair_group_id": 1
          },
          {
            "extras": null,
            "description": "频繁发帖人",
            "user_id": 1414,
            "primary_group_id": null,
            "flair_group_id": null
          },
          {
            "extras": null,
            "description": "频繁发帖人",
            "user_id": 3418,
            "primary_group_id": null,
            "flair_group_id": 13
          },
          {
            "extras": "latest",
            "description": "最新发帖人",
            "user_id": 21176,
            "primary_group_id": null,
            "flair_group_id": 13
          }
        ]
}

So this bumped_at field... aha~~

③ Then filtering for topics whose bumped_at is several months old should just... aha~~
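The filter in step ③ comes down to converting `bumped_at` to a UTC timestamp and comparing it with the current time. Below is a rough, portable sketch of that check. The names `daysFromCivil`, `parseIsoUtc` and `isStale` are hypothetical, the date arithmetic uses the well-known days-from-civil algorithm instead of any platform call, and the fractional seconds and trailing `Z` are simply ignored.

```cpp
#include <cstdio>
#include <string>

// Days since 1970-01-01 for a Gregorian calendar date (days-from-civil algorithm).
long long daysFromCivil(int y, int m, int d) {
    y -= m <= 2;  // treat January and February as months of the previous year
    const long long era = (y >= 0 ? y : y - 399) / 400;
    const int yoe = static_cast<int>(y - era * 400);                 // [0, 399]
    const int doy = (153 * (m + (m > 2 ? -3 : 9)) + 2) / 5 + d - 1;  // [0, 365]
    const int doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;           // [0, 146096]
    return era * 146097 + doe - 719468;
}

// Parses "YYYY-MM-DDTHH:MM:SS..." as UTC and stores Unix seconds in outSeconds.
// Returns false if the string does not start with that pattern.
bool parseIsoUtc(const std::string &iso, long long &outSeconds) {
    int y, mo, d, h, mi, s;
    if (std::sscanf(iso.c_str(), "%d-%d-%dT%d:%d:%d", &y, &mo, &d, &h, &mi, &s) != 6) {
        return false;
    }
    outSeconds = daysFromCivil(y, mo, d) * 86400LL + h * 3600 + mi * 60 + s;
    return true;
}

// True if bumpedAt lies at least minDays before nowSeconds (both UTC).
// Unparsable timestamps are treated as "not stale".
bool isStale(const std::string &bumpedAt, long long nowSeconds, double minDays) {
    long long bumped = 0;
    if (!parseIsoUtc(bumpedAt, bumped)) return false;
    return (nowSeconds - bumped) / 86400.0 >= minDays;
}
```

Passing the current time in as a parameter keeps the check deterministic; in a real run `nowSeconds` could come from `std::time(nullptr)`.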

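IV. Plan made, now build it, and go all in


This old crab can only write HelloWorld-level C++, but I gritted my teeth and did it anyway:

① Which libraries:

Sending the request comes first: curl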

#include <curl/curl.h>

Then parse the response with libjsoncpp

#include "jsoncpp/json/json.h"

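To record the results, just create a folder and dump them into a csv file

② Build it:

Roughly as follows:
Once the requested page number gets high enough, no more data comes back and the job is done, so keep a variable for that: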

bool FinalStatus = false;
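The stopping rule described above can be sketched as a page loop. This is only an illustration under assumptions: `crawlPages` and the `fetchPage` callback are hypothetical names, pages are assumed to be numbered from 0, and the callback is assumed to return false once a page has no topics (the role `FinalStatus` plays in the post).

```cpp
#include <functional>

// Hypothetical driver: calls fetchPage(0), fetchPage(1), ... and stops at the
// first page that reports "no topics" or when maxPages pages have been tried.
// Returns the number of pages that still contained topics.
int crawlPages(const std::function<bool(int)> &fetchPage, int maxPages) {
    int pagesWithData = 0;
    for (int page = 0; page < maxPages; ++page) {
        if (!fetchPage(page)) break;  // empty page: the listing is exhausted
        ++pagesWithData;
    }
    return pagesWithData;
}
```

The `maxPages` bound keeps the loop finite even if the server never returns an empty page.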

Check the time as well. Old Crab reckons three months counts as grave-digging:
( this part was written by the honorable GPT )

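  // Needs <chrono>, <ctime>, <iomanip>, <sstream> and <string>.
  // Returns true if the ISO-8601 UTC timestamp is at least about three months old.
  bool timeDiff(const std::string &isoTimeStr) {
      std::tm tm = {};
      std::istringstream ss(isoTimeStr);
      // std::get_time has no %f specifier, so parse up to the seconds and ignore ".881Z".
      ss >> std::get_time(&tm, "%Y-%m-%dT%H:%M:%S");
      if (ss.fail()) return false;  // unparsable timestamp: do not treat it as old
      // The timestamp is UTC, so convert with timegm (a POSIX/glibc extension), not mktime.
      auto givenTime = std::chrono::system_clock::from_time_t(timegm(&tm));
      auto now = std::chrono::system_clock::now();
      auto hours = std::chrono::duration_cast<std::chrono::hours>(now - givenTime).count();
      return (hours / 24.) >= (3 * 30.44);  // an average month is about 30.44 days
  }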

For the csv output, first create a new csv file:

{
    mode_t mode = 0755;
    time_t now = time(0);
    tm *tc = localtime(&now);
    // Timestamp suffix; fields are not zero-padded, so 2024-08-01 16:35 gives "-202481-16-35".
    std::string currtime = "-" + std::to_string(tc->tm_year + 1900) + std::to_string(tc->tm_mon + 1) +
                           std::to_string(tc->tm_mday) + '-' + std::to_string(tc->tm_hour) + '-' +
                           std::to_string(tc->tm_min);
    std::string dir = "../LinuxDoTopicLists";
    // An already existing directory is fine; a real failure shows up when open() fails below.
    ::mkdir(dir.c_str(), mode);
    std::string filename = "LinuxDo-Pages-Topics" + currtime;
    filename = dir + "/" + filename + ".csv";
    fs.open(filename, std::ios::in | std::ios::out | std::ios::trunc);
    if (fs.is_open()) {
        fs << "url,topic\n";  // CSV header row
    } else {
        FinalStatus = true;  // results cannot be written, so stop the crawl
    }
}
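If unambiguous file names matter, the timestamp could be zero-padded with `std::strftime` instead of concatenating `std::to_string` results. A small sketch, where `csvFileName` is a hypothetical helper and the `YYYYMMDD-HH-MM` layout is an assumption rather than what the original tool produces:

```cpp
#include <ctime>
#include <string>

// Hypothetical helper: builds "<dir>/LinuxDo-Pages-Topics-YYYYMMDD-HH-MM.csv"
// from a broken-down time, zero-padding every field.
std::string csvFileName(const std::string &dir, const std::tm &when) {
    char stamp[32];
    std::strftime(stamp, sizeof stamp, "%Y%m%d-%H-%M", &when);
    return dir + "/LinuxDo-Pages-Topics-" + stamp + ".csv";
}
```

With this layout, 1 August and 11 January can no longer collide in the date part of the name.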

Then write the data, mainly the topic id (recorded as a URL) and the title:

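  void topicWrite(int topic_id, const std::string &topic) {
      if (fs.is_open()) {
          // Quote the title and double any embedded quotes so commas in titles do not break the CSV.
          std::string quoted;
          for (char c : topic) quoted += (c == '"') ? std::string("\"\"") : std::string(1, c);
          fs << "https://linux.do/t/topic/" << topic_id << ",\"" << quoted << "\"\n";
      } else {
          std::cout << "file open error" << std::endl;
      }
  }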

Everything is ready, so send the request:

  static size_t WriteCallback(void *contents, size_t size, size_t nmemb, std::string *userp) {
      size_t totalSize = size * nmemb;
      userp->append((char *)contents, totalSize);
      return totalSize;
  }

  // parseJson is defined further down; declare it before this function when assembling the file.
  bool getLinuxdoPage(const std::string &url) {
      if (!curl) return false;  // the shared easy handle was never initialized
      std::string readBuffer;
      curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
      curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, WriteCallback);
      curl_easy_setopt(curl, CURLOPT_WRITEDATA, &readBuffer);
      CURLcode res = curl_easy_perform(curl);
      if (res != CURLE_OK) return false;
      return parseJson(readBuffer);
  }

Here the topics get filtered and then stored:

  bool parseJson(const std::string &readBuffer) {
      Json::Value data;
      Json::Reader reader;
      if (!reader.parse(readBuffer, data) || !data.isObject()) return false;
      const Json::Value &topics = data["topic_list"]["topics"];
      if (topics.empty()) {
          FinalStatus = true;  // an empty page means we ran past the last page
          return false;
      }

      // Keep only topics whose last bump (topics[i]["bumped_at"]) is old enough.
      for (const Json::Value &temp : topics) {
          if (timeDiff(temp["bumped_at"].asString())) {
              topicWrite(temp["id"].asInt(), temp["fancy_title"].asString());
          }
      }
      return true;
  }

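OK, tidy it up a bit more and it's done, aha~~

③ Here is a result file. Friends, go dig when you have time~~~

Without a token it seems only the first 1000 pages are returned, but that is plenty to play with!!
It's only 1 2 3 4 5 …… 9270 topics in total: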
LinuxDo-Pages-Topics-202481-16-35.csv (690.2 KB)

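V. How would you do it

Could someone give this newbie some pointers on the C++ code? How could it be written better?
Or is there some smarter way to do this altogether?
Also, KFCVME50: who's treating me??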

8 Likes

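You, sir, are looking at jail time

3 Likes

I hereby name you the Golden Shovel Master

2 Likes

Grave-digging by code, that's jail time for sure

2 Likes

Go on, tell us more

1 Like

Let me dig up yours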

1 Like

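A legendary-tier grave-digging tool :hedgehog:

1 Like

Dammit, how come you're freeloading my sign-off idea too! :tieba_066:

1 Like

This capitalist is a good friend of mine, and he has me doing shady illegal business every day :face_with_peeking_eye:

1 Like

I'm the smart one: I skipped straight to the last line, since I don't want to learn any tech anyway

1 Like

Wait, do you all live on linux.do??? That was fast!!!

1 Like

So the Luoyang shovel has finally made its way to linux.do?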

1 Like

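How is that freeloading!

You've really got some nerve

Awesome, awesome digging

The First Emperor says digging is allowed!!

I suggest you move in

Fine, fine, a solid-gold Luoyang shovel and you still want KFC

Don't be like that

Give me a like and it won't count as freeloading :tieba_003: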

1 Like