Humans and AI: The Insurmountable “Can't Do” Singularity

Maybe I've just eaten too many mushrooms...

Let's see what got reposted here.

Something feels a bit off, but I can't put my finger on it, lol

It's not bad, actually.

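There is a lot of writing like this, and views of this kind are all hard to untangle. Personally, though, I think you'd be better off reading the R1 technical report, "Incentivizing Reasoning Capability in LLMs via Reinforcement Learning":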

Aha Moment of DeepSeek-R1-Zero
A particularly intriguing phenomenon observed during the training of DeepSeek-R1-Zero is the occurrence of an “aha moment”. This moment, as illustrated in Table 3, occurs in an intermediate version of the model. During this phase, DeepSeek-R1-Zero learns to allocate more thinking time to a problem by reevaluating its initial approach. This behavior is not only a testament to the model’s growing reasoning abilities but also a captivating example of how reinforcement learning can lead to unexpected and sophisticated outcomes.
This moment is not only an “aha moment” for the model but also for the researchers observing its behavior. It underscores the power and beauty of reinforcement learning: rather than explicitly teaching the model on how to solve a problem, we simply provide it with the right incentives, and it autonomously develops advanced problem-solving strategies. The “aha moment” serves as a powerful reminder of the potential of RL to unlock new levels of intelligence in artificial systems, paving the way for more autonomous and adaptive models in the future.

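It may also be this paper that recently got under Altman's skin and eventually led him to reflect that he may have been standing on the wrong side. From now on, when people talk about RL as the turning point for large language models, they will only remember DeepSeek's open-source release, and look down on everyone else's closed approach.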

Did you write this? It feels a bit muddled.