External Thought Manipulation Boosts Efficiency in Large Language Models

Large Reasoning Models (LRMs) have made impressive progress in various fields in recent years. Their ability to solve complex tasks and generate human-like text has made them a central component of modern AI applications. However, LRMs often struggle with the problem of "overthinking": they produce an excessive number of reasoning steps that contribute only marginally to performance while increasing computational cost.
Previous approaches to reducing this overthinking have mainly focused on fine-tuning the models. Fine-tuning, however, requires additional training data and specialized training infrastructure, and carries the risk of safety and alignment regressions as well as limited generalizability. A new research approach proposes a different path: manipulating the model's thought processes through external influences.
ThoughtMani: A New Approach to Efficiency Improvement
A recent study investigates the possibility of influencing the thought processes of LRMs through the strategic placement of external "thoughts." These external thoughts, short Chain-of-Thought (CoT) traces, are generated by smaller, more efficient models and inserted between the model's special thought tokens (such as <think> and </think>).
ThoughtMani is based on the observation that LRMs react to the information placed between the thought tokens and integrate it into their own reasoning. By providing pre-generated CoTs, the LRM's thought process can be guided and shortened. This reduces the number of generated tokens, and with it the computational cost, without affecting the accuracy of the results.
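The mechanism described above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function names and the <think>/</think> delimiters are assumptions, and the small model is stubbed out with a placeholder where a real deployment would call a smaller LLM.

```python
# Illustrative sketch of the ThoughtMani idea: a CoT drafted by a small,
# cheap model is placed between the large model's thought delimiters, so
# the LRM can build on it instead of generating a long reasoning trace
# of its own. Names and delimiters here are assumptions for illustration.

THINK_OPEN, THINK_CLOSE = "<think>", "</think>"

def small_model_cot(question: str) -> str:
    """Stand-in for the smaller, more efficient model that drafts a concise CoT."""
    # In practice this would be a call to a small LLM (hypothetical here).
    return f"Plan: break the task '{question}' into steps and solve each one directly."

def build_prompt_with_external_thought(question: str) -> str:
    """Insert the externally generated CoT between the thought tokens."""
    cot = small_model_cot(question)
    return f"{question}\n{THINK_OPEN}\n{cot}\n{THINK_CLOSE}\n"

# The resulting prompt would be fed to the LRM, which treats the injected
# text as if it were its own (already finished) reasoning.
prompt = build_prompt_with_external_thought("What is 17 * 23?")
print(prompt)
```

The key design point is that nothing about the large model changes: only the prompt is modified, which is why the approach needs no fine-tuning.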
Experimental Results and Advantages of ThoughtMani
In experiments with the QwQ-32B model on the LiveBench/Code dataset, ThoughtMani reduced the number of generated output tokens by about 30% while maintaining the original performance. The additional compute needed to generate the external CoTs with the smaller model is comparatively low. Furthermore, ThoughtMani was shown to improve the model's safety alignment by an average of 10%.
The advantages of ThoughtMani are manifold:
- Reduction of computational costs and energy consumption
- Improved efficiency of LRMs
- Increased safety alignment
- Simple implementation without complex fine-tuning
- Potential for improved scalability and accessibility of LRMs
Outlook and Significance for AI Development
ThoughtMani offers a promising method for optimizing LRMs and could drive the development of more efficient and accessible AI systems. Since many AI providers offer models of different sizes in parallel, ThoughtMani allows the use of smaller, more cost-effective models to control the thought processes of larger models. This opens up new possibilities for the use of LRMs in real-world applications, especially in areas with limited computing resources.
Research in this area is still in its early stages, and further investigation is necessary to fully exploit the potential of ThoughtMani and similar approaches. However, the results so far suggest that manipulating the thought processes of LRMs through external influences is a promising way to improve the efficiency and safety of AI systems. This is particularly relevant for companies like Mindverse, which specialize in the development and deployment of customized AI solutions, including chatbots, voicebots, AI search engines, and knowledge systems. By integrating techniques like ThoughtMani, these systems can be operated more efficiently and cost-effectively, which will further promote their dissemination and application in various industries.
Bibliography:
- Liu, Y., Zheng, J., Sun, Z., Peng, Z., Dong, W., Sha, Z., Cui, S., Wang, W., & He, X. (2025). Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models. arXiv preprint arXiv:2504.13626.
- Li, Z., et al. (2025). Exploring the Limits of Large Language Models. arXiv preprint arXiv:2503.24370.
- The Promise of Reasoning Models. Epoch AI Gradient Updates.
- Wei, J., et al. (2024). Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models. ResearchGate.
- Su, J., et al. (2022). Reasoning with Language Model Prompting: A Survey. NeurIPS.
- Wei, J., et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS.
- Nye, M., et al. (2021). Show Your Work: Scratchpads for Intermediate Computation with Language Models. arXiv preprint arXiv:2112.00114.
- Xu, M., et al. (2025). Is Your Model Thinking Clearly? Improving Chain of Thought Prompting with a Cleaned Scratchpad. OpenReview.
- Zhang, Y., et al. (2025). A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond. ResearchGate.