Gen AI
Published:
I am going to post my thoughts here on deep learning and AI, in particular the recent development of large language models (LLMs) and world models and their impact on the real world.
I am building LLMs at Amazon Rufus under Amazon Search, where a shopping assistant will be powered by LLMs. With the advent of LLMs, users will interact with services through natural language, so search will become more interactive and personalized. LLMs will also enable the assistant to understand users’ needs and tailor its recommendations accordingly.
I work on LLM evaluation and LLM alignment. Evals are a challenge in every real-world application, because it is hard to build a test set that covers all the cases that matter in practice and that is nontrivial for the real use cases. LLMs are a new type of system: they are probabilistic in nature, unlike rule-based systems. In a sense, if you know how to evaluate an LLM comprehensively for a use case, you can train a good LLM for that use case.
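To make the point about probabilistic systems concrete, here is a minimal sketch of one way to score such a system: sample each prompt several times and report a pass rate rather than a single pass/fail. The names `generate`, `check`, `pass_rate`, and `evaluate` are hypothetical and only for illustration, and the toy model stands in for a real LLM call.

```python
import random
from typing import Callable, Dict, List

def pass_rate(generate: Callable[[str], str],
              check: Callable[[str], bool],
              prompt: str,
              n_samples: int = 20) -> float:
    """Fraction of sampled responses that satisfy the check."""
    passed = sum(1 for _ in range(n_samples) if check(generate(prompt)))
    return passed / n_samples

def evaluate(generate: Callable[[str], str],
             cases: List[dict],
             n_samples: int = 20) -> Dict[str, float]:
    """Pass rate per test case; each case has a 'prompt' and a 'check'."""
    return {
        case["prompt"]: pass_rate(generate, case["check"], case["prompt"], n_samples)
        for case in cases
    }

# Toy stand-in for a sampled LLM response (assumption: not a real model).
_rng = random.Random(0)

def toy_model(prompt: str) -> str:
    return _rng.choice(["waterproof hiking boots", "running shoes"])

cases = [
    {"prompt": "boots for rainy hikes",
     "check": lambda response: "waterproof" in response},
]

if __name__ == "__main__":
    print(evaluate(toy_model, cases))
```

A rule-based system would score either 0.0 or 1.0 on each case; a sampled model usually lands in between, which is why the number of samples and the choice of checks matter so much.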
There are many approaches to aligning LLMs to be helpful, harmless, and honest. The classical recipe comes from OpenAI's InstructGPT, which trains the model with supervised fine-tuning (SFT) followed by reinforcement learning from human feedback (RLHF). More recently, newer methods such as rejection sampling and direct preference optimization (DPO) have come into use; they are much simpler to implement than the PPO step in RLHF.
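As a rough illustration of why DPO is simpler to implement, here is a minimal sketch of the DPO loss for a single preference pair. It assumes you already have the summed log-probabilities of the chosen and rejected responses under both the policy and a frozen reference model. The function and argument names are hypothetical, and `beta=0.1` is only an illustrative default.

```python
import math

def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))

def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the total log-probability of a response given the
    prompt, under the policy or the frozen reference model.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # The loss falls as the policy shifts probability toward the chosen response.
    return -log_sigmoid(beta * (chosen_margin - rejected_margin))
```

There is no separate reward model and no sampling loop here: the loss is a plain function of four log-probabilities, so it fits into an ordinary supervised training step. When the policy equals the reference, the loss is log 2, and it decreases as the policy favors the chosen response more than the reference does.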
