AI-Driven Book Recommendation App with Natural Language User Input

This project is a book recommendation service that suggests books based on a user's input of genre and book titles. It is built on a database of 7,000 books retrieved from Kaggle. Using OpenAI as the large language model provider, vector embeddings were created from the Kaggle dataset to enable fast vector search that finds semantically similar books from natural language input.

Tools / Frameworks / Services

Software Dev:
- Next.js, React
- Tailwind CSS
- Vercel (hosting)

Large Language Model:
- Vector Similarity Search
- Text Generation
- OpenAI API, Cohere API
- Weaviate vector database

Key Points

Input a genre and book titles to get AI-powered book recommendations

The OpenAI text embedding model vectorizes book descriptions and user preferences, enabling accurate searches for matching books and improving the user experience over traditional book recommendation systems

LLM-Related Services and Data Pipeline Building

Beyond implementing the pipeline in the web development environment, I also created a Python workflow to configure, access, and manage vector embeddings in the Weaviate vector database

Minimalistic experimental interface

This project is experimental and technology-focused, so I streamlined the interface to prioritize and deliver the core functionalities.

Key Features
This process is designed to be simple and user-friendly: users enter natural language to describe their book preferences and interests, set the number of top search results they'd like to see, and view AI-generated explanations for each book recommendation.
One.
Natural Language Search
User A often feels confused by the artistic nature of book titles when searching for books, frequently turning to Google for recommendations from others. Now, with BookBuddy, he can simply describe the type of content he enjoys—no matter how detailed—and receive tailored suggestions effortlessly.
Two.
Book Recommendation Reason
User A wants to understand why a particular book might be of interest to him. BookBuddy provides a clear explanation of the possible reasons, helping him gain a better understanding of the book's key aspects in the process.
Three.
Book Details and Purchase Link
User A found his ideal book from the list of recommendations and decided to purchase it online. BookBuddy thoughtfully provides a convenient Amazon purchase link to make the process seamless.
LLM Technical Details
I use diagrams to illustrate the principles of RAG (Retrieval-Augmented Generation) and provide a step-by-step explanation of how I applied RAG and Vector Similarity Search to construct the application's data pipeline. Sample code snippets are included for clarity, and the complete code is also available in BookBuddy's GitHub repository.
One.
Retrieval-Augmented Generation
Retrieval-augmented generation (RAG) is a powerful technique that retrieves relevant data and provides it to large language models (LLMs) as context, along with the task prompt. It is also called generative search or, in some cases, in-context learning.

The first step is to retrieve relevant data through a query. In the second step, the LLM is prompted with a combination of the retrieved data and the user-provided query. This provides in-context learning for the LLM, so it uses relevant, up-to-date data rather than relying on recall from its training or, even worse, producing hallucinated outputs.
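As a concrete illustration of the second step, here is a minimal sketch of how retrieved context and a user query can be combined into one prompt. The function name, stubbed retrieval results, and prompt wording are placeholders for illustration, not BookBuddy's actual code.

```python
# Illustrative sketch of the prompt-assembly step in RAG.
# Names, sample data, and prompt wording are placeholders, not BookBuddy's actual code.

def build_rag_prompt(user_query: str, retrieved_descriptions: list[str]) -> str:
    """Combine retrieved context with the user's query before prompting the LLM."""
    context = "\n".join(f"- {d}" for d in retrieved_descriptions)
    return (
        "You are a book recommendation assistant.\n"
        f"Candidate books found by vector search:\n{context}\n\n"
        f"User request: {user_query}\n"
        "Recommend the best matches and explain why."
    )

# Step 1 (retrieval) is performed by the vector database; the results are stubbed here.
retrieved = [
    "A detective solves a murder in a quiet seaside village.",
    "A cozy mystery featuring a bookshop owner and her cat.",
]
print(build_rag_prompt("cozy mysteries set in small towns", retrieved))
```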

In BookBuddy, I used Weaviate's integration with Cohere's APIs, which allows the AI models' capabilities to be accessed directly from Weaviate and reduces development complexity.

Using RAG, BookBuddy generates recommendation reasons by analyzing the information from the books with the highest similarity scores returned by the vector similarity search.

- Configure a Weaviate vector database collection to use OpenAI for text embeddings (see the configuration sketch after this list)
- Weaviate performs the search, retrieves the most relevant objects, and then passes them to the Cohere-provided generative model to generate outputs
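Below is a minimal sketch of this configuration, assuming the Weaviate Python client (v4 syntax). The cluster URL, environment variable names, collection name, and property list are illustrative assumptions rather than BookBuddy's exact setup.

```python
import os
import weaviate
from weaviate.classes.config import Configure, Property, DataType

# Connect to a Weaviate Cloud instance. API keys are passed as headers so Weaviate
# can call OpenAI (embeddings) and Cohere (generation) on the application's behalf.
client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ["WEAVIATE_URL"],
    auth_credentials=weaviate.auth.AuthApiKey(os.environ["WEAVIATE_API_KEY"]),
    headers={
        "X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"],
        "X-Cohere-Api-Key": os.environ["COHERE_API_KEY"],
    },
)

# Create a "Book" collection that embeds text with OpenAI and generates with Cohere.
client.collections.create(
    name="Book",
    vectorizer_config=Configure.Vectorizer.text2vec_openai(),
    generative_config=Configure.Generative.cohere(),
    properties=[
        Property(name="title", data_type=DataType.TEXT),
        Property(name="description", data_type=DataType.TEXT),
        Property(name="isbn10", data_type=DataType.TEXT),
    ],
)

client.close()
```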
Three.
7K Book Dataset from Kaggle
The dataset used for embeddings was sourced from Kaggle and includes 7,000 books. Each entry contains 12 data fields, such as title, author, description, published year, rating, thumbnail (used to display the book cover), ISBN-10 (book identifier, used to build the Amazon purchase URL), ISBN-13 (book identifier), and more.
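Below is a sketch of how the dataset could be imported into the collection in batches, again assuming the Weaviate Python client v4; the CSV filename and column names are assumptions based on the fields listed above.

```python
import os
import pandas as pd
import weaviate

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ["WEAVIATE_URL"],
    auth_credentials=weaviate.auth.AuthApiKey(os.environ["WEAVIATE_API_KEY"]),
    headers={"X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"]},
)
books = client.collections.get("Book")

# The CSV filename and column names are assumptions based on the fields described above.
df = pd.read_csv("books.csv").fillna("")

# Batch import; Weaviate calls OpenAI to embed each object's text properties on insert.
with books.batch.dynamic() as batch:
    for _, row in df.iterrows():
        batch.add_object(properties={
            "title": row["title"],
            "description": row["description"],
            "isbn10": str(row["isbn10"]),
        })

# For most print books the ISBN-10 doubles as the Amazon product id,
# so a purchase link can be built directly from it.
def amazon_url(isbn10: str) -> str:
    return f"https://www.amazon.com/dp/{isbn10}"

client.close()
```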
Four.
Configure and Search Data Pipeline
In this project, I utilized Python in conjunction with Weaviate's API to configure the vector database, which serves as the foundation for subsequent application development. Additionally, I implemented a Python-based search script leveraging this database to simulate user search behavior. This approach ensures the seamless operation of the entire data pipeline, mitigating potential risks during the later stages of web development.
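A sketch of such a search script, assuming the collection configured above: it runs a vector similarity search for a natural-language query and asks the Cohere generative model for a per-book recommendation reason. The example query, result limit, and prompt wording are illustrative.

```python
import os
import weaviate

client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.environ["WEAVIATE_URL"],
    auth_credentials=weaviate.auth.AuthApiKey(os.environ["WEAVIATE_API_KEY"]),
    headers={
        "X-OpenAI-Api-Key": os.environ["OPENAI_API_KEY"],
        "X-Cohere-Api-Key": os.environ["COHERE_API_KEY"],
    },
)
books = client.collections.get("Book")

# Simulate a user search: vector similarity search plus a generated recommendation reason.
user_query = "a heartwarming fantasy about found family"
response = books.generate.near_text(
    query=user_query,
    limit=5,  # the "top search results" count the user can set in the UI
    single_prompt=(
        "In two sentences, explain why the book '{title}' ({description}) "  # Weaviate fills {title}/{description}
        f"might appeal to a reader looking for: {user_query}."
    ),
)

for obj in response.objects:
    print(obj.properties["title"])
    print(obj.generated)  # the Cohere-generated recommendation reason

client.close()
```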
A Next.js API route that implements the vector similarity search query is included in BookBuddy's GitHub repository.