A Developer’s Roadmap to Getting Started with AI in 2025
In my last article, I wrote about an AI learning path for beginners, in an attempt to demystify its tools and applications for day-to-day tasks.
This time, we’re taking a sharp left, so to speak.
Imagine building, deploying, and even monetizing SaaS applications entirely on your own, but not knowing where to start. Primarily meant for junior developers looking for a guide or a curriculum, this article walks through key concepts, tools, and strategies to get you started.
As usual, I have organized the article into broad categories and buckets that are not meant to be consumed serially, so feel free to browse and skip anything you are already familiar with. However, based on my experience, I would strongly recommend first understanding the basic concepts of each category before you attempt to build an entire SaaS app by yourself.
1. Large Language Models (LLMs)
Probably the best starting point for learning about LLMs is the legendary video tutorial by Andrej Karpathy, one of the creators of OpenAI’s ChatGPT —
It is well worth the one-hour watch to thoroughly understand not only what Transformer models are but also how LLMs are built, should you ever decide (and have the resources) to build your own.
LLM API Providers and Features
Once you have a fair understanding of LLMs and their most common use case, conversational information synthesis, you can begin exploring the APIs offered by the large LLM providers. Start by experimenting with different APIs to understand their capabilities and potential applications. At the time of writing, there are several commercial and open source model providers with APIs, so while the list below is not 100% complete, it should give you the lay of the land.
Concepts — Most LLM providers offer several different models through pay-per-use APIs. The models are often categorized as embedding models, completion (text to text), speech to text, text to image, and so on, and are often priced by model size, since inference with larger models uses more compute and is hence pricier. In addition, most providers offer a Playground where you can try out prompts against different models and experiment with configurations like temperature.
OpenAI: Known for its cutting-edge models like GPT-4, offering robust API integrations. Over the past year, it has added a number of constructs that have become more confusing than ever. It has the notion of an org and a project, with billing at each level, and you can assign users and keys by org and project. It also provides an Assistants API (more like an Agent, which we will cover later) and a Realtime API that lets you build applications that take voice input and return results in either audio or text format in near real time. Keep in mind that, as of now, the Realtime API is probably the most expensive.
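To make this concrete, here is a minimal sketch of a chat completion call using the official openai Python SDK. The model name and prompt are just examples, and the temperature parameter is the same knob you see in the Playground.

```python
# A minimal sketch of calling the OpenAI Chat Completions API. Assumes the
# official `openai` Python SDK is installed and OPENAI_API_KEY is set in
# your environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o",           # any chat model your account can access
    temperature=0.7,          # same knob you experiment with in the Playground
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain embeddings in one sentence."},
    ],
)
print(response.choices[0].message.content)
```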
AWS and Google — Both AWS (Bedrock) and Google (Vertex AI) have versions of the following: a model catalog where you can choose your own model, fine-tune some of the models, deploy them, and connect them to tools or APIs. AWS Bedrock also has the notion of Agents and Agent Orchestration that you can use to build autonomous AI apps, à la microservices.
I should also mention that Google’s new AI Studio has tools that can “see” and “hear” you through the browser (with your permission, of course), so you can build applications around training, or even automate repeated tasks without any complicated automation workflows.
Although AWS does have its own LLMs, it primarily offers Anthropic models, which are probably the best at coding tasks. Google released its Gemini and Flash models in Dec 2024, which are on par with Claude 3.5 at a much lower cost.
Anthropic: Speaking of Claude, one of my favorite LLM providers is Anthropic, which has three broad categories of models based on size — Haiku (smallest), Sonnet (mid-sized), and Opus (large). Of these, Sonnet 3.5 has been the reigning champion when it comes to code generation. In addition, Anthropic recently added two groundbreaking features — the computer use tool and Model Context Protocol (MCP) servers. With these APIs, you can now build applications that use the computer’s browser to do tasks on the user’s behalf.
Hugging Face: This is by far the largest platform for finding, fine-tuning, and deploying your own open source models in your own private instances cost-effectively. You may choose to do this because certain tasks and use cases call for a smaller, cheaper, privately hosted model. Among open source models, the Llama series from Meta is by far the best according to industry benchmarks.
Local Models: Finally, I should add that if you are anything like me and sometimes find yourself without an internet connection, for example on a long flight, you should consider running models locally. My two favorites are Ollama and LM Studio. Both allow you to download models locally and run them behind a localhost endpoint that you can call from your code just like any hosted LLM. I should caveat, though, that to run anything over a 32B-parameter model, you need a decent laptop with adequate GPU and memory.
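As a quick sketch, here is one way to call a locally running Ollama server. This assumes you have already pulled a model (e.g., with `ollama pull llama3.2`) and that Ollama is listening on its default port 11434.

```python
# A minimal sketch of hitting a locally running Ollama server over its
# localhost endpoint -- no internet connection required once the model
# has been downloaded.
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama3.2",                  # any model you have pulled locally
        "prompt": "Write a haiku about coding on a long flight.",
        "stream": False,                      # return one JSON blob, not chunks
    },
)
print(resp.json()["response"])
```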
Each provider has unique features tailored to different use cases. Comparing costs, fine-tuning options, and scalability will help you choose the right one for your project.
Function Calling
Modern LLMs are evolving to handle structured API calls directly. This enables seamless automation of tasks like booking appointments, querying databases, or managing workflows. Function calling bridges the gap between AI and traditional APIs, making integrations more intuitive.
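As a concrete illustration, here is a minimal sketch of function calling with the OpenAI Python SDK. The get_weather function and its schema are made up for this example; the model returns a structured call that your code then executes.

```python
# A hedged sketch of OpenAI-style function calling: the model decides to
# "call" get_weather, and your application runs the real function.
import json
from openai import OpenAI

client = OpenAI()
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",                       # illustrative tool
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)
call = resp.choices[0].message.tool_calls[0]      # the structured call
args = json.loads(call.function.arguments)        # e.g. {"city": "Paris"}
print(call.function.name, args)                   # now invoke your real function
```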
Fine-Tuning and Deploying Open-Source LLMs
For developers seeking customization, fine-tuning open-source LLMs is a game-changer. Models like LLaMA, Falcon, and GPT-J empower developers to build domain-specific applications. Tools such as Hugging Face and LangChain simplify the fine-tuning and deployment process, enabling efficient scaling.
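For a feel of what fine-tuning looks like in practice, here is a minimal LoRA sketch using Hugging Face transformers and peft. The base model, dataset file, and hyperparameters are illustrative placeholders, not recommendations.

```python
# A minimal LoRA fine-tuning sketch with Hugging Face transformers + peft.
# Assumptions: the base model name, dataset file, and hyperparameters below
# are placeholders -- swap in your own.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base = "meta-llama/Llama-3.2-1B"          # hypothetical choice of base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# Wrap the base model with low-rank adapters so only a tiny fraction of
# weights actually train.
model = get_peft_model(model, LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"))

ds = load_dataset("text", data_files="domain_corpus.txt")["train"]
ds = ds.map(lambda x: tokenizer(x["text"], truncation=True, max_length=512),
            batched=True, remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", per_device_train_batch_size=2,
                           num_train_epochs=1, learning_rate=2e-4),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
model.save_pretrained("out/lora-adapter")   # saves adapter weights only
```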
2. Agents
What Is an Agent?
One of the features that OpenAI, and eventually other model providers, released last year was Function Calling. It allowed LLMs to call back into a function in your application, giving them access to run code and hence becoming “agentic.” This feature has since evolved into Tools: you can combine multiple tools into one construct, attach a specific LLM, and now you have Agents. If you are interested in a deeper dive, you can read about it in my article on the 7 Factor App here -> https://readmedium.com/the-7-factor-enterprise-ai-app-4528d02d0e83
Agent Frameworks
There are several frameworks in both Python and JS/TS that allow you to build Agents through simple dictionary-based interfaces. A few are listed below. When choosing one, I would suggest picking something that also has robust features for orchestrating Agents in a deterministic and controllable way, along with trace and debug capabilities.
- LangChain and LangGraph: This is by far the most comprehensive feature set, but over time the framework is no longer as simple as it used to be, so be ready for a learning curve. However, it has a rich community of users, and the documentation is exceptional.
- Autogen: Autogen was first released by Microsoft, and the first version, along with Autogen Studio, provided a very compelling platform for building Agents. However, the original creator left Microsoft, and there are now two different versions of Autogen, which makes things a little confusing, especially if you are trying to pick something longer term for enterprise AI.
- CrewAI: This framework excels in collaborative multi-agent environments, though candidly I don’t have much experience using it myself. I have heard from other developers that it is a good framework.
- OpenAI Swarm: This is probably the easiest and simplest way to get started with building simple agents and agent graphs/workflows. If you are interested in learning more, you can go through my article on using Swarm for an advanced RAG starter kit here -> https://readmedium.com/how-to-build-a-multi-agent-rag-system-mars-with-openai-swarm-b6eb8a0ffc4a
- AWS Bedrock: AWS Bedrock has the notion of building UI-based Agents and workflows and is packed with features. However, you need to understand the AWS ecosystem, including IAM and permissions, to get started. If you are comfortable with the AWS platform in general, this is the place to start.
- LlamaIndex: This is another great platform, especially if you are also looking for a common layer to interface with multiple data sources, especially databases like SingleStore that offer SQL, JSON, vectors, and exact keyword matches, all in one place.
There are, of course, a number of other frameworks, and you should choose based on your needs, but this set should give you a broad understanding of the space. To make the pattern concrete, the sketch below shows the loop that most of these frameworks wrap.
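Here is that framework-free sketch: a plain loop that keeps calling the model, executes whatever tool it requests, and feeds the result back until it produces a final answer. The search_docs tool is a stand-in for your real logic.

```python
# A framework-free sketch of the agent pattern the frameworks above wrap:
# loop until the LLM stops asking for tools, running each requested tool
# and appending the result to the conversation.
import json
from openai import OpenAI

client = OpenAI()

def search_docs(query: str) -> str:          # stand-in tool implementation
    return f"Top result for '{query}' ..."

TOOLS = {"search_docs": search_docs}
SPECS = [{"type": "function", "function": {
    "name": "search_docs",
    "description": "Search internal documentation",
    "parameters": {"type": "object",
                   "properties": {"query": {"type": "string"}},
                   "required": ["query"]}}}]

messages = [{"role": "user", "content": "Find our refund policy."}]
while True:
    msg = client.chat.completions.create(
        model="gpt-4o", messages=messages, tools=SPECS
    ).choices[0].message
    messages.append(msg)
    if not msg.tool_calls:                   # model produced a final answer
        print(msg.content)
        break
    for call in msg.tool_calls:              # run each requested tool
        result = TOOLS[call.function.name](**json.loads(call.function.arguments))
        messages.append({"role": "tool", "tool_call_id": call.id,
                         "content": result})
```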
3. Retrieval-Augmented Generation (RAG)
What Is RAG?
RAG combines the power of LLMs with real-time, domain-specific data retrieval, so they can respond to queries or take action based on data the LLM is not “aware” of, i.e., was never trained on. This approach ensures that AI outputs are both accurate and contextually relevant, especially in enterprises, where there are vast amounts of data that LLMs were never trained on. Use cases include personalized customer support, dynamic content generation, and real-time knowledge retrieval.
All About Vectors and Semantic Searches
At the heart of it, RAG involves searching through both structured data (say, JSON or SQL data) and unstructured data (like PDF files, images, and videos). For unstructured data, you typically break the documents into chunks with some overlap, then convert them into vectors, which are basically floating point representations of those objects in a multi-dimensional space. For example, “the dog jumped over the haystack” may become (0.234, 1.343, 2.343, 1.334….) You typically store these either in memory, for certain ephemeral use cases, or in a vector database. To search through the vectors, you first convert the query into an embedding/vector (using an embedding model), then run a semantic search, for example a dot product, to see which objects in the database are similar to the query, and then pass that chunk of data to the LLM as context. The sketch below walks through this flow end to end.
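Here is a toy version of the chunk, embed, and search flow just described, using OpenAI embeddings and plain NumPy for the dot-product search. The corpus and chunk sizes are illustrative.

```python
# A toy sketch of chunk -> embed -> semantic search. Uses an in-memory
# matrix instead of a vector database; the corpus and chunk/overlap sizes
# are illustrative only.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts):
    out = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in out.data])

# 1. Chunk the document with a little overlap so context isn't cut mid-idea.
doc = "The dog jumped over the haystack. " * 50
chunks = [doc[i:i + 200] for i in range(0, len(doc), 150)]  # 50-char overlap

# 2. Embed the chunks and store them (here: just a NumPy matrix in memory).
matrix = embed(chunks)

# 3. Embed the query and rank chunks by dot-product similarity.
query_vec = embed(["What did the dog jump over?"])[0]
best = chunks[int(np.argmax(matrix @ query_vec))]
print(best)   # pass this chunk to the LLM as context
```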
You can also look at a more detailed vector databases comparison I wrote about last year here — https://readmedium.com/the-ultimate-guide-to-vector-databases-2024-and-beyond-16dfb15bef12
Databases and Data Strategies for RAG
As you can imagine, once you start storing a lot of vectors, you need to think about how to store and retrieve them, and in enterprises you also need to think about retrieving other kinds of data. There are vector-only databases, both open source and commercial, like Pinecone, Weaviate, and Milvus, but if you are looking for databases that can store and search through all your data in single-shot queries across SQL, JSON, vectors, etc., then consider databases like SingleStore, Elastic, or AWS’s OpenSearch. The sketch below shows what such a single-shot hybrid query might look like.
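As promised, here is a hybrid query sketch written in a SingleStore-flavored dialect. The table and column names are made up for illustration, and exact vector syntax varies by database, so treat this as a shape rather than copy-paste SQL.

```python
# An illustrative single-shot hybrid query (SingleStore-style dialect),
# mixing a plain SQL filter, a JSON field predicate, and a vector ranking.
# Table and column names are hypothetical; check your database's docs for
# its exact vector functions.
import json

query = """
SELECT  id, body,
        DOT_PRODUCT(embedding, JSON_ARRAY_PACK(%s)) AS score
FROM    docs
WHERE   meta::$department = 'support'       -- JSON field filter
  AND   created_at > '2024-01-01'           -- ordinary SQL filter
ORDER BY score DESC
LIMIT   5
"""
# cursor.execute(query, (json.dumps(query_embedding),))  # via a MySQL client
```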
Once you have understood the basics of RAG, you can then explore a few additional topics to take this further.
- Advanced RAG features — https://readmedium.com/secrets-to-optimizing-rag-llm-apps-for-better-accuracy-performance-and-lower-cost-da1014127c0a
- Multi-Agent RAG Systems — https://readmedium.com/how-to-build-enterprise-ai-apps-with-multi-agent-rag-systems-mars-f922f69f59ba
- Knowledge Augmented Generation (KAG) — https://github.com/OpenSPG/KAG?tab=readme-ov-file
Let’s now move on to the next topic: learning how to use AI for 10x coding and development. Most likely, you are not going to need all of these tools, but I have listed them nonetheless so that you can pick and choose for different projects, use cases, and requirements.
4. Development/Coding
Evolution of Coding Tools
Believe it or not, in the last two years, AI coding tools have already gone through what feels like ten years’ worth of change. Initially, we had tools like GitHub Copilot that helped with code completions. But now we have two VS Code-based IDEs that not only do code completions but also let you chat about the code, and they offer Agent-based interactions that can perform actions on your behalf like creating new files and running terminal commands, including installing new libraries and packages. These tools are Cursor and Windsurf. Both also let you use separate files or your entire codebase as context.
I would highly recommend downloading the free versions of both the products and trying them out for different use cases.
I should also mention that Claude, with its Artifacts, is also exceptional at generating code and small applications that you can test within the browser, then bring into your codebase to iterate further. In addition, both OpenAI and Claude now help create Mermaid-based architecture and flow diagrams, making it easier to visualize and iterate on your application.
I should add that there is an emerging trend of tools taking on the full role of a junior developer, costing anywhere between $500 and $4,000 per month (not a typo). With these tools, you can define features and ask them to build them out; the tools will open Pull Requests, build out entire features (about two to three a week), and check in the code along with documentation. These tools include Devin (with an unusual Slack-based integration) and Tempo Labs (browser-based interface).
5. UI and Developer Tools
If you are building a full-stack application, gone are the days when you needed to rely on visual designers to first map out the user interaction, then the wireframes, and then the screens before you could start coding. If you are looking to independently build out wireframes and screens for your application, here are some tools you should get familiar with and start trying out.
- v0.dev — A tool from Vercel. You can input an image, screenshots, or even links to a Figma design, or simply provide prompts, and it will generate entire screens for you along with the code for React components. Better still, you can select objects within the design, use v0 to iterate on them, and finally get an npx command to install that specific component in your React project.
- Bolt.new — Built by StackBlitz, Bolt.new allows you to build not just visual screens but entire applications from prompts, and you can connect the codebase to your repo or download the entire code once you are satisfied with the results and then use it to build additional features.
- Lovable.dev — Similar to Bolt.new, this tool, for now, also gives you the ability to select individual objects within a design and iterate on them with prompts.
- UI screens and wireframes — Finally, if you are only looking to generate wireframes, user interaction designs, and screens, you can also look into tools like Uizard (uizard.io), Relume, and Tempo Labs.
- In addition to the UI tools, I also find a few other AI tools to be real time savers. For example, I use OpenRouter to get one API that works everywhere, instead of constantly managing OpenAI, Anthropic, and other LLM keys (see the sketch after this list). I also use SingleStore’s SqRL bot to generate SQL queries for SingleStore. And I sometimes find myself using Warp, a Mac-based terminal app where you can give instructions in English and it finds and runs the terminal commands for you.
- Lastly, I would be remiss if I did not mention Claude’s Model Context Protocol (MCP) servers. You can learn more about them here -> https://github.com/modelcontextprotocol/servers
- This is an amazing way to automate your day-to-day tasks through Claude Desktop. If you are looking for a low-code way to automate tasks between different applications through webhooks and other triggers, I would also highly recommend checking out the open source tool n8n (n8n.io), which has a rich ecosystem of connectors and code snippets to make backend work easier than ever.
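Here is the OpenRouter sketch mentioned above. Because OpenRouter exposes an OpenAI-compatible endpoint, the same SDK reaches many providers by swapping only the base URL and model ID; the model ID shown is just an example.

```python
# A sketch of the one-key, one-endpoint approach via OpenRouter: the
# OpenAI SDK pointed at OpenRouter's OpenAI-compatible base URL.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="sk-or-...",                     # your single OpenRouter key
)
resp = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",     # swap providers by model ID alone
    messages=[{"role": "user", "content": "One-line summary of RAG?"}],
)
print(resp.choices[0].message.content)
```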
Conclusion
In this article, we looked at some broad categories of a developer’s day-to-day work and covered tools and resources to get you started on becoming an AI-first developer in 2025 and beyond. Please share any resources and tools I may have missed that you find useful in your development workflow.
✌️