I’ve been experimenting quite a bit with coding-focused AI tools lately, and I figured it might be useful to share some personal reflections on what actually works (and what doesn’t) in practice.
One tool I spent time with is Cline, and I actually found it pretty solid. The “plan + ask” mode is especially nice. It’s a good way to get familiar with how large language models (LLMs) think—or rather, how they don’t think. You get used to their strengths, but also the weird mistakes they can make. In that sense, it’s as much a learning tool as a productivity one.
These days, though, I mostly use Claude (Code) for personal projects. With the Pro (and sometimes Max) plan, the value-for-cost ratio is hard to beat for my use case. When it comes to generating code, Claude 4 Opus is, in my opinion, the most effective LLM out there right now. Sonnet isn’t far behind. I’ve had less success with Gemini, DeepSeek, or GPT for code—they tend to produce results that require more debugging on my part.
As for Amazon Q Developer, it seems promising (especially if you’re working heavily with AWS stacks), but I haven’t had a chance to test it extensively yet. It’s on my TODO list, especially since they recently introduced Claude Sonnet 4 in the CLI tool.
🎯 The Sweet Spot: Scope and Hallucinations
I’ve found that LLMs are most useful when the problem isn’t too large. Give them a broad, vague description of what you need, and you’ll likely get hallucinations—or as I sometimes prefer to call it, “creativity.” Interestingly, this isn’t always a bad thing. Sometimes the AI generates features or approaches I hadn’t considered, leading to genuinely useful discoveries.
However, the key is keeping the scope manageable. The moment you try to tackle something complex in one go, the wheels tend to come off. 🚗💥
❌ Broad prompt (leads to hallucinations):
"Create a Python web scraper that handles any website and stores data in a database"
✅ Targeted prompt (much better results):
"Write a Python function using requests and BeautifulSoup to extract all h2 tags from a single webpage and return them as a list"
The difference is night and day. The broad prompt will give you an over-engineered solution with features you don’t need, potential security issues, and code that probably won’t work on the first try. The targeted one gives you exactly what you asked for.
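For reference, here’s roughly the kind of function the targeted prompt should get you (my own sketch, not verbatim model output; the function name is mine):

```python
import requests
from bs4 import BeautifulSoup

def extract_h2_tags(url: str) -> list[str]:
    """Fetch a single page and return the text of every h2 tag on it."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # surface HTTP errors instead of parsing an error page
    soup = BeautifulSoup(response.text, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.find_all("h2")]
```

A handful of lines, one responsibility, trivially testable. That’s the scope where these tools shine.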
💻 Stick to What You Know
I primarily use LLMs with languages I’m already comfortable with: Python and PowerShell. This approach has been crucial to my success with these tools. When the AI makes mistakes (and it will), I can quickly spot them and fix them on the fly. Trying to use LLMs for languages you’re not familiar with is asking for trouble: you won’t know when the code is wrong, incomplete, or following outdated practices.
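A quick, contrived illustration of what I mean: the kind of subtle bug that can slip into generated Python, obvious if you know the language and invisible if you don’t.

```python
# Subtle bug: the default list is created once and shared across all calls.
def collect_results(item, results=[]):
    results.append(item)
    return results

collect_results("a")  # ["a"]
collect_results("b")  # ["a", "b"] -- state leaked from the first call

# The idiomatic fix, easy to spot on the fly if Python is home turf:
def collect_results_fixed(item, results=None):
    if results is None:
        results = []
    results.append(item)
    return results
```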
🎨 Prompt Engineering: Real but Not Reliable
Prompt engineering is definitely a real thing, but in my opinion, it’s far from being deterministic. LLMs have a frustrating tendency to forget instructions partway through a conversation, even when you’ve been very explicit about requirements. You might craft the perfect prompt that works beautifully one time, only to have the same model ignore half your instructions the next time you use it.
This inconsistency means you need to stay actively involved in the process rather than expecting the AI to follow a script perfectly. 🤷‍♂️
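One cheap way to stay involved is to verify the output in code instead of trusting that the instructions stuck. A minimal sketch, assuming you asked the model to reply in JSON with an `answer` field (both hypothetical requirements):

```python
import json

def check_reply(reply: str) -> dict:
    """Verify the model actually followed the 'JSON with an answer field' instruction."""
    data = json.loads(reply)  # raises json.JSONDecodeError if it drifted back into prose
    if "answer" not in data:
        raise ValueError("model dropped the required 'answer' field")
    return data
```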
🧩 Breaking Down Complex Problems
One takeaway: these tools are still a long way from being able to autonomously solve complex problems. The best approach I’ve found is to break things down into small, simple tasks. The context window size of the model you’re using plays a major role when it comes to solving more involved problems, so don’t underestimate it.
Here’s my general workflow (a short code sketch follows the list):
- Identify the core problem and break it into logical components
- Start with the simplest piece that provides immediate value
- Test and validate each component before moving to the next
- Iterate and refine based on what actually works
- Integrate components only after each piece is solid
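To make that concrete, here’s the workflow applied to the earlier scraper idea (a simplified sketch; the decomposition and names are mine):

```python
import requests
from bs4 import BeautifulSoup

# Step 1: the simplest piece that provides immediate value.
def fetch_page(url: str) -> str:
    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text

# Step 2: parsing, kept separate so it can be tested without a network.
def extract_headings(html: str) -> list[str]:
    soup = BeautifulSoup(html, "html.parser")
    return [tag.get_text(strip=True) for tag in soup.find_all("h2")]

# Step 3: integrate only after each piece is validated on its own.
def scrape_headings(url: str) -> list[str]:
    return extract_headings(fetch_page(url))

# A check I can run before any integration happens:
assert extract_headings("<h2>Hello</h2><h2>World</h2>") == ["Hello", "World"]
```

Each step is a prompt an LLM can handle reliably, and each result is small enough for me to actually review.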
🎯 The Bottom Line
AI coding tools are getting better, but they’re not magic. Used wisely, they’re powerful allies—but you still need to be the one steering the ship. Think of them as very capable junior developers who need clear direction, frequent check-ins, and careful code review.
The real value comes not from expecting them to solve everything, but from learning how to leverage their strengths while compensating for their weaknesses. 🚀