Key Responsibilities:
Design, develop, and deploy NLP systems using advanced LLM architectures (e.g., GPT, BERT, LLaMA, Mistral) tailored for real-world applications such as chatbots, document summarization, and Q&A systems.
Implement and optimize RAG pipelines, combining LLMs with vector search engines (e.g., FAISS, Weaviate, Pinecone) to create context-aware, knowledge-grounded responses.
Integrate external knowledge sources, including databases, APIs, and document repositories, to enrich language models with real-time or domain-specific information.
Fine-tune and evaluate pre-trained LLMs, leveraging techniques like prompt engineering, LoRA, PEFT, and transfer learning to customize model behavior.
Collaborate with data engineers and MLOps teams to ensure scalable deployment and monitoring of AI services in cloud environments (e.g., AWS, GCP, Azure).
Build robust APIs and backend services to serve NLP/RAG models efficiently and securely.
Conduct rigorous performance evaluation and model validation, including accuracy, latency, bias/fairness, and explainability (XAI).
Stay current with advancements in AI research, particularly in generative AI, retrieval systems, prompt tuning, and hybrid modeling strategies.
Participate in code reviews, documentation, and cross-functional team planning to ensure clean and maintainable code.
If you are looking to explore a new opportunity, feel free to share your resume on .