Key Responsibilities:
- Design, develop, and deploy AI/ML microservices using Python in a cloud-native environment.
- Build scalable pipelines for training, tuning, and serving ML models in production.
- Integrate and manage NoSQL databases (e.g., MongoDB, Elasticsearch) for efficient storage and retrieval of unstructured or time-series data.
- Optimize model accuracy, latency, and throughput, including hyperparameter tuning, feature engineering, and profiling model performance.
- Lead the development of ML-based solutions for anomaly detection, time-series forecasting, and predictive analytics.
- Collaborate with cross-functional teams to translate product requirements into ML-based features.
- Apply best practices in model versioning, A/B testing, and continuous training/validation.
- Ensure high standards of code quality, modularity, and observability in deployed services.
- Evaluate new tools, technologies, and frameworks for ML lifecycle management and monitoring.
Required Skills:
- 5+ years of experience in backend software development using Python.
- Strong programming expertise in Python, with proficiency in ML libraries such as scikit-learn, TensorFlow, PyTorch, XGBoost, or LightGBM.
- Demonstrated experience implementing anomaly detection algorithms and time-series forecasting models (e.g., ARIMA, Prophet, LSTM).
- Good understanding of AI/ML algorithms such as Random Forest, K-Means, Autoencoders, Graph Neural Networks (GNNs), and Louvain community detection, as applied to anomaly detection, clustering, and time-series analysis.
- Experience in building and deploying cloud-native microservices (e.g., on AWS, Azure, GCP).
- Solid understanding of NoSQL databases (e.g., MongoDB, Elasticsearch) for storing ML data and time series.
- Experience with message buses such as Kafka or RabbitMQ.
- Hands-on experience with model performance tuning, evaluation metrics, and real-world ML system optimization.
- Familiarity with ML lifecycle tools (e.g., MLflow, Kubeflow, SageMaker, Vertex AI).
- Understanding of containerization and orchestration (Docker, Kubernetes) for scalable deployment.
- Proficient with Git and CI/CD workflows, including tools such as Jenkins, GitLab CI, or GitHub Actions.
- Familiarity with Agile development methodologies and ticketing systems (e.g., Jira).
Nice to Have:
- Experience applying ML to wireless network optimization.
- Familiarity with federated learning or edge AI techniques to enable distributed ML across radio/access nodes.
- Understanding of online learning or reinforcement learning for dynamic network adaptation and control.